TY - GEN
T1 - Evaluation of the Quality of AI-Generated Scientific Text Under Different Types of Cognitive Complexity Tasks
AU - Peng, Hui
AU - Liu, Shujun
AU - Li, Lei
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
AB - As Artificial Intelligence Generated Content (AIGC) is applied ever more deeply in scientific research, this study explores the current quality of AIGC in completing research tasks, providing insights for improving AIGC in the scientific research domain. The study first reviews and summarizes existing information quality evaluation frameworks and AIGC-related research to propose quality evaluation criteria for AIGC in the research context. User experiments were then conducted on the ChatGPT and ERNIE Bot platforms, using research tasks of differing cognitive complexity, to select appropriate AIGC quality evaluation criteria for each task. The quality of AIGC generated by ChatGPT and ERNIE Bot was evaluated against the selected criteria, revealing the strengths and weaknesses of current AIGC in meeting users’ research information needs. The results show that users generally value relevance, professionalism, and readability when evaluating AIGC for research tasks, whereas attention to specific criteria such as accuracy, diversity, coherence, and creativity varies with the cognitive complexity of the task. Additionally, AIGC performs well on understanding, evaluating, and creating tasks but shows significant shortcomings on remembering and analyzing tasks, particularly in accuracy and professionalism.
KW - AIGC Evaluation
KW - Cognitive Complexity
KW - Scientific Context
UR - http://www.scopus.com/inward/record.url?scp=85213043103&partnerID=8YFLogxK
DO - 10.1007/978-981-96-0865-2_17
M3 - Conference contribution
AN - SCOPUS:85213043103
SN - 9789819608645
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 212
EP - 221
BT - Sustainability and Empowerment in the Context of Digital Libraries - 26th International Conference on Asia-Pacific Digital Libraries, ICADL 2024, Proceedings
A2 - Oliver, Gillian
A2 - Frings-Hessami, Viviane
A2 - Du, Jia Tina
A2 - Tezuka, Taro
PB - Springer Science and Business Media Deutschland GmbH
T2 - 26th International Conference on Asia-Pacific Digital Libraries, ICADL 2024
Y2 - 4 December 2024 through 6 December 2024
ER -