Automated novelty evaluation of academic paper: A collaborative approach integrating human and large language model knowledge
Wu, W., Zhang, C., & Zhao, Y. (2025). Automated novelty evaluation of academic paper: A collaborative approach integrating human and large language model knowledge. Journal of the Association for Information Science and Technology. https://doi.org/10.1002/asi.70005
- Authors
- Wenqing Wu, Chengzhi Zhang, Yi Zhao
- Journal
- Journal of the Association for Information Science and Technology
- First published
- 2025
- Type
- Journal Article
- DOI
- 10.1002/asi.70005
Abstract
Novelty is a crucial criterion in the peer-review process for evaluating academic papers. Traditionally, it is judged by experts or measured by unique reference combinations. Both methods have limitations: experts have limited knowledge, and the effectiveness of the combination method is uncertain. Moreover, it is unclear if unique citations truly measure novelty. The large language model (LLM) possesses a wealth of knowledge, while human experts possess judgment abilities that the LLM does not possess. Therefore, our research integrates the knowledge and abilities of LLM and human experts to address the limitations of novelty assessment. The most common novelty in academic papers is the introduction of new methods. In this paper, we propose leveraging human knowledge and LLM to assist pre-trained language models (PLMs, e.g., BERT, etc.) in predicting the method novelty of papers. Specifically, we extract sentences related to the novelty of the academic paper from peer-review reports and use LLM to summarize the methodology section of the academic paper, which are then used to fine-tune PLMs. In addition, we have designed a text-guided fusion module with novel Sparse-Attention to better integrate human and LLM knowledge. We compared the method we proposed with a large number of baselines. Extensive experiments demonstrate that our method achieves superior performance.
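The abstract's "text-guided fusion module with Sparse-Attention" might work along the following lines: embeddings of review-derived novelty sentences guide attention over embeddings of the LLM-summarized methodology, with all but the top-k attention scores per query masked out. This is a minimal illustrative sketch only; the function names, the top-k masking scheme, and all dimensions are assumptions, since the paper's actual architecture is not reproduced here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_attention_fusion(guide, context, top_k=2):
    """Fuse `context` vectors (e.g., LLM method-summary embeddings) guided by
    `guide` vectors (e.g., PLM embeddings of review novelty sentences).
    For each guide token, attention scores outside its top-k are masked,
    giving a sparse attention pattern (an assumed reading of the paper)."""
    scores = guide @ context.T / np.sqrt(guide.shape[-1])   # (m, n) similarity
    kth = np.sort(scores, axis=-1)[:, -top_k][:, None]      # k-th largest per row
    masked = np.where(scores >= kth, scores, -np.inf)       # drop non-top-k scores
    weights = softmax(masked, axis=-1)                      # sparse attention weights
    return weights @ context                                # (m, d) fused features

rng = np.random.default_rng(0)
guide = rng.standard_normal((4, 8))    # 4 review-derived tokens, dim 8
context = rng.standard_normal((6, 8))  # 6 summary tokens, dim 8
fused = sparse_attention_fusion(guide, context, top_k=2)
print(fused.shape)  # (4, 8)
```

With `top_k` equal to the number of context tokens, the mask keeps everything and the module reduces to ordinary dense cross-attention, which is one way to see what the sparsity buys: each review sentence attends only to its few most relevant summary tokens.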
Reviews
I was initially excited about this paper because it explores the potential of PLMs and LLMs for novelty evaluation, an understudied aspect of peer review. As I read on, however, I encountered awkward wording, weak prompt engineering, and a focus restricted to methodological novelty alone. I would have preferred a larger, more comprehensive study built on more reliable ground-truth data than conference peer reviews; a bit more care in thinking through the study design would have been valuable. To their credit, the authors were on point in identifying key limitations (though data leakage should have been mentioned) and contrasted PLMs with LLMs, a comparison that is rare. I also appreciated their ablation study, though I would have liked them to draw broader implications from it for the design of future models. Future studies in this area would be welcome, as would more polished writing: this paper was repetitive and could have used the help of several copyeditors.