Large Language Models (LLMs) showcase impressive capabilities but encounter challenges such as hallucination, outdated knowledge, and non-transparent, untraceable reasoning processes. Additionally, this paper introduces an up-to-date evaluation framework and benchmark.
This research advances how AI systems learn, reason, and solve problems, with direct implications for automation and scientific discovery.
Access the original paper via OpenAlex.
| Category | 🤖 Artificial Intelligence |
| Published | Dec 18, 2023 |
| Journal | arXiv (Cornell University) |
| Authors | Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan |
| DOI | 10.48550/arxiv.2312.10997 |
| Citations | 648 |
| Source | OpenAlex |