Reasoning with transformer-based models: Deep learning, but shallow reasoning

C Helwe, C Clavel, F Suchanek - International Conference on …, 2021 - imt.hal.science
Recent years have seen impressive performance of transformer-based models on different
natural language processing tasks. However, it is not clear to what degree the transformers …

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

Foundation metrics for evaluating effectiveness of healthcare conversations powered by generative AI

M Abbasian, E Khatibi, I Azimi, D Oniani… - NPJ Digital …, 2024 - nature.com
Generative Artificial Intelligence is set to revolutionize healthcare delivery by
transforming traditional patient care into a more personalized, efficient, and proactive …

BoardgameQA: A dataset for natural language reasoning with contradictory information

M Kazemi, Q Yuan, D Bhatia, N Kim… - Advances in …, 2024 - proceedings.neurips.cc
Automated reasoning with unstructured natural text is a key requirement for many potential
applications of NLP and for developing robust AI systems. Recently, Language Models …

Out-of-distribution generalization in natural language processing: Past, present, and future

L Yang, Y Song, X Ren, C Lyu, Y Wang… - Proceedings of the …, 2023 - aclanthology.org
Machine learning (ML) systems in natural language processing (NLP) face
significant challenges in generalizing to out-of-distribution (OOD) data, where the test …

From LSAT: The progress and challenges of complex reasoning

S Wang, Z Liu, W Zhong, M Zhou, Z Wei… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
Complex reasoning aims to draw a correct inference based on complex rules. As a hallmark
of human intelligence, it involves a degree of explicit reading comprehension, interpretation …

GeomVerse: A systematic evaluation of large models for geometric reasoning

M Kazemi, H Alvari, A Anand, J Wu, X Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models have shown impressive results for multi-hop mathematical
reasoning when the input question is only textual. Many mathematical reasoning problems …

LogiGAN: Learning logical reasoning via adversarial pre-training

X Pi, W Zhong, Y Gao, N Duan… - Advances in Neural …, 2022 - proceedings.neurips.cc
We present LogiGAN, an unsupervised adversarial pre-training framework for improving
logical reasoning abilities of language models. Upon automatic identification of logical …

Do Large Language Models Show Human-like Biases? Exploring Confidence-Competence Gap in AI

AK Singh, B Lamichhane, S Devkota, U Dhakal… - Information, 2024 - mdpi.com
This study investigates self-assessment tendencies in Large Language Models (LLMs),
examining if patterns resemble human cognitive biases like the Dunning–Kruger effect …

LogiTorch: A PyTorch-based library for logical reasoning on natural language

C Helwe, C Clavel, F Suchanek - Proceedings of the 2022 …, 2022 - aclanthology.org
Logical reasoning on natural language is one of the most challenging tasks for deep
learning models. There has been an increasing interest in developing new benchmarks to …