Reasoning with transformer-based models: Deep learning, but shallow reasoning

C Helwe, C Clavel, F Suchanek - International Conference on …, 2021 - imt.hal.science
Recent years have seen impressive performance of transformer-based models on different
natural language processing tasks. However, it is not clear to what degree the transformers …

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

Foundation metrics for evaluating effectiveness of healthcare conversations powered by generative AI

M Abbasian, E Khatibi, I Azimi, D Oniani… - NPJ Digital …, 2024 - nature.com
Generative Artificial Intelligence is set to revolutionize healthcare delivery by
transforming traditional patient care into a more personalized, efficient, and proactive …

BoardgameQA: A dataset for natural language reasoning with contradictory information

M Kazemi, Q Yuan, D Bhatia, N Kim… - Advances in …, 2024 - proceedings.neurips.cc
Automated reasoning with unstructured natural text is a key requirement for many potential
applications of NLP and for developing robust AI systems. Recently, Language Models …

Out-of-distribution generalization in natural language processing: Past, present, and future

L Yang, Y Song, X Ren, C Lyu, Y Wang… - Proceedings of the …, 2023 - aclanthology.org
Machine learning (ML) systems in natural language processing (NLP) face
significant challenges in generalizing to out-of-distribution (OOD) data, where the test …

From LSAT: The progress and challenges of complex reasoning

S Wang, Z Liu, W Zhong, M Zhou, Z Wei… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
Complex reasoning aims to draw a correct inference based on complex rules. As a hallmark
of human intelligence, it involves a degree of explicit reading comprehension, interpretation …

GeomVerse: A systematic evaluation of large models for geometric reasoning

M Kazemi, H Alvari, A Anand, J Wu, X Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models have shown impressive results for multi-hop mathematical
reasoning when the input question is only textual. Many mathematical reasoning problems …

LogiGAN: Learning logical reasoning via adversarial pre-training

X Pi, W Zhong, Y Gao, N Duan… - Advances in Neural …, 2022 - proceedings.neurips.cc
We present LogiGAN, an unsupervised adversarial pre-training framework for improving
logical reasoning abilities of language models. Upon automatic identification of logical …

Do Large Language Models Show Human-like Biases? Exploring Confidence-Competence Gap in AI

AK Singh, B Lamichhane, S Devkota, U Dhakal… - Information, 2024 - mdpi.com
This study investigates self-assessment tendencies in Large Language Models (LLMs),
examining if patterns resemble human cognitive biases like the Dunning–Kruger effect …

LogiTorch: A PyTorch-based library for logical reasoning on natural language

C Helwe, C Clavel, F Suchanek - Proceedings of the 2022 …, 2022 - aclanthology.org
Logical reasoning on natural language is one of the most challenging tasks for deep
learning models. There has been an increasing interest in developing new benchmarks to …