Reasoning with transformer-based models: Deep learning, but shallow reasoning
Recent years have seen impressive performance of transformer-based models on different
natural language processing tasks. However, it is not clear to what degree the transformers …
Holistic evaluation of language models
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …
Foundation metrics for evaluating effectiveness of healthcare conversations powered by generative AI
Abstract Generative Artificial Intelligence is set to revolutionize healthcare delivery by
transforming traditional patient care into a more personalized, efficient, and proactive …
BoardgameQA: A dataset for natural language reasoning with contradictory information
Automated reasoning with unstructured natural text is a key requirement for many potential
applications of NLP and for developing robust AI systems. Recently, Language Models …
Out-of-distribution generalization in natural language processing: Past, present, and future
Abstract Machine learning (ML) systems in natural language processing (NLP) face
significant challenges in generalizing to out-of-distribution (OOD) data, where the test …
From LSAT: The progress and challenges of complex reasoning
Complex reasoning aims to draw a correct inference based on complex rules. As a hallmark
of human intelligence, it involves a degree of explicit reading comprehension, interpretation …
GeomVerse: A systematic evaluation of large models for geometric reasoning
Large language models have shown impressive results for multi-hop mathematical
reasoning when the input question is only textual. Many mathematical reasoning problems …
LogiGAN: Learning logical reasoning via adversarial pre-training
We present LogiGAN, an unsupervised adversarial pre-training framework for improving
logical reasoning abilities of language models. Upon automatic identification of logical …
Do Large Language Models Show Human-like Biases? Exploring the Confidence-Competence Gap in AI
This study investigates self-assessment tendencies in Large Language Models (LLMs),
examining if patterns resemble human cognitive biases like the Dunning–Kruger effect …
LogiTorch: A PyTorch-based library for logical reasoning on natural language
Logical reasoning on natural language is one of the most challenging tasks for deep
learning models. There has been an increasing interest in developing new benchmarks to …