Natural language reasoning, a survey
This survey paper proposes a clearer view of natural language reasoning in the field of
Natural Language Processing (NLP), both conceptually and practically. Conceptually, we …
Evaluating large language models: A comprehensive survey
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …
Evaluating the logical reasoning ability of ChatGPT and GPT-4
Harnessing logical reasoning ability is a comprehensive natural language understanding
endeavor. With the release of Generative Pretrained Transformer 4 (GPT-4), highlighted as" …
Dynabench: Rethinking benchmarking in NLP
We introduce Dynabench, an open-source platform for dynamic dataset creation and model
benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the …
Robustness Gym: Unifying the NLP evaluation landscape
Despite impressive performance on standard benchmarks, deep neural networks are often
brittle when deployed in real-world systems. Consequently, recent research has focused on …
LogiQA 2.0: an improved dataset for logical reasoning in natural language understanding
NLP research on logical reasoning regains momentum with the recent releases of a handful
of datasets, notably LogiQA and Reclor. Logical reasoning is exploited in many probing …
Towards faithful model explanation in NLP: A survey
End-to-end neural Natural Language Processing (NLP) models are notoriously difficult to
understand. This has given rise to numerous efforts towards model explainability in recent …
Recursion in recursion: Two-level nested recursion for length generalization with scalability
J Ray Chowdhury, C Caragea - Advances in Neural …, 2024 - proceedings.neurips.cc
Binary Balanced Tree Recursive Neural Networks (BBT-RvNNs) enforce sequence
composition according to a preset balanced binary tree structure. Thus, their non-linear …
ANLIzing the adversarial natural language inference dataset
We perform an in-depth error analysis of Adversarial NLI (ANLI), a recently introduced large-
scale human-and-model-in-the-loop natural language inference dataset collected over …
When LLMs meet cunning questions: A fallacy understanding benchmark for large language models
Recently, Large Language Models (LLMs) have made remarkable evolutions in language
understanding and generation. Following this, various benchmarks for measuring all kinds …