Language model behavior: A comprehensive survey
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …
Evaluating large language models at evaluating instruction following
As research in large language models (LLMs) continues to accelerate, LLM-based
evaluation has emerged as a scalable and cost-effective alternative to human evaluations …
Say what you mean! large language models speak too positively about negative commonsense knowledge
Large language models (LLMs) have been widely studied for their ability to store and utilize
positive knowledge. However, negative knowledge, such as "lions don't live in the ocean", is …
Language models are not naysayers: an analysis of language models on negation benchmarks
Negation has been shown to be a major bottleneck for masked language models, such as
BERT. However, whether this finding still holds for larger-sized auto-regressive language …
Natural language processing in marketing
J Hartmann, O Netzer - Artificial intelligence in marketing, 2023 - emerald.com
The increasing importance and proliferation of text data provide a unique opportunity and
novel lens to study human communication across a myriad of business and marketing …
ScoNe: Benchmarking negation reasoning in language models with fine-tuning and in-context learning
A number of recent benchmarks seek to assess how well models handle natural language
negation. However, these benchmarks lack the controlled example paradigms that would …
On the limitations of dataset balancing: The lost battle against spurious correlations
R Schwartz, G Stanovsky - arXiv preprint arXiv:2204.12708, 2022 - arxiv.org
Recent work has shown that deep learning models in NLP are highly sensitive to low-level
correlations between simple features and specific output labels, leading to overfitting and …
This is not a dataset: A large negation benchmark to challenge large language models
Although large language models (LLMs) have apparently acquired a certain level of
grammatical knowledge and the ability to make generalizations, they fail to interpret …
Exploring lottery prompts for pre-trained language models
Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on
model adaptation, necessitating more efficient alternatives to conventional fine-tuning. Given …
Not another negation benchmark: The NaN-NLI test suite for sub-clausal negation
Negation is poorly captured by current language models, although the extent of this problem
is not widely understood. We introduce a natural language inference (NLI) test suite to …