Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing
This article surveys and organizes research works in a new paradigm in natural language
processing, which we dub “prompt-based learning.” Unlike traditional supervised learning …
A survey of deep learning for mathematical reasoning
Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in
various fields, including science, engineering, finance, and everyday life. The development …
Pre-trained models: Past, present and future
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Pre-trained language models can be surprisingly adept at tasks they were not explicitly
trained on, but how they implement these capabilities is poorly understood. In this paper, we …
Impact of pretraining term frequencies on few-shot reasoning
Pretrained Language Models (LMs) have demonstrated ability to perform numerical
reasoning by extrapolating from a few examples in few-shot settings. However, the extent to …
A primer in BERTology: What we know about how BERT works
A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu
Transformer-based models have pushed state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …
How can we know what language models know?
Recent work has presented intriguing results examining the knowledge contained in
language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a …
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering
Recent works have shown that language models (LM) capture different types of knowledge
regarding facts or common sense. However, because no model is perfect, they still fail to …
Lift: Language-interfaced fine-tuning for non-language machine learning tasks
Fine-tuning pretrained language models (LMs) without making any architectural changes
has become a norm for learning various language downstream tasks. However, for non …
Language model behavior: A comprehensive survey
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …