LUKE: Deep contextualized entity representations with entity-aware self-attention
Entity representations are useful in natural language tasks involving entities. In this paper,
we propose new pretrained contextualized representations of words and entities based on …
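The mechanism named in the title is an attention variant that picks a different query matrix depending on whether the attending and attended-to tokens are words or entities, while keys and values stay shared. A minimal single-head numpy sketch of that idea (shapes and names are ours, not from the released LUKE code):

```python
import numpy as np

def entity_aware_attention(x, is_entity, Wq, Wk, Wv):
    """Single-head attention where the query projection depends on the
    word/entity types of the attending and attended-to tokens, as in
    LUKE's entity-aware self-attention. Keys and values are shared."""
    n, d = x.shape
    k, v = x @ Wk, x @ Wv
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ti = "e" if is_entity[i] else "w"
            tj = "e" if is_entity[j] else "w"
            q_ij = x[i] @ Wq[(ti, tj)]            # type-dependent query
            scores[i, j] = q_ij @ k[j] / np.sqrt(d)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)      # row-wise softmax
    return attn @ v

# toy usage: three word tokens followed by one entity token
rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(4, d))
is_entity = np.array([False, False, False, True])
Wq = {p: rng.normal(size=(d, d))
      for p in [("w", "w"), ("w", "e"), ("e", "w"), ("e", "e")]}
out = entity_aware_attention(x, is_entity, Wq,
                             rng.normal(size=(d, d)),
                             rng.normal(size=(d, d)))
print(out.shape)  # (4, 8)
```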
It's not just size that matters: Small language models are also few-shot learners
When scaled to hundreds of billions of parameters, pretrained language models such as
GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. However, enormous …
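The recipe behind this result, pattern-exploiting training, recasts a task as cloze-style mask filling so a small masked language model can be scored against label "verbalizer" tokens. A hedged sketch with an illustrative sentiment pattern and verbalizers (the paper's actual patterns are task-specific):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
model.eval()

# Illustrative verbalizers: one label word per class (not the paper's).
VERBALIZERS = {"positive": " great", "negative": " terrible"}

def classify(text: str) -> str:
    # Pattern: "<text> It was <mask>." -- also illustrative.
    prompt = f"{text} It was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # Score each class by the logit of its verbalizer token at the mask.
    scores = {
        label: logits[tokenizer.convert_tokens_to_ids(
            tokenizer.tokenize(word)[0])].item()
        for label, word in VERBALIZERS.items()
    }
    return max(scores, key=scores.get)

print(classify("The movie was a delight."))  # expected: positive
```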
Domain-specific language model pretraining for biomedical natural language processing
Pretraining large neural language models, such as BERT, has led to impressive gains on
many natural language processing (NLP) tasks. However, most pretraining efforts focus on …
M3Exam: A multilingual, multimodal, multilevel benchmark for examining large language models
Despite the existence of various benchmarks for evaluating natural language processing
models, we argue that human exams are a more suitable means of evaluating general …
Pre-trained language models for text generation: A survey
Text Generation aims to produce plausible and readable text in human language from input
data. The resurgence of deep learning has greatly advanced this field, in particular, with the …
Language models of code are few-shot commonsense learners
We address the general task of structured commonsense reasoning: given a natural
language input, the goal is to generate a graph such as an event graph or a reasoning graph. To …
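The paper's approach serializes the target graph as source code so that few-shot prompting of a code language model becomes natural. A sketch of one such serialization (the class layout below is illustrative; the paper compares several code formats):

```python
def graph_to_code(goal: str, steps: list[str],
                  edges: list[tuple[int, int]]) -> str:
    """Render a small event graph as Python source so it can be placed
    in a few-shot prompt for a code LM. Layout is illustrative."""
    lines = ["class Plan:",
             f'    goal = "{goal}"',
             "    def steps(self):"]
    for i, step in enumerate(steps):
        lines.append(f'        step{i} = "{step}"')
    for a, b in edges:
        lines.append(f"        # step{a} must happen before step{b}")
    return "\n".join(lines)

print(graph_to_code(
    "make coffee",
    ["boil water", "grind beans", "pour water over grounds"],
    [(0, 2), (1, 2)],
))
```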
Spot: Better frozen model adaptation through soft prompt transfer
There has been growing interest in parameter-efficient methods to apply pre-trained
language models to downstream tasks. Building on the Prompt Tuning approach of Lester et …
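Soft prompt transfer builds on prompt tuning, where a small matrix of learned embeddings is prepended to a frozen model's input embeddings; SPoT's twist is to initialize the target task's prompt from one trained on a source task. A minimal sketch, assuming a PyTorch backbone and a hypothetical source-prompt checkpoint:

```python
from typing import Optional
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prompt tuning in miniature: a learned (prompt_len, d_model) matrix
    prepended to a frozen model's input embeddings. Soft prompt transfer
    amounts to initializing it from a prompt trained on a source task
    rather than randomly. All names here are illustrative."""

    def __init__(self, prompt_len: int, d_model: int,
                 source_prompt: Optional[torch.Tensor] = None):
        super().__init__()
        init = (source_prompt.clone() if source_prompt is not None
                else torch.randn(prompt_len, d_model) * 0.02)
        self.prompt = nn.Parameter(init)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq, d_model) from the frozen backbone
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Transfer: reuse a source-task prompt (hypothetical checkpoint path).
source = torch.load("source_task_prompt.pt")
target = SoftPrompt(source.size(0), source.size(1), source_prompt=source)
```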
Dynabench: Rethinking benchmarking in NLP
We introduce Dynabench, an open-source platform for dynamic dataset creation and model
benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the …
Benchmarking foundation models with language-model-as-an-examiner
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …
DeBERTa: Decoding-enhanced BERT with disentangled attention
Recent progress in pre-trained neural language models has significantly improved the
performance of many natural language processing (NLP) tasks. In this paper we propose a …
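The disentangled attention named in the title scores each token pair as a sum of content-to-content, content-to-position, and position-to-content terms computed from separate content and relative-position vectors. A numpy sketch of the score computation, with illustrative shapes and names:

```python
import numpy as np

def disentangled_scores(H, P, Wq, Wk, Wqr, Wkr, max_rel):
    """Attention scores as in DeBERTa's disentangled attention:
    content-to-content + content-to-position + position-to-content,
    with content vectors H (n, d) and relative-position embeddings
    P (2*max_rel, d). Shapes and names are illustrative."""
    n, d = H.shape
    Qc, Kc = H @ Wq, H @ Wk      # content projections
    Qr, Kr = P @ Wqr, P @ Wkr    # relative-position projections

    def delta(i, j):
        # clamped relative distance, mapped into [0, 2*max_rel)
        return int(np.clip(i - j, -max_rel, max_rel - 1)) + max_rel

    A = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            c2c = Qc[i] @ Kc[j]             # content-to-content
            c2p = Qc[i] @ Kr[delta(i, j)]   # content-to-position
            p2c = Kc[j] @ Qr[delta(j, i)]   # position-to-content
            A[i, j] = (c2c + c2p + p2c) / np.sqrt(3 * d)
    return A
```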