Generating training data with language models: Towards zero-shot language understanding
Pretrained language models (PLMs) have demonstrated remarkable performance in various
natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their …
Knowledgeable prompt-tuning: Incorporating knowledge into prompt verbalizer for text classification
Tuning pre-trained language models (PLMs) with task-specific prompts has been a
promising approach for text classification. In particular, previous studies suggest that prompt …
Measuring coding challenge competence with APPS
While programming is one of the most broadly applicable skills in modern society, modern
machine learning models still cannot code solutions to basic problems. Despite its …
TimeLMs: Diachronic language models from Twitter
Despite its importance, the time variable has been largely neglected in the NLP and
language model literature. In this paper, we present TimeLMs, a set of language models …
CodeGen2: Lessons for training LLMs on programming and natural languages
Large language models (LLMs) have demonstrated remarkable abilities in representation
learning for program synthesis and understanding tasks. The quality of the learned …
D4: Improving LLM pretraining via document de-duplication and diversification
Over recent years, an increasing amount of compute and data has been poured into training
large language models (LLMs), usually by doing one-pass learning on as many tokens as …
TweetEval: Unified benchmark and comparative evaluation for tweet classification
The experimental landscape in natural language processing for social media is too
fragmented. Each year, new shared tasks and datasets are proposed, ranging from classics …
Documenting large webtext corpora: A case study on the Colossal Clean Crawled Corpus
Large language models have led to remarkable progress on many NLP tasks, and
researchers are turning to ever-larger text corpora to train them. Some of the largest corpora …
AMMUS: A survey of transformer-based pretrained models in natural language processing
KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …
OctoPack: Instruction tuning code large language models
Finetuning large language models (LLMs) on instructions leads to vast performance
improvements on natural language tasks. We apply instruction tuning using code …