RWKV: Reinventing RNNs for the transformer era
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …
GLM-130B: An open bilingual pre-trained model
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model
with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as …
Unified-IO: A unified model for vision, language, and multi-modal tasks
We propose Unified-IO, a model that performs a large variety of AI tasks spanning classical
computer vision tasks, including pose estimation, object detection, depth estimation and …
PaLM: Scaling language modeling with Pathways
Large language models have been shown to achieve remarkable performance across a
variety of natural language tasks using few-shot learning, which drastically reduces the …
Large language models can self-improve
Large Language Models (LLMs) have achieved excellent performances in various tasks.
However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand …
Visual prompt tuning
The current modus operandi in adapting pre-trained models involves updating all the
backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) …
Scaling data-constrained language models
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
The debate over understanding in AI's large language models
M Mitchell, DC Krakauer - Proceedings of the National …, 2023 - National Acad Sciences
We survey a current, heated debate in the artificial intelligence (AI) research community on
whether large pretrained language models can be said to understand language—and the …
Rethinking the role of demonstrations: What makes in-context learning work?
Large language models (LMs) are able to in-context learn: perform a new task via inference
alone by conditioning on a few input-label pairs (demonstrations) and making predictions for …
Larger language models do in-context learning differently
We study how in-context learning (ICL) in language models is affected by semantic priors
versus input-label mappings. We investigate two setups: ICL with flipped labels and ICL with …