Scaling instruction-finetuned language models
Finetuning language models on a collection of datasets phrased as instructions has been
shown to improve model performance and generalization to unseen tasks. In this paper we …
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
Specializing smaller language models towards multi-step reasoning
The surprising ability of Large Language Models (LLMs) to perform well on complex
reasoning with only few-shot chain-of-thought prompts is believed to emerge only in very …
Flask: Fine-grained language model evaluation based on alignment skill sets
Evaluation of Large Language Models (LLMs) is challenging because aligning to human
values requires the composition of multiple skills and the required set of skills varies …
The efficiency spectrum of large language models: An algorithmic survey
The rapid growth of Large Language Models (LLMs) has been a driving force in
transforming various domains, reshaping the artificial general intelligence landscape …
Adapting large language models for document-level machine translation
Large language models (LLMs) have made significant strides in various natural language
processing (NLP) tasks. Recent research shows that the moderately-sized LLMs often …
Defining a new NLP playground
The recent explosion of performance of large language models (LLMs) has changed the
field of Natural Language Processing (NLP) more abruptly and seismically than any other …
Exploring the numerical reasoning capabilities of language models: A comprehensive analysis on tabular data
Numbers are crucial for various real-world domains such as finance, economics, and
science. Thus, understanding and reasoning with numbers are essential skills for language …
Mixture-of-Linear-Experts for Long-term Time Series Forecasting
Long-term time series forecasting (LTSF) aims to predict future values of a time series given
the past values. The current state-of-the-art (SOTA) on this problem is attained in some …
: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Mixture-of-Experts (MoE) has gained increasing popularity as a promising framework for
scaling up large language models (LLMs). However, the reliability assessment of MoE lags …