The efficiency spectrum of large language models: An algorithmic survey
The rapid growth of Large Language Models (LLMs) has been a driving force in
transforming various domains, reshaping the artificial general intelligence landscape …
Corpus Complexity Matters in Pretraining Language Models
A Agrawal, S Singh - Proceedings of The Fourth Workshop on …, 2023 - aclanthology.org
It is well known that filtering low-quality data before pretraining language models or
selecting suitable data from domains similar to downstream task datasets generally leads to …
CLIMB: Curriculum Learning for Infant-inspired Model Building
We describe our team's contribution to the STRICT-SMALL track of the BabyLM Challenge.
The challenge requires training a language model from scratch using only a relatively small …
Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning
Answering complex logical queries over incomplete knowledge graphs (KGs) is challenging.
Most previous works have focused on learning entity/relation embeddings and simulating …
Mitigating Frequency Bias and Anisotropy in Language Model Pre-Training with Syntactic Smoothing
Language models strongly rely on frequency information because they maximize the
likelihood of tokens during pre-training. As a consequence, language models tend to not …