Advancing transformer architecture in long-context large language models: A comprehensive survey
With the bomb ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …
Retrieval-augmented generation for natural language processing: A survey
Large language models (LLMs) have demonstrated great success in various fields,
benefiting from their huge amount of parameters that store knowledge. However, LLMs still …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
The rise and potential of large language model based agents: A survey
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …
Generating images with multimodal language models
We propose a method to fuse frozen text-only large language models (LLMs) with
pre-trained image encoder and decoder models, by mapping between their embedding spaces …
Efficient large language models: A survey
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …
Scaling transformer to 1M tokens and beyond with RMT
A major limitation for the broader scope of problems solvable by transformers is the
quadratic scaling of computational complexity with input size. In this study, we investigate …
LLM inference unveiled: Survey and roofline model insights
The field of efficient Large Language Model (LLM) inference is rapidly evolving, presenting a
unique blend of opportunities and challenges. Although the field has expanded and is …
Exposing attention glitches with flip-flop language modeling
Why do large language models sometimes output factual inaccuracies and exhibit
erroneous reasoning? The brittleness of these models, particularly when executing long …
VisualWebArena: Evaluating multimodal agents on realistic visual web tasks
Autonomous agents capable of planning, reasoning, and executing actions on the web offer
a promising avenue for automating computer tasks. However, the majority of existing …