A comprehensive overview of large language models
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …
natural language processing tasks and beyond. This success of LLMs has led to a large …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
A survey on model compression for large language models
Large Language Models (LLMs) have revolutionized natural language processing tasks with
remarkable success. However, their formidable size and computational demands present …
remarkable success. However, their formidable size and computational demands present …
Omniquant: Omnidirectionally calibrated quantization for large language models
Large language models (LLMs) have revolutionized natural language processing tasks.
However, their practical deployment is hindered by their immense memory and computation …
However, their practical deployment is hindered by their immense memory and computation …
Loftq: Lora-fine-tuning-aware quantization for large language models
Quantization is an indispensable technique for serving Large Language Models (LLMs) and
has recently found its way into LoRA fine-tuning. In this work we focus on the scenario where …
has recently found its way into LoRA fine-tuning. In this work we focus on the scenario where …
Qa-lora: Quantization-aware low-rank adaptation of large language models
Recently years have witnessed a rapid development of large language models (LLMs).
Despite the strong ability in many language-understanding tasks, the heavy computational …
Despite the strong ability in many language-understanding tasks, the heavy computational …
A survey on transformer compression
Large models based on the Transformer architecture play increasingly vital roles in artificial
intelligence, particularly within the realms of natural language processing (NLP) and …
intelligence, particularly within the realms of natural language processing (NLP) and …
Spikegpt: Generative pre-trained language model with spiking neural networks
As the size of large language models continue to scale, so does the computational
resources required to run it. Spiking Neural Networks (SNNs) have emerged as an energy …
resources required to run it. Spiking Neural Networks (SNNs) have emerged as an energy …
Towards efficient generative large language model serving: A survey from algorithms to systems
In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …
Relu strikes back: Exploiting activation sparsity in large language models
Large Language Models (LLMs) with billions of parameters have drastically transformed AI
applications. However, their demanding computation during inference has raised significant …
applications. However, their demanding computation during inference has raised significant …