A Survey of Resource-efficient LLM and Multimodal Foundation Models
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine …
Resource-efficient Algorithms and Systems of Foundation Models: A Survey
Large foundation models, including large language models, vision transformers, diffusion,
and LLM-based multimodal models, are revolutionizing the entire machine learning …
A survey on efficient inference for large language models
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …
PaCE: Parsimonious Concept Engineering for Large Language Models
Large Language Models (LLMs) are being used for a wide variety of tasks. While they are
capable of generating human-like responses, they can also produce undesirable output …
OTOv3: Automatic Architecture-Agnostic Neural Network Training and Compression from Structured Pruning to Erasing Operators
Compressing a predefined deep neural network (DNN) into a compact sub-network with
competitive performance is crucial in the efficient machine learning realm. This topic spans …
HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning
Structured pruning is one of the most popular approaches to effectively compress the heavy
deep neural networks (DNNs) into compact sub-networks while retaining performance. The …
A Survey on Large Language Model Acceleration based on KV Cache Management
H Li, Y Li, A Tian, T Tang, Z Xu, X Chen, N Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have revolutionized a wide range of domains such as
natural language processing, computer vision, and multi-modal tasks due to their ability to …
Arlo: Serving Transformer-based Language Models with Dynamic Input Lengths
A prominent challenge in serving requests for NLP tasks is handling the varying length of
input texts. Existing solutions, such as uniform zero-padding and compiler support, suffer …
A Text Classification Model Combining Adversarial Training with Pre-trained Language Model and Neural Networks: A Case Study on Telecom Fraud Incident Texts
L Zhuoxian, S Tuo, H Xiaofeng - arXiv preprint arXiv:2411.06772, 2024 - arxiv.org
Front-line police officers often categorize all reported Telecom Fraud cases from police calls into
14 subcategories to facilitate targeted prevention measures, such as precise public …
Efficiency in Language Understanding and Generation: An Evaluation of Four Open-Source Large Language Models
SM Wong, H Leung, KY Wong - 2024 - researchsquare.com
This study provides a comprehensive evaluation of the efficiency of Large Language Models
(LLMs) in performing diverse language understanding and generation tasks. Through a …