An empirical analysis and resource footprint study of deploying large language models on...

Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions

S Javaid, RA Khalil, N Saeed, B He… - arXiv preprint arXiv …, 2024 - arxiv.org

Integrated satellite, aerial, and terrestrial networks (ISATNs) represent a sophisticated
convergence of diverse communication technologies to ensure seamless connectivity …

Cited by 2; 2 versions

Efficient training and inference: Techniques for large language models using llama

SR Cunningham, D Archambault, A Kung - Authorea Preprints, 2024 - techrxiv.org

To enhance the efficiency of language models, it would involve optimizing their training and
inference processes to reduce computational demands while maintaining high performance …

Cited by 19; 3 versions

On-Device Language Models: A Comprehensive Review

J Xu, Z Li, W Chen, Q Wang, X Gao, Q Cai… - arXiv preprint arXiv …, 2024 - arxiv.org

The advent of large language models (LLMs) revolutionized natural language processing
applications, and running LLMs on edge devices has become increasingly attractive for …

PerLLM: Personalized Inference Scheduling with Edge-Cloud Collaboration for Diverse LLM Services

Z Yang, Y Yang, C Zhao, Q Guo, W He, W Ji - arXiv preprint arXiv …, 2024 - arxiv.org

With the rapid growth in the number of large language model (LLM) users, it is difficult for
bandwidth-constrained cloud servers to simultaneously process massive LLM services in …

Cited by 2; 2 versions

Integrating Qt and LLMs on the NVIDIA Jetson board for controlling a patient-assisting robot arm

S Hashemi - 2024 - oulurepo.oulu.fi

This thesis investigates the integration of generative artificial intelligence (GenAI) and large
language models (LLMs) with edge devices to improve patient care. The project focuses on …

TinyAgent: Quantization-aware Model Compression and Adaptation for On-device LLM Agent Deployment

J Kong, L Hu, F Ponzina, T Rosing - Workshop on Efficient Systems for … - openreview.net

Deploying LLMs on edge devices is challenging due to stringent memory resources and
compute constraints. In edge applications, existing deployment solutions for LLM agents …

Optimizing Large Language Model Scaling with Micro Batch Pipeline and Inference Parallelism

D Quan, R Wang, Z Lian, N Wang - 2024 - researchsquare.com

Natural language processing has seen transformative progress with the development of
sophisticated models capable of generating and understanding human language with high …