Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions

S Javaid, RA Khalil, N Saeed, B He… - arXiv preprint arXiv …, 2024 - arxiv.org
Integrated satellite, aerial, and terrestrial networks (ISATNs) represent a sophisticated
convergence of diverse communication technologies to ensure seamless connectivity …

Efficient training and inference: Techniques for large language models using Llama

SR Cunningham, D Archambault, A Kung - Authorea Preprints, 2024 - techrxiv.org
To enhance the efficiency of language models, it would involve optimizing their training and
inference processes to reduce computational demands while maintaining high performance …

On-Device Language Models: A Comprehensive Review

J Xu, Z Li, W Chen, Q Wang, X Gao, Q Cai… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of large language models (LLMs) revolutionized natural language processing
applications, and running LLMs on edge devices has become increasingly attractive for …

PerLLM: Personalized Inference Scheduling with Edge-Cloud Collaboration for Diverse LLM Services

Z Yang, Y Yang, C Zhao, Q Guo, W He, W Ji - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid growth in the number of large language model (LLM) users, it is difficult for
bandwidth-constrained cloud servers to simultaneously process massive LLM services in …

Integrating Qt and LLMs on the NVIDIA Jetson board for controlling a patient-assisting robot arm

S Hashemi - 2024 - oulurepo.oulu.fi
This thesis investigates the integration of generative artificial intelligence (GenAI) and large
language models (LLMs) with edge devices to improve patient care. The project focuses on …

TinyAgent: Quantization-aware Model Compression and Adaptation for On-device LLM Agent Deployment

J Kong, L Hu, F Ponzina, T Rosing - Workshop on Efficient Systems for … - openreview.net
Deploying LLMs on edge devices is challenging due to stringent memory resources and
compute constraints. In edge applications, existing deployment solutions for LLM agents …

Optimizing Large Language Model Scaling with Micro Batch Pipeline and Inference Parallelism

D Quan, R Wang, Z Lian, N Wang - 2024 - researchsquare.com
Natural language processing has seen transformative progress with the development of
sophisticated models capable of generating and understanding human language with high …