Authors
Souvika Sarkar, Mohammad Fakhruddin Babar, Monowar Hasan, Shubhra Kanti Karmaker
Publication date
2024/1/29
Journal
arXiv preprint arXiv:2401.16577
Description
Large Language Models (LLMs) have demonstrated remarkable language understanding and generation capabilities. However, training, deploying, and accessing these models pose notable challenges, including resource-intensive demands, extended training durations, and scalability issues. To address these issues, we introduce the concept of a hierarchical, distributed LLM architecture that aims to enhance the accessibility and deployability of LLMs across heterogeneous computing platforms, including general-purpose computers (e.g., laptops) and IoT-style devices (e.g., embedded systems). By introducing a "layered" approach, the proposed architecture enables on-demand access to LLMs as a customizable service. This approach also ensures optimal trade-offs between the available computational resources and the user's application needs. We envision that the concept of a hierarchical LLM will empower extensive, crowd-sourced user bases to harness the capabilities of LLMs, thereby fostering advancements in AI technology in general.
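To make the "layered" idea concrete, below is a minimal sketch of how a client might pick a model tier in such a hierarchy: it chooses the smallest model that meets the task's quality needs and fits local resources, otherwise deferring to a more capable remote tier. All names here (ModelTier, select_tier, the tier table) are illustrative assumptions, not an API or algorithm from the paper.

```python
# Hypothetical illustration of hierarchical tier selection; not from the paper.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelTier:
    name: str           # e.g., "embedded", "laptop", "server"
    params_b: float     # model size in billions of parameters
    min_ram_gb: float   # RAM needed to run this tier locally
    quality: int        # coarse capability score (higher = better)

# A toy hierarchy spanning IoT-scale to server-scale models (assumed values).
TIERS = [
    ModelTier("embedded",  0.5,   1, 1),
    ModelTier("laptop",    7.0,  16, 2),
    ModelTier("server",   70.0, 160, 3),
]

def select_tier(local_ram_gb: float, required_quality: int) -> Optional[ModelTier]:
    """Return the cheapest tier meeting the quality bar, preferring local execution."""
    candidates = [t for t in TIERS if t.quality >= required_quality]
    if not candidates:
        return None
    # Prefer a tier that fits on the local device; otherwise the request
    # would be forwarded up the hierarchy to a more capable remote node.
    local = [t for t in candidates if t.min_ram_gb <= local_ram_gb]
    pool = local if local else candidates
    return min(pool, key=lambda t: t.params_b)

if __name__ == "__main__":
    tier = select_tier(local_ram_gb=16, required_quality=2)
    print(f"Selected tier: {tier.name if tier else 'none available'}")
```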
Total citations
Scholar articles
S Sarkar, MF Babar, M Hasan, SK Karmaker - arXiv preprint arXiv:2401.16577, 2024