From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng… - arXiv preprint arXiv …, 2023 - researchgate.net
Large Language Models (LLMs) have demonstrated remarkable capabilities in
important tasks such as natural language understanding, language generation, and …

Scaling vision-language models with sparse mixture of experts

S Shen, Z Yao, C Li, T Darrell, K Keutzer… - arXiv preprint arXiv …, 2023 - arxiv.org
The field of natural language processing (NLP) has made significant strides in recent years,
particularly in the development of large-scale vision-language models (VLMs). These …

Trends and challenges of real-time learning in large language models: A critical review

M Jovanovic, P Voss - arXiv preprint arXiv:2404.18311, 2024 - arxiv.org
Real-time learning concerns the ability of learning systems to acquire knowledge over time,
enabling their adaptation and generalization to novel tasks. It is a critical ability for …

Conpet: Continual parameter-efficient tuning for large language models

C Song, X Han, Z Zeng, K Li, C Chen, Z Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Continual learning necessitates the continual adaptation of models to newly emerging tasks
while minimizing the catastrophic forgetting of old ones. This is extremely challenging for …

Enable language models to implicitly learn self-improvement from data

Z Wang, L Hou, T Lu, Y Wu, Y Li, H Yu, H Ji - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in open-ended
text generation tasks. However, the inherent open-ended nature of these tasks implies that …

TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese

NK Corrêa, S Falk, S Fatimah, A Sen… - Machine Learning with …, 2024 - Elsevier
Large language models (LLMs) have significantly advanced natural language processing,
but their progress has yet to be equal across languages. While most LLMs are trained in …

Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models

B Pan, Y Shen, H Liu, M Mishra, G Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Mixture-of-Experts (MoE) language models can reduce computational costs by 2-4×
compared to dense models without sacrificing performance, making them more efficient in …

Pushing The Limit of LLM Capacity for Text Classification

Y Zhang, M Wang, C Ren, Q Li, P Tiwari… - arXiv preprint arXiv …, 2024 - arxiv.org
The future research value of text classification has encountered challenges and
uncertainties, due to the extraordinary efficacy demonstrated by large language models …

From Automation to Augmentation: Large Language Models Elevating Essay Scoring Landscape

C Xiao, W Ma, SX Xu, K Zhang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Receiving immediate and personalized feedback is crucial for second-language learners,
and Automated Essay Scoring (AES) systems are a vital resource when human instructors …