Scaling laws for downstream task performance of large language models

B Isik, N Ponomareva, H Hazimeh, D Paparas… - arXiv preprint arXiv …, 2024 - arxiv.org
Scaling laws provide important insights that can guide the design of large language models
(LLMs). Existing work has primarily focused on studying scaling laws for pretraining …
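For context, a minimal sketch of the kind of curve fit used in scaling-law studies: fitting a saturating power law to (pretraining data size, downstream metric) points and extrapolating. The functional form and all data points below are illustrative assumptions, not the paper's actual law or results.

# Hypothetical sketch: fit loss(d) = a * d^(-b) + c to made-up
# (pretraining tokens, downstream loss) pairs and extrapolate.
import numpy as np
from scipy.optimize import curve_fit

def power_law(d, a, b, c):
    # Downstream loss as a function of pretraining data size d.
    return a * np.power(d, -b) + c

# Made-up observations: pretraining tokens vs. downstream loss.
tokens = np.array([1e8, 3e8, 1e9, 3e9, 1e10])
loss = np.array([3.1, 2.7, 2.4, 2.2, 2.05])

params, _ = curve_fit(power_law, tokens, loss, p0=(10.0, 0.2, 1.5), maxfev=10000)
a, b, c = params
print(f"fit: loss(d) = {a:.3g} * d^(-{b:.3g}) + {c:.3g}")
# Extrapolate the fitted law to a larger pretraining budget.
print("predicted loss at 1e11 tokens:", power_law(1e11, *params))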

Collaborative Performance Prediction for Large Language Models

Q Zhang, F Lyu, X Liu, C Ma - arXiv preprint arXiv:2407.01300, 2024 - arxiv.org
Comprehensively understanding and accurately predicting the performance of large
language models across diverse downstream tasks has emerged as a pivotal challenge in …
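One generic way to frame cross-model, cross-task performance prediction is as matrix completion over a (model x task) score matrix, in the style of collaborative filtering. The sketch below is only that generic framing with fabricated scores and a rank-2 factorization; the paper's actual method may differ.

# Illustrative sketch: complete a (model x task) score matrix with
# missing entries via a low-rank factorization, fit by gradient
# descent on the observed entries only.
import numpy as np

rng = np.random.default_rng(0)
scores = np.array([
    [0.62, 0.71, np.nan, 0.55],
    [0.58, np.nan, 0.49, 0.51],
    [np.nan, 0.80, 0.66, 0.63],
])  # rows: models, cols: tasks; NaN = not yet evaluated
mask = ~np.isnan(scores)

k = 2  # latent dimension (an assumption of this sketch)
U = 0.1 * rng.standard_normal((scores.shape[0], k))
V = 0.1 * rng.standard_normal((scores.shape[1], k))

lr, reg = 0.05, 1e-3
for _ in range(2000):
    pred = U @ V.T
    # Residuals only where a score was actually observed.
    err = np.where(mask, pred - np.nan_to_num(scores), 0.0)
    U -= lr * (err @ V + reg * U)
    V -= lr * (err.T @ U + reg * V)

print("completed score matrix:\n", np.round(U @ V.T, 3))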

Granularity is crucial when applying differential privacy to text: An investigation for neural machine translation

DNL Vu, T Igamberdiev, I Habernal - arXiv preprint arXiv:2407.18789, 2024 - arxiv.org
Applying differential privacy (DP) by means of the DP-SGD algorithm to protect individual
data points during training is becoming increasingly popular in NLP. However, the choice of …
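As a reminder of what DP-SGD does at each step, here is a minimal numpy sketch of its two defining operations, per-example gradient clipping and calibrated Gaussian noise, on a toy logistic-regression problem. The clip norm C and noise multiplier sigma are illustrative hyperparameters, not values from the paper.

# Minimal DP-SGD sketch: clip each per-example gradient to L2 norm C,
# sum, add Gaussian noise scaled by sigma * C, then take the SGD step.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 5))
y = (X @ rng.standard_normal(5) > 0).astype(float)

w = np.zeros(5)
C, sigma, lr, batch = 1.0, 1.0, 0.1, 32  # illustrative settings

for step in range(200):
    idx = rng.choice(len(X), size=batch, replace=False)
    xb, yb = X[idx], y[idx]
    probs = 1.0 / (1.0 + np.exp(-(xb @ w)))
    per_ex_grads = (probs - yb)[:, None] * xb  # one gradient per example
    # Clip each per-example gradient to L2 norm at most C.
    norms = np.linalg.norm(per_ex_grads, axis=1, keepdims=True)
    clipped = per_ex_grads / np.maximum(1.0, norms / C)
    # Sum, add calibrated Gaussian noise, then average over the batch.
    noisy = clipped.sum(axis=0) + sigma * C * rng.standard_normal(5)
    w -= lr * noisy / batch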