Small models are valuable plug-ins for large language models

C Xu, Y Xu, S Wang, Y Liu, C Zhu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) such as GPT-3 and GPT-4 are powerful, but their weights are
often publicly unavailable, and their immense sizes make the models difficult to tune with …

Multilingual machine translation with large language models: Empirical results and analysis

W Zhu, H Liu, Q Dong, J Xu, S Huang, L Kong… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable potential in handling
multilingual machine translation (MMT). In this paper, we systematically investigate the …

OS-Copilot: Towards generalist computer agents with self-improvement

Z Wu, C Han, Z Ding, Z Weng, Z Liu, S Yao… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous interaction with the computer has been a longstanding challenge with great
potential, and the recent proliferation of large language models (LLMs) has markedly …

MetaAdapt: Domain adaptive few-shot misinformation detection via meta learning

Z Yue, H Zeng, Y Zhang, L Shang, D Wang - arXiv preprint arXiv …, 2023 - arxiv.org
With emerging topics (e.g., COVID-19) on social media serving as a source for spreading
misinformation, overcoming the distributional shifts between the original training domain (i.e., …

Forward-backward reasoning in large language models for mathematical verification

W Jiang, H Shi, L Yu, Z Liu, Y Zhang, Z Li… - Findings of the …, 2024 - aclanthology.org
Self-Consistency samples diverse reasoning chains with answers and chooses the final
answer by majority voting. It is based on forward reasoning and cannot further improve …
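The Self-Consistency baseline this snippet describes reduces to a simple aggregation step: sample several reasoning chains, extract each chain's final answer, and keep the most frequent one. A minimal sketch of that majority vote (illustrative only; the sampling of chains themselves is assumed to happen upstream):

```python
from collections import Counter

def majority_vote(final_answers):
    """Self-Consistency-style aggregation: given the final answers extracted
    from several sampled reasoning chains, return the most common one."""
    counts = Counter(final_answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# Five hypothetical sampled chains, three of which agree on "42":
print(majority_vote(["42", "41", "42", "42", "40"]))  # → 42
```

Forward-backward verification, as the paper's title suggests, goes beyond this forward-only vote by additionally checking answers in reverse; the sketch covers only the baseline the snippet mentions.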

ChEF: A comprehensive evaluation framework for standardized assessment of multimodal large language models

Z Shi, Z Wang, H Fan, Z Yin, L Sheng, Y Qiao… - arXiv preprint arXiv …, 2023 - arxiv.org
Multimodal Large Language Models (MLLMs) have shown impressive abilities in interacting
with visual content with myriad potential downstream tasks. However, even though a list of …

LLMeBench: A flexible framework for accelerating LLMs benchmarking

F Dalvi, M Hasanain, S Boughorbel, B Mousi… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent development and success of Large Language Models (LLMs) necessitate an
evaluation of their performance across diverse NLP tasks in different languages. Although …

Language versatilists vs. specialists: An empirical revisiting on multilingual transfer ability

J Ye, X Tao, L Kong - arXiv preprint arXiv:2306.06688, 2023 - arxiv.org
Multilingual transfer ability, which reflects how well the models fine-tuned on one source
language can be applied to other languages, has been well studied in multilingual pre …

In-context demonstration selection with cross entropy difference

D Iter, R Pryzant, R Xu, S Wang, Y Liu, Y Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) can use in-context demonstrations to improve performance
on zero-shot tasks. However, selecting the best in-context examples is challenging because …

EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling

S Ren, Z Wu, KQ Zhu - arXiv preprint arXiv:2310.04691, 2023 - arxiv.org
Neural language models are probabilistic models of human text. They are predominantly
trained using maximum likelihood estimation (MLE), which is equivalent to minimizing the …
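The MLE training objective this snippet refers to is the standard negative log-likelihood of the training text under the model, which equals the cross-entropy between the one-hot data distribution and the model's per-token distribution. A toy illustration of that loss (hypothetical numbers, not the paper's code or its proposed earth-mover objective):

```python
import math

def nll_loss(step_probs, token_ids):
    """MLE training loss: summed negative log-probability the model assigns
    to each observed token, i.e. the forward cross-entropy between the
    empirical (one-hot) distribution and the model distribution."""
    return -sum(math.log(p[t]) for p, t in zip(step_probs, token_ids))

# Two decoding steps over a 3-token toy vocabulary:
probs = [[0.7, 0.2, 0.1],   # model distribution at step 1
         [0.1, 0.8, 0.1]]   # model distribution at step 2
loss = nll_loss(probs, [0, 1])  # -log(0.7) - log(0.8)
```

EMO's contribution, per the title, is to replace this objective with an earth-mover-distance-based one; the sketch only spells out the MLE baseline the snippet is contrasting against.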