Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Natural language generation and understanding of big code for AI-assisted programming: A review

MF Wong, S Guo, CN Hang, SW Ho, CW Tan - Entropy, 2023 - mdpi.com
This paper provides a comprehensive review of the literature concerning the utilization of
Natural Language Processing (NLP) techniques, with a particular focus on transformer …

Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation

J Liu, CS Xia, Y Wang, L Zhang - Advances in Neural …, 2024 - proceedings.neurips.cc
Program synthesis has been long studied with recent approaches focused on directly using
the power of Large Language Models (LLMs) to generate code. Programming benchmarks …

C-pack: Packed resources for general chinese embeddings

S Xiao, Z Liu, P Zhang, N Muennighoff, D Lian… - Proceedings of the 47th …, 2024 - dl.acm.org
We introduce C-Pack, a package of resources that significantly advances the field of general
text embeddings for Chinese. C-Pack includes three critical resources. 1) C-MTP is a …

Starcoder: may the source be with you!

R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov… - arXiv preprint arXiv …, 2023 - arxiv.org
The BigCode community, an open-scientific collaboration working on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder and …

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

Scaling data-constrained language models

N Muennighoff, A Rush, B Barak… - Advances in …, 2023 - proceedings.neurips.cc
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

Textbooks are all you need

S Gunasekar, Y Zhang, J Aneja, CCT Mendes… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce phi-1, a new large language model for code, with significantly smaller size
than competing models: phi-1 is a Transformer-based model with 1.3 B parameters, trained …

MTEB: Massive text embedding benchmark

N Muennighoff, N Tazi, L Magne, N Reimers - arXiv preprint arXiv …, 2022 - arxiv.org
Text embeddings are commonly evaluated on a small set of datasets from a single task not
covering their possible applications to other tasks. It is unclear whether state-of-the-art …