A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

[HTML][HTML] A survey on text classification algorithms: From text to predictions

A Gasparetto, M Marcuzzo, A Zangari, A Albarelli - Information, 2022 - mdpi.com
In recent years, the exponential growth of digital documents has been met by rapid progress
in text classification techniques. Newly proposed machine learning algorithms leverage the …

Bloom: A 176b-parameter open-access multilingual language model

T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow… - 2023 - inria.hal.science
Large language models (LLMs) have been shown to be able to perform new tasks based on
a few demonstrations or natural language instructions. While these capabilities have led to …

Language model tokenizers introduce unfairness between languages

A Petrov, E La Malfa, P Torr… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent language models have shown impressive multilingual performance, even when not
explicitly trained for it. Despite this, there are concerns about the quality of their outputs …

Character-aware models improve visual text rendering

R Liu, D Garrette, C Saharia, W Chan… - arXiv preprint arXiv …, 2022 - arxiv.org
Current image generation models struggle to reliably produce well-formed visual text. In this
paper, we investigate a key contributing factor: popular text-to-image models lack character …

Clippo: Image-and-language understanding from pixels only

M Tschannen, B Mustafa… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Multimodal models are becoming increasingly effective, in part due to unified components,
such as the Transformer architecture. However, multimodal models still often consist of many …

Linguistically inspired roadmap for building biologically reliable protein language models

MH Vu, R Akbar, PA Robert, B Swiatczak… - Nature Machine …, 2023 - nature.com
Deep neural-network-based language models (LMs) are increasingly applied to large-scale
protein sequence data to predict protein function. However, being largely black-box models …

[HTML][HTML] mGPT: Few-Shot Learners Go Multilingual

O Shliazhko, A Fenogenova, M Tikhonova… - Transactions of the …, 2024 - direct.mit.edu
This paper introduces mGPT, a multilingual variant of GPT-3, pretrained on 61 languages
from 25 linguistically diverse language families using Wikipedia and the C4 Corpus. We …

The SIGMORPHON 2022 shared task on morpheme segmentation

K Batsuren, G Bella, A Arora, V Martinović… - arXiv preprint arXiv …, 2022 - arxiv.org
The SIGMORPHON 2022 shared task on morpheme segmentation challenged systems to
decompose a word into a sequence of morphemes and covered most types of morphology …

Text generation with text-editing models

E Malmi, Y Dong, J Mallinson, A Chuklin… - arXiv preprint arXiv …, 2022 - arxiv.org
Text-editing models have recently become a prominent alternative to seq2seq models for
monolingual text-generation tasks such as grammatical error correction, simplification, and …