PolyLM: An open source polyglot large language model
Large language models (LLMs) demonstrate remarkable ability to comprehend, reason, and
generate following natural language instructions. However, the development of LLMs has …
Generative language models for paragraph-level question generation
A Ushio, F Alva-Manchego… - arXiv preprint arXiv …, 2022 - arxiv.org
Powerful generative models have led to recent progress in question generation (QG).
However, it is difficult to measure advances in QG research since there are no standardized …
mGPT: Few-Shot Learners Go Multilingual
This paper introduces mGPT, a multilingual variant of GPT-3, pretrained on 61 languages
from 25 linguistically diverse language families using Wikipedia and the C4 Corpus. We …
Text embedding inversion security for multilingual language models
Textual data is often represented as real-numbered embeddings in NLP, particularly with the
popularity of large language models (LLMs) and Embeddings as a Service (EaaS) …
Interpretable long-form legal question answering with retrieval-augmented large language models
A Louis, G van Dijck, G Spanakis - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Many individuals are likely to face a legal dispute at some point in their lives, but their lack of
understanding of how to navigate these complex issues often renders them vulnerable. The …
IndicNLG benchmark: Multilingual datasets for diverse NLG tasks in Indic languages
A Kumar, H Shrotriya, P Sahu, R Dabre… - arXiv preprint arXiv …, 2022 - arxiv.org
Natural Language Generation (NLG) for non-English languages is hampered by the scarcity
of datasets in these languages. In this paper, we present the IndicNLG Benchmark, a …
Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution
Zero-shot cross-lingual transfer is when a multilingual model is trained to perform a task in
one language and then is applied to another language. Although the zero-shot cross-lingual …
SCALE: Scaling up the complexity for advanced language model evaluation
Recent strides in Large Language Models (LLMs) have saturated many NLP benchmarks
(even professional domain-specific ones), emphasizing the need for novel, more …
Little red riding hood goes around the globe: Crosslingual story planning and generation with large language models
Previous work has demonstrated the effectiveness of planning for story generation
exclusively in a monolingual setting focusing primarily on English. We consider whether …
LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English
T Santosh, C Weiss, M Grabmair - arXiv preprint arXiv:2410.09527, 2024 - arxiv.org
In the evolving NLP landscape, benchmarks serve as yardsticks for gauging progress.
However, existing Legal NLP benchmarks only focus on predictive tasks, overlooking …