XTREME-R: Towards more challenging and nuanced multilingual evaluation
Machine learning has brought striking advances in multilingual natural language processing
capabilities over the past year. For example, the latest techniques have improved the state …
MASSIVE: A 1M-example multilingual natural language understanding dataset with 51 typologically-diverse languages
We present the MASSIVE dataset--Multilingual Amazon SLU resource package (SLURP) for
Slot-filling, Intent classification, and Virtual assistant Evaluation. MASSIVE contains 1M …
ScandEval: A benchmark for Scandinavian natural language processing
DS Nielsen - arXiv preprint arXiv:2304.00906, 2023 - arxiv.org
This paper introduces a Scandinavian benchmarking platform, ScandEval, which can
benchmark any pretrained model on four different tasks in the Scandinavian languages. The …
This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
L Augustyniak, K Tagowski… - Advances in …, 2022 - proceedings.neurips.cc
The availability of compute and data to train larger and larger language models increases
the demand for robust methods of benchmarking the true progress of LM training. Recent …
Superlim: A Swedish language understanding evaluation benchmark
We present Superlim, a multi-task NLP benchmark and analysis platform for evaluating
Swedish language models, a counterpart to the English-language (Super)GLUE suite. We …
FarsTail: A Persian natural language inference dataset
With the considerable achievements of data-hungry deep learning methods in natural
language processing tasks, a great amount of effort has been devoted to develop more …
PersianQuAD: the native question answering dataset for the Persian language
Developing Question Answering systems (QA) is one of the main goals in Artificial
Intelligence. With the advent of Deep Learning (DL) techniques, QA systems have witnessed …
Investigating the Challenges and Opportunities in Persian Language Information Retrieval through Standardized Data Collections and Deep Learning
The Persian language, also known as Farsi, is distinguished by its intricate morphological
richness, yet it contends with a paucity of linguistic resources. With an estimated 110 million …
FaBERT: Pre-training BERT on Persian Blogs
We introduce FaBERT, a Persian BERT-base model pre-trained on the HmBlogs corpus,
encompassing both informal and formal Persian texts. FaBERT is designed to excel in …