Natural language reasoning, a survey

F Yu, H Zhang, P Tiwari, B Wang - ACM Computing Surveys, 2024 - dl.acm.org
This survey article proposes a clearer view of Natural Language Reasoning (NLR) in the
field of Natural Language Processing (NLP), both conceptually and practically …

XTREME-R: Towards more challenging and nuanced multilingual evaluation

S Ruder, N Constant, J Botha, A Siddhant… - arXiv preprint arXiv …, 2021 - arxiv.org
Machine learning has brought striking advances in multilingual natural language processing
capabilities over the past year. For example, the latest techniques have improved the state …

MASSIVE: A 1M-example multilingual natural language understanding dataset with 51 typologically-diverse languages

J FitzGerald, C Hench, C Peris, S Mackie… - arXiv preprint arXiv …, 2022 - arxiv.org
We present the MASSIVE dataset--Multilingual Amazon SLU resource package (SLURP) for
Slot-filling, Intent classification, and Virtual assistant Evaluation. MASSIVE contains 1M …

ScandEval: A benchmark for Scandinavian natural language processing

DS Nielsen - arXiv preprint arXiv:2304.00906, 2023 - arxiv.org
This paper introduces a Scandinavian benchmarking platform, ScandEval, which can
benchmark any pretrained model on four different tasks in the Scandinavian languages. The …

This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish

L Augustyniak, K Tagowski… - Advances in …, 2022 - proceedings.neurips.cc
The availability of compute and data to train larger and larger language models increases
the demand for robust methods of benchmarking the true progress of LM training. Recent …

Superlim: A Swedish language understanding evaluation benchmark

A Berdičevskis, G Bouma, R Kurtz… - Proceedings of the …, 2023 - aclanthology.org
We present Superlim, a multi-task NLP benchmark and analysis platform for evaluating
Swedish language models, a counterpart to the English-language (Super) GLUE suite. We …

FarsTail: A Persian natural language inference dataset

H Amirkhani, M AzariJafari, S Faridan-Jahromi… - Soft Computing, 2023 - Springer
With the considerable achievements of data-hungry deep learning methods in natural
language processing tasks, a great amount of effort has been devoted to develop more …

PersianQuAD: The native question answering dataset for the Persian language

A Kazemi, J Mozafari, MA Nematbakhsh - IEEE Access, 2022 - ieeexplore.ieee.org
Developing Question Answering systems (QA) is one of the main goals in Artificial
Intelligence. With the advent of Deep Learning (DL) techniques, QA systems have witnessed …

Investigating the Challenges and Opportunities in Persian Language Information Retrieval through Standardized Data Collections and Deep Learning

S Moniri, T Schlosser, D Kowerko - Computers, 2024 - mdpi.com
The Persian language, also known as Farsi, is distinguished by its intricate morphological
richness, yet it contends with a paucity of linguistic resources. With an estimated 110 million …

FaBERT: Pre-training BERT on Persian Blogs

M Masumi, SS Majd, M Shamsfard, H Beigy - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce FaBERT, a Persian BERT-base model pre-trained on the HmBlogs corpus,
encompassing both informal and formal Persian texts. FaBERT is designed to excel in …