Evaluating large language models: A comprehensive survey
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …
Evaluating gender bias of pre-trained language models in natural language inference by considering all labels
P Anantaprayoon, M Kaneko, N Okazaki - arXiv preprint arXiv:2309.09697, 2023 - arxiv.org
Discriminatory social biases, including gender biases, have been found in Pre-trained
Language Models (PLMs). In Natural Language Inference (NLI), recent bias evaluation …
DUMB: A benchmark for smart evaluation of Dutch models
We introduce the Dutch Model Benchmark: DUMB. The benchmark includes a diverse set of
datasets for low-, medium- and high-resource tasks. The total set of nine tasks includes four …
From base to conversational: Japanese instruction dataset and tuning large language models
Instruction tuning is essential for large language models (LLMs) to become interactive. While
many instruction tuning datasets exist in English, there is a noticeable lack in other …
A survey of deep learning techniques for machine reading comprehension
Reading comprehension involves the process of reading and understanding textual
information in order to answer questions related to it. It finds practical applications in various …
HAE-RAE Bench: Evaluation of Korean knowledge in language models
Large Language Models (LLMs) trained on massive corpora demonstrate impressive
capabilities in a wide range of tasks. While there are ongoing efforts to adapt these models …
JCoLA: Japanese Corpus of Linguistic Acceptability
Neural language models have exhibited outstanding performance in a range of downstream
tasks. However, there is limited understanding regarding the extent to which these models …
JaColBERT and hard negatives, towards better Japanese-first embeddings for retrieval: Early technical report
B Clavié - arXiv preprint arXiv:2312.16144, 2023 - arxiv.org
Document retrieval in many languages has been largely relying on multi-lingual models,
and leveraging the vast wealth of English training data. In Japanese, the best performing …
Japanese SimCSE technical report
We report the development of Japanese SimCSE, Japanese sentence embedding models
fine-tuned with SimCSE. Since there is a lack of sentence embedding models for Japanese …
Aurora-M: The first open-source multilingual language model red-teamed according to the US Executive Order
Pretrained language models underpin several AI applications, but their high computational
cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to …