Premise order matters in reasoning with large language models

X Chen, RA Chi, X Wang, D Zhou - arXiv preprint arXiv:2402.08939, 2024 - arxiv.org
Large language models (LLMs) have accomplished remarkable reasoning performance in
various domains. However, in the domain of reasoning tasks, we discover a frailty: LLMs are …

Mitigating Reversal Curse via Semantic-aware Permutation Training

Q Guo, R Wang, J Guo, X Tan, J Bian… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) have achieved impressive performance across diverse
tasks, recent studies showcase that causal LLMs suffer from the "reversal curse". It is a …

Did Translation Models Get More Robust Without Anyone Even Noticing?

B Peters, AFT Martins - arXiv preprint arXiv:2403.03923, 2024 - arxiv.org
Neural machine translation (MT) models achieve strong results across a variety of settings,
but it is widely believed that they are highly sensitive to "noisy" inputs, such as spelling …

EXCGEC: A Benchmark of Edit-wise Explainable Chinese Grammatical Error Correction

J Ye, S Qin, Y Li, X Cheng, L Qin, HT Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing studies explore the explainability of Grammatical Error Correction (GEC) in a limited
scenario, where they ignore the interaction between corrections and explanations. To bridge …

LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models

L Zhao, T Wei, L Zeng, C Cheng, L Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce LongSkywork, a long-context Large Language Model (LLM) capable of
processing up to 200,000 tokens. We provide a training recipe for efficiently extending …

Digitalisation Workflows in the Age of Transformer Models: A Case Study in Digital Cultural Heritage

M Vafaie, MA Tan, H Sack - 2024 - semdh.github.io
The advent of the transformer architecture revolutionised the field of Artificial Intelligence (AI)
and its various applications. It is only recently that digitalisation of cultural heritage data has …