LogiQA 2.0—an improved dataset for logical reasoning in natural language understanding

From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org

This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Cited by 87 · Related articles · All 3 versions

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

Cited by 80 · Related articles · All 2 versions

Rephrasing the web: A recipe for compute and data-efficient language modeling

P Maini, S Seto, H Bai, D Grangier, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models are trained on massive scrapes of the web, which are often
unstructured, noisy, and poorly phrased. Current scaling laws show that learning from such …

Cited by 18 · Related articles · All 4 versions

A review: Insight into smart and sustainable ultra-precision machining augmented by intelligent IoT

Z Xu, T Zhu, FL Luo, B Zhang, H Poon, WS Yip… - Journal of Manufacturing …, 2024 - Elsevier

Ultra-precision machining (UPM), which is capable of fabricating micro-components
with less than 0.2 µm forming accuracy and 10 nm surface accuracy, is becoming …

Cited by 5 · Related articles

SeaEval for multilingual foundation models: From cross-lingual alignment to cultural reasoning

B Wang, Z Liu, X Huang, F Jiao, Y Ding, AT Aw… - arXiv preprint arXiv …, 2023 - arxiv.org

We present SeaEval, a benchmark for multilingual foundation models. In addition to
characterizing how these models understand and reason with natural language, we also …

Cited by 9 · Related articles · All 4 versions

Towards LogiGLUE: A brief survey and a benchmark for analyzing logical reasoning capabilities of language models

M Luo, S Kumbhar, M Parmar, N Varshney… - arXiv preprint arXiv …, 2023 - arxiv.org

Logical reasoning is fundamental for humans yet presents a substantial challenge in the
domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and …

Cited by 9 · Related articles · All 2 versions

When LLMs meet cunning questions: A fallacy understanding benchmark for large language models

Y Li, Q Zhou, Y Luo, S Ma, Y Li, HT Zheng, X Hu… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently, Large Language Models (LLMs) have made remarkable evolutions in language
understanding and generation. Following this, various benchmarks for measuring all kinds …

Cited by 11 · Related articles · All 2 versions

Self-playing Adversarial Language Game Enhances LLM Reasoning

P Cheng, T Hu, H Xu, Z Zhang, Y Dai, L Han… - arXiv preprint arXiv …, 2024 - arxiv.org

We explore the self-play training procedure of large language models (LLMs) in a two-player
adversarial language game called Adversarial Taboo. In this game, an attacker and a …

Cited by 5 · Related articles · All 2 versions

An embedded end-to-end voice assistant

L Lazzaroni, F Bellotti, R Berta - Engineering Applications of Artificial …, 2024 - Elsevier

Voice assistants are spreading in various environments, such as houses and cars, bringing
the possibility of controlling heterogeneous Internet of Things devices with simple voice …

Cited by 1 · Related articles

Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model

H Zhang, Y Wu, D Li, Z Yang, R Zhao, Y Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org

Aligned Large Language Models (LLMs) showcase remarkable versatility, capable of
handling diverse real-world tasks. Meanwhile, aligned LLMs are also expected to exhibit …

Cited by 5 · Related articles · All 2 versions