DeBERTa: Decoding-enhanced BERT with disentangled attention

J Li, A Dada, B Puladi, J Kleesiek, J Egger - Computer Methods and …, 2024 - Elsevier

The recent release of ChatGPT, a chat bot research project/product of natural language
processing (NLP) by OpenAI, stirs up a sensation among both the general public and …

Cited by 175 · Related articles · All 8 versions

[HTML] sciencedirect.com

[HTML] Pre-trained language models and their applications

H Wang, J Li, H Wu, E Hovy, Y Sun - Engineering, 2023 - Elsevier

Pre-trained language models have achieved striking success in natural language
processing (NLP), leading to a paradigm shift from supervised learning to pre-training …

Cited by 197 · Related articles · All 2 versions

[PDF] arxiv.org

Llama 2: Open foundation and fine-tuned chat models

H Touvron, L Martin, K Stone, P Albert… - arXiv preprint arXiv …, 2023 - arxiv.org

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large
language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine …

Cited by 8266 · Related articles · All 2 versions

[PDF] arxiv.org

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org

Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

Cited by 934 · Related articles · All 5 versions

[PDF] aclanthology.org

Is ChatGPT a general-purpose natural language processing task solver?

C Qin, A Zhang, Z Zhang, J Chen, M Yasunaga… - arXiv preprint arXiv …, 2023 - arxiv.org

Spurred by advancements in scale, large language models (LLMs) have demonstrated the
ability to perform a variety of natural language processing (NLP) tasks zero-shot, i.e., without …

Cited by 614 · Related articles · All 4 versions

[PDF] neurips.cc

Mind2Web: Towards a generalist agent for the web

X Deng, Y Gu, B Zheng, S Chen… - Advances in …, 2024 - proceedings.neurips.cc

We introduce Mind2Web, the first dataset for developing and evaluating generalist
agents for the web that can follow language instructions to complete complex tasks on any …

Cited by 190 · Related articles · All 6 versions

[PDF] openreview.net

Unified-IO: A unified model for vision, language, and multi-modal tasks

J Lu, C Clark, R Zellers, R Mottaghi… - The Eleventh …, 2022 - openreview.net

We propose Unified-IO, a model that performs a large variety of AI tasks spanning classical
computer vision tasks, including pose estimation, object detection, depth estimation and …

Cited by 345 · Related articles · All 3 versions

[PDF] arxiv.org

Exploring the potential of large language models (LLMs) in learning on graphs

Z Chen, H Mao, H Li, W Jin, H Wen, X Wei… - ACM SIGKDD …, 2024 - dl.acm.org

Learning on Graphs has attracted immense attention due to its wide real-world applications.
The most popular pipeline for learning on graphs with textual node attributes primarily relies …

Cited by 186 · Related articles · All 9 versions

[PDF] aclanthology.org

Making language models better reasoners with step-aware verifier

Y Li, Z Lin, S Zhang, Q Fu, B Chen… - Proceedings of the …, 2023 - aclanthology.org

Few-shot learning is a challenging task that requires language models to generalize from
limited examples. Large language models like GPT-3 and PaLM have made impressive …

Cited by 118 · Related articles · All 2 versions

[PDF] arxiv.org

Unnatural instructions: Tuning language models with (almost) no human labor

O Honovich, T Scialom, O Levy, T Schick - arXiv preprint arXiv:2212.09689, 2022 - arxiv.org

Instruction tuning enables pretrained language models to perform new tasks from inference-
time natural language descriptions. These approaches rely on vast amounts of human …

Cited by 246 · Related articles · All 5 versions