- 学术资源搜索

A survey on large language model based autonomous agents

L Wang, C Ma, X Feng, Z Zhang, H Yang… - Frontiers of Computer …, 2024 - Springer

Autonomous agents have long been a research focus in academic and industry
communities. Previous research often focuses on training agents with limited knowledge …

被引用次数：471 相关文章所有 4 个版本

[PDF] acm.org

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

被引用次数：17 相关文章

[PDF] neurips.cc

Camel: Communicative agents for" mind" exploration of large language model society

G Li, H Hammoud, H Itani… - Advances in Neural …, 2023 - proceedings.neurips.cc

The rapid advancement of chat-based language models has led to remarkable progress in
complex task-solving. However, their success heavily relies on human input to guide the …

被引用次数：334 相关文章所有 8 个版本

[PDF] openreview.net

[PDF][PDF] Communicative agents for software development

C Qian, X Cong, C Yang, W Chen, Y Su… - arXiv preprint arXiv …, 2023 - openreview.net

Software engineering is a domain characterized by intricate decision-making processes,
often relying on nuanced intuition and consultation. Recent advancements in deep learning …

被引用次数：275 相关文章所有 3 个版本

[PDF] github.io

The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - arXiv preprint arXiv …, 2023 - arxiv.org

For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …

被引用次数：405 相关文章所有 4 个版本

[PDF] arxiv.org

Large language models for information retrieval: A survey

Y Zhu, H Yuan, S Wang, J Liu, W Liu, C Deng… - arXiv preprint arXiv …, 2023 - arxiv.org

As a primary means of information acquisition, information retrieval (IR) systems, such as
search engines, have integrated themselves into our daily lives. These systems also serve …

被引用次数：156 相关文章所有 3 个版本

[PDF] arxiv.org

Cognitive architectures for language agents

TR Sumers, S Yao, K Narasimhan… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent efforts have incorporated large language models (LLMs) with external resources (eg,
the Internet) or internal control flows (eg, prompt chaining) for tasks requiring grounding or …

被引用次数：114 相关文章所有 3 个版本

[PDF] thecvf.com

Dress: Instructing large vision-language models to align and interact with humans via natural language feedback

Y Chen, K Sikka, M Cogswell, H Ji… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present DRESS a large vision language model (LVLM) that innovatively exploits Natural
Language feedback (NLF) from Large Language Models to enhance its alignment and …

被引用次数：28 相关文章所有 3 个版本

[PDF] arxiv.org

Mint: Evaluating llms in multi-turn interaction with tools and language feedback

X Wang, Z Wang, J Liu, Y Chen, L Yuan… - arXiv preprint arXiv …, 2023 - arxiv.org

To solve complex tasks, large language models (LLMs) often require multiple rounds of
interactions with the user, sometimes assisted by external tools. However, current evaluation …

被引用次数：53 相关文章所有 3 个版本

[PDF] arxiv.org

Gpt-4v (ision) is a generalist web agent, if grounded

B Zheng, B Gou, J Kil, H Sun, Y Su - arXiv preprint arXiv:2401.01614, 2024 - arxiv.org

The recent development on large multimodal models (LMMs), especially GPT-4V (ision) and
Gemini, has been quickly expanding the capability boundaries of multimodal models …

被引用次数：57 相关文章所有 4 个版本