A survey on large language model based autonomous agents
Autonomous agents have long been a research focus in academic and industry
communities. Previous research often focuses on training agents with limited knowledge …
Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
CAMEL: Communicative agents for "mind" exploration of large language model society
The rapid advancement of chat-based language models has led to remarkable progress in
complex task-solving. However, their success heavily relies on human input to guide the …
Communicative agents for software development
Software engineering is a domain characterized by intricate decision-making processes,
often relying on nuanced intuition and consultation. Recent advancements in deep learning …
The rise and potential of large language model based agents: A survey
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …
Large language models for information retrieval: A survey
As a primary means of information acquisition, information retrieval (IR) systems, such as
search engines, have integrated themselves into our daily lives. These systems also serve …
Cognitive architectures for language agents
Recent efforts have incorporated large language models (LLMs) with external resources (e.g.,
the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or …
DRESS: Instructing large vision-language models to align and interact with humans via natural language feedback
We present DRESS, a large vision language model (LVLM) that innovatively exploits Natural
Language feedback (NLF) from Large Language Models to enhance its alignment and …
MINT: Evaluating LLMs in multi-turn interaction with tools and language feedback
To solve complex tasks, large language models (LLMs) often require multiple rounds of
interactions with the user, sometimes assisted by external tools. However, current evaluation …
GPT-4V(ision) is a generalist web agent, if grounded
The recent development of large multimodal models (LMMs), especially GPT-4V(ision) and
Gemini, has been quickly expanding the capability boundaries of multimodal models …