- Academic resource search

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Cited by 55 · Related articles

[PDF] arxiv.org

Survey on large language model-enhanced reinforcement learning: Concept, taxonomy, and methods

Y Cao, H Zhao, Y Cheng, T Shu, Y Chen… - … on Neural Networks …, 2024 - ieeexplore.ieee.org

With extensive pretrained knowledge and high-level general capabilities, large language
models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in …

Cited by 30 · Related articles · All 2 versions

[PDF] arxiv.org

PaLM-E: An embodied multimodal language model

D Driess, F Xia, MSM Sajjadi, C Lynch… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models excel at a wide range of complex tasks. However, enabling general
inference in the real world, e.g., for robotics problems, raises the challenge of grounding. We …

Cited by 1498 · Related articles · All 6 versions

[PDF] neurips.cc

CAMEL: Communicative agents for "mind" exploration of large language model society

G Li, H Hammoud, H Itani… - Advances in Neural …, 2023 - proceedings.neurips.cc

The rapid advancement of chat-based language models has led to remarkable progress in
complex task-solving. However, their success heavily relies on human input to guide the …

Cited by 497 · Related articles · All 8 versions

[PDF] arxiv.org

VoxPoser: Composable 3D value maps for robotic manipulation with language models

W Huang, C Wang, R Zhang, Y Li, J Wu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) are shown to possess a wealth of actionable knowledge that
can be extracted for robot manipulation in the form of reasoning and planning. Despite the …

Cited by 419 · Related articles · All 6 versions

[PDF] arxiv.org

ReAct: Synergizing reasoning and acting in language models

S Yao, J Zhao, D Yu, N Du, I Shafran… - arXiv preprint arXiv …, 2022 - arxiv.org

While large language models (LLMs) have demonstrated impressive capabilities across
tasks in language understanding and interactive decision making, their abilities for …

Cited by 1926 · Related articles · All 6 versions

[PDF] arxiv.org

ProgPrompt: Generating situated robot task plans using large language models

I Singh, V Blukis, A Mousavian, A Goyal… - … on Robotics and …, 2023 - ieeexplore.ieee.org

Task planning can require defining myriad domain knowledge about the world in which a
robot needs to act. To ameliorate that effort, large language models (LLMs) can be used to …

Cited by 665 · Related articles · All 5 versions

[PDF] arxiv.org

Inner monologue: Embodied reasoning through planning with language models

W Huang, F Xia, T Xiao, H Chan, J Liang… - arXiv preprint arXiv …, 2022 - arxiv.org

Recent works have shown how the reasoning capabilities of Large Language Models
(LLMs) can be applied to domains beyond natural language processing, such as planning …

Cited by 833 · Related articles · All 5 versions

[PDF] arxiv.org

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arXiv preprint arXiv …, 2022 - arxiv.org

Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

Cited by 955 · Related articles · All 4 versions

[PDF] neurips.cc

ToolkenGPT: Augmenting frozen language models with massive tools via tool embeddings

S Hao, T Liu, Z Wang, Z Hu - Advances in neural …, 2023 - proceedings.neurips.cc

Integrating large language models (LLMs) with various tools has led to increased attention
in the field. Existing approaches either involve fine-tuning the LLM, which is both …

Cited by 116 · Related articles · All 8 versions