Surveying the MLLM landscape: A meta-review of current surveys

M Li, K Chen, Z Bi, M Liu, B Peng, Q Niu, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The rise of Multimodal Large Language Models (MLLMs) has become a transformative force
in the field of artificial intelligence, enabling machines to process and generate content …

Towards general computer control: A multimodal agent for Red Dead Redemption II as a case study

W Tan, Z Ding, W Zhang, B Li, B Zhou… - ICLR 2024 Workshop …, 2024 - openreview.net
Despite the success in specific tasks and scenarios, existing foundation agents, empowered
by large models (LMs) and advanced tools, still cannot generalize to different scenarios …

Multi-modal and multi-agent systems meet rationality: A survey

B Jiang, Y Xie, X Wang, WJ Su, CJ Taylor… - ICML 2024 Workshop …, 2024 - openreview.net
Rationality is characterized by logical thinking and decision-making that align with evidence
and logical rules. This quality is essential for effective problem-solving, as it ensures that …

ING-VP: MLLMs cannot play easy vision-based games yet

H Zhang, H Guo, S Guo, M Cao, W Huang, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
As multimodal large language models (MLLMs) continue to demonstrate increasingly
competitive performance across a broad spectrum of tasks, more intricate and …

PhysGame: Uncovering physical commonsense violations in gameplay videos

M Cao, H Tang, H Zhao, H Guo, J Liu, G Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in video-based large language models (Video LLMs) have witnessed
the emergence of diverse capabilities to reason and interpret dynamic visual content …

LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models

Y Zhang, S Mao, T Ge, X Wang, A de Wynter… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents a comprehensive survey of the current status and opportunities for
Large Language Models (LLMs) in strategic reasoning, a sophisticated form of reasoning …

On large language models in national security applications

WN Caballero, PR Jenkins - arXiv preprint arXiv:2407.03453, 2024 - arxiv.org
The overwhelming success of GPT-4 in early 2023 highlighted the transformative potential of
large language models (LLMs) across various sectors, including national security. This …

StraGo: Harnessing strategic guidance for prompt optimization

Y Wu, Y Gao, BB Zhu, Z Zhou, X Sun, S Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Prompt engineering is pivotal for harnessing the capabilities of large language models
(LLMs) across diverse applications. While existing prompt optimization methods improve …

Large Model Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends

Y Wang, Y Pan, Q Zhao, Y Deng, Z Su, L Du… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Model (LM) agents, powered by large foundation models such as GPT-4 and DALL-E
2, represent a significant step towards achieving Artificial General Intelligence (AGI). LM …

Odyssey: Empowering Minecraft Agents with Open-World Skills

S Liu, Y Li, K Zhang, Z Cui, W Fang, Y Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies have delved into constructing generalist agents for open-world environments
like Minecraft. Despite the encouraging results, existing efforts mainly focus on solving basic …