Augmented language models: a survey

G Mialon, R Dessì, M Lomeli, C Nalmpantis… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey reviews works in which language models (LMs) are augmented with reasoning
skills and the ability to use tools. The former is defined as decomposing a potentially …

WebArena: A realistic web environment for building autonomous agents

S Zhou, FF Xu, H Zhu, X Zhou, R Lo, A Sridhar… - arXiv preprint arXiv …, 2023 - arxiv.org
With advances in generative AI, there is now potential for autonomous agents to manage
daily tasks via natural language commands. However, current agents are primarily created …

AgentBench: Evaluating LLMs as agents

X Liu, H Yu, H Zhang, Y Xu, X Lei, H Lai, Y Gu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) are becoming increasingly smart and autonomous,
targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has …

WebShop: Towards scalable real-world web interaction with grounded language agents

S Yao, H Chen, J Yang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Most existing benchmarks for grounding language in interactive environments either lack
realistic linguistic elements, or prove difficult to scale up due to substantial human …

A data-driven approach for learning to control computers

PC Humphreys, D Raposo, T Pohlen… - International …, 2022 - proceedings.mlr.press
It would be useful for machines to use computers as humans do so that they can aid us in
everyday tasks. This is a setting in which there is also the potential to leverage large-scale …

Understanding HTML with large language models

I Gur, O Nachum, Y Miao, M Safdari, A Huang… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) have shown exceptional performance on a variety of natural
language tasks. Yet, their capabilities for HTML understanding, i.e., parsing the raw HTML of …

Personal LLM agents: Insights and survey about the capability, efficiency and security

Y Li, H Wen, W Wang, X Li, Y Yuan, G Liu, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Since the advent of personal computing devices, intelligent personal assistants (IPAs) have
been one of the key technologies that researchers and engineers have focused on, aiming …

AssistGUI: Task-Oriented PC Graphical User Interface Automation

D Gao, L Ji, Z Bai, M Ouyang, P Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
Graphical User Interface (GUI) automation holds significant promise for assisting
users with complex tasks thereby boosting human productivity. Existing works leveraging …

Never-ending learning of user interfaces

J Wu, R Krosnick, E Schoop, A Swearngin… - Proceedings of the 36th …, 2023 - dl.acm.org
Machine learning models have been trained to predict semantic information about user
interfaces (UIs) to make apps more accessible, easier to test, and to automate. Currently …

Synapse: Trajectory-as-exemplar prompting with memory for computer control

L Zheng, R Wang, X Wang, B An - The Twelfth International …, 2023 - openreview.net
Building agents with large language models (LLMs) for computer control is a burgeoning
research area, where the agent receives computer states and performs actions to complete …