AssistGUI: Task-Oriented PC Graphical User Interface Automation
Graphical User Interface (GUI) automation holds significant promise for assisting
users with complex tasks, thereby boosting human productivity. Existing works leveraging …
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
Video creation has become increasingly popular, yet the expertise and effort required for
editing often pose barriers to beginners. In this paper, we explore the integration of large …
AI Assistance for UX: A Literature Review Through Human-Centered AI
Recent advancements in HCI and AI research attempt to support user experience (UX)
practitioners with AI-enabled tools. Despite the potential of emerging models and new …
Assistgui: Task-oriented desktop graphical user interface automation
Graphical User Interface (GUI) automation holds significant promise for assisting users with
complex tasks, thereby boosting human productivity. Existing works leveraging Large …
AndroidWorld: A dynamic benchmarking environment for autonomous agents
C Rawles, S Clinckemaillie, Y Chang, J Waltz… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous agents that execute human tasks by controlling computers can enhance
human productivity and application accessibility. Yet, progress in this field will be driven by …
AutoTask: Executing Arbitrary Voice Commands by Exploring and Learning from Mobile GUI
L Pan, B Wang, C Yu, Y Chen, X Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Voice command interfaces (VCIs) have gained increasing importance, enabling hands-free
and eyes-free interaction with digital devices. However, the inherent complexity in …
E-ANT: A Large-Scale Dataset for Efficient Automatic GUI NavigaTion
Online GUI navigation on mobile devices has drawn a lot of attention in recent years since it
contributes to many real-world applications. With the rapid development of large language …
Devil's Advocate: Anticipatory Reflection for LLM Agents
In this work, we introduce a novel approach that equips LLM agents with introspection,
enhancing consistency and adaptability in solving complex tasks. Our approach prompts …