Human-guided reinforcement learning with sim-to-real transfer for autonomous navigation
Reinforcement learning (RL) is a promising approach in unmanned ground vehicles (UGVs)
applications, but limited computing resource makes it challenging to deploy a well-behaved …
Frequency-enhanced data augmentation for vision-and-language navigation
Vision-and-Language Navigation (VLN) is a challenging task that requires an agent
to navigate through complex environments based on natural language instructions. In …
Multimodal transformer with variable-length memory for vision-and-language navigation
Vision-and-Language Navigation (VLN) is a task that an agent is required to follow
a language instruction to navigate to the goal position, which relies on the ongoing …
Memory-adaptive vision-and-language navigation
Vision-and-Language Navigation (VLN) requests an agent to navigate in 3D
environments following given instructions, where history is critical for decision-making in …
3d question answering with scene graph reasoning
3DQA has gained considerable attention due to its enhanced spatial understanding
capabilities compared to image-based VQA. However, existing 3DQA methods have …
Pasts: Progress-aware spatio-temporal transformer speaker for vision-and-language navigation
Vision-and-language navigation (VLN) is a crucial but challenging cross-modal navigation
task. One powerful technique to enhance the generalization performance in VLN is the use …
Incorporating external knowledge reasoning for vision-and-language navigation with assistant's help
X Li, Y Zhang, W Yuan, J Luo - Applied Sciences, 2022 - mdpi.com
Vision-and-Language Navigation (VLN) is a task designed to enable embodied agents carry
out natural language instructions in realistic environments. Most VLN tasks, however, are …
Data Optimization in Deep Learning: A Survey
O Wu, R Yao - arXiv preprint arXiv:2310.16499, 2023 - arxiv.org
Large-scale, high-quality data are considered an essential factor for the successful
application of many deep learning techniques. Meanwhile, numerous real-world deep …
DOROTHIE: Spoken dialogue for handling unexpected situations in interactive autonomous driving agents
In the real world, autonomous driving agents navigate in highly dynamic environments full of
unexpected situations where pre-trained models are unreliable. In these situations, what is …
Heterogeneous Embodied Multi-Agent Collaboration
Multi-agent embodied tasks have been studied in indoor visual environments, but most of
the existing research focuses on homogeneous multi-agent tasks. Heterogeneous multi …