SIMMC 2.0: A task-oriented dialog dataset for immersive multimodal conversations
Next generation task-oriented dialog systems need to understand conversational contexts
with their perceived surroundings, to effectively help users in the real-world multimodal …
Zero-shot dialogue state tracking via cross-task transfer
Zero-shot transfer learning for dialogue state tracking (DST) enables us to handle a variety
of task-oriented dialogue domains without the expense of collecting in-domain data. In this …
Overview of the ninth dialog system technology challenge: DSTC9
This paper introduces the Ninth Dialog System Technology Challenge (DSTC-9). This
edition of the DSTC focuses on applying end-to-end dialog technologies for four distinct …
State graph reasoning for multimodal conversational recommendation
Conversational recommendation system (CRS) attracts increasing attention in various
application domains such as retail and travel. It offers an effective way to capture users' …
BiTOD: A bilingual multi-domain dataset for task-oriented dialogue modeling
Task-oriented dialogue (ToD) benchmarks provide an important avenue to measure
progress and develop better conversational agents. However, existing datasets for end-to …
Visual language navigation: A survey and open challenges
SM Park, YG Kim - Artificial Intelligence Review, 2023 - Springer
With the recent development of deep learning, AI models are widely used in various
domains. AI models show good performance for definite tasks such as image classification …
Multimodal conversational AI: A survey of datasets and approaches
As humans, we experience the world with all our senses or modalities (sound, sight, touch,
smell, and taste). We use these modalities, particularly sight and touch, to convey and …
UniMF: A Unified Framework to Incorporate Multimodal Knowledge Bases into End-to-End Task-Oriented Dialogue Systems
Knowledge bases (KBs) are usually essential for building practical dialogue
systems. Recently we have seen rapidly growing interest in integrating knowledge bases …
A Textual Dataset for Situated Proactive Response Selection
Recent data-driven conversational models are able to return fluent, consistent, and
informative responses to many kinds of requests and utterances in task-oriented scenarios …
SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams
Building an AI assistant that can seamlessly converse and instruct humans, in a user-centric
situated scenario, requires several essential abilities: (1) spatial and temporal understanding …