A review on multimodal zero‐shot learning
Multimodal learning provides a path to fully utilize all types of information related to the
modeling target to provide the model with a global vision. Zero‐shot learning (ZSL) is a …
System transparency in shared autonomy: A mini review
V Alonso, P De La Puente - Frontiers in neurorobotics, 2018 - frontiersin.org
What does transparency mean in a shared autonomy framework? Different ways of
understanding system transparency in human-robot interaction can be found in the state of …
A survey on deep reinforcement learning for audio-based applications
Deep reinforcement learning (DRL) is poised to revolutionise the field of artificial intelligence
(AI) by endowing autonomous systems with high levels of understanding of the real world …
SAC: Semantic attention composition for text-conditioned image retrieval
The ability to efficiently search for images is essential for improving the user experiences
across various products. Incorporating user feedback, via multi-modal inputs, to navigate …
Coarse-to-fine fusion for language grounding in 3D navigation
We present a new network whereby an agent navigates in the 3D environment to find a
target object according to a language-based instruction. Such a task is challenging because …
Trace: Transform aggregate and compose visiolinguistic representations for image search with text feedback
The ability to efficiently search for images over an indexed database is the cornerstone for
several user experiences. Incorporating user feedback, through multi-modal inputs provide …
Multi-modal association based grouping for form structure extraction
Document structure extraction has been a widely researched area for decades. Recent work
in this direction has been deep learning-based, mostly focusing on extracting structure using …
Sound-Image Grounding Based Focusing Mechanism for Efficient Automatic Spoken Language Acquisition.
The process of spoken language acquisition based on sound-image grounding has been
one of the topics that has attracted the most significant interest of linguists and human …
Spoken language acquisition based on reinforcement learning and word unit segmentation
S Gao, W Hou, T Tanaka… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
The process of spoken-language acquisition has been one of the topics of greatest interest
to linguists for decades. By utilizing modern machine learning techniques, we simulated this …
DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding
Recent advances in computer vision (CV) and natural language processing have been
driven by exploiting big data on practical applications. However, these research fields are …