Human-like controllable image captioning with verb-specific semantic roles
Controllable Image Captioning (CIC)--generating image descriptions following
designated control signals--has received unprecedented attention over the last few years …
Situation recognition with graph neural networks
We address the problem of recognizing situations in images. Given an image, the task is to
predict the most salient verb (action), and fill its semantic roles such as who is performing the …
Grounding 'grounding' in NLP
The NLP community has seen substantial recent interest in grounding to facilitate interaction
between language technologies and the world. However, as a community, we use the term …
Language to Action: Towards Interactive Task Learning with Physical Agents.
Language communication plays an important role in human learning and
knowledge acquisition. With the emergence of a new generation of cognitive robots …
Finding "it": Weakly-supervised reference-aware visual grounding in instructional videos
Grounding textual phrases in visual content with standalone image-sentence pairs is a
challenging task. When we consider grounding in instructional videos, this problem …
Efficient grounding of abstract spatial concepts for natural language interaction with robot platforms
Our goal is to develop models that allow a robot to efficiently understand or “ground” natural
language instructions in the context of its world representation. Contemporary approaches …
Interactive learning of grounded verb semantics towards human-robot communication
L She, J Chai - Proceedings of the 55th Annual Meeting of the …, 2017 - aclanthology.org
To enable human-robot communication and collaboration, previous works represent
grounded verb semantics as the potential change of state to the physical world caused by …
Collaborative language grounding toward situated human-robot dialogue
To enable situated human-robot dialogue, techniques to support grounded language
communication are essential. One particular challenge is to ground human language to …
Rethinking the two-stage framework for grounded situation recognition
Grounded Situation Recognition (GSR), i.e., recognizing the salient activity (or verb)
category in an image (e.g., buying) and detecting all corresponding semantic roles (e.g., agent …
Unsupervised visual-linguistic reference resolution in instructional videos
We propose an unsupervised method for reference resolution in instructional videos, where
the goal is to temporally link an entity (e.g., "dressing") to the action (e.g., "mix yogurt") that …