AirLetters: An Open Video Dataset of Characters Drawn in the Air
We introduce AirLetters, a new video dataset consisting of real-world videos of human-
generated, articulated motions. Specifically, our dataset requires a vision model to predict …
Live Fitness Coaching as a Testbed for Situated Interaction
Tasks at the intersection of vision and language have had a profound impact in advancing
the capabilities of vision-language models such as dialog-based assistants. However …