AirLetters: An Open Video Dataset of Characters Drawn in the Air

R Dagli, G Berger, J Materzynska, I Bax… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce AirLetters, a new video dataset consisting of real-world videos of human-generated, articulated motions. Specifically, our dataset requires a vision model to predict …

Live Fitness Coaching as a Testbed for Situated Interaction

S Panchal, A Bhattacharyya, G Berger… - arXiv preprint arXiv …, 2024 - arxiv.org
Tasks at the intersection of vision and language have had a profound impact in advancing the capabilities of vision-language models such as dialog-based assistants. However …