Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives K Grauman, A Westbury, L Torresani, K Kitani, J Malik, T Afouras, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 69 | 2024 |
VideoLLM-online: Online Video Large Language Model for Streaming Video J Chen, Z Lv, S Wu, KQ Lin, C Song, D Gao, JW Liu, Z Gao, D Mao, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 6 | 2024 |
GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations M Ilaslan, C Song, J Chen, D Gao, W Lei, Q Xu, J Lim, M Shou Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023 | 5 | 2023 |