PaLI: A jointly-scaled multilingual language-image model X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ... arXiv preprint arXiv:2209.06794, 2022 | 502 | 2022 |
Representation flow for action recognition AJ Piergiovanni, MS Ryoo Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019 | 194 | 2019 |
Evolving losses for unsupervised video representation learning AJ Piergiovanni, A Angelova, MS Ryoo Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020 | 159 | 2020 |
F-VLM: Open-vocabulary object detection upon frozen vision and language models W Kuo, Y Cui, X Gu, AJ Piergiovanni, A Angelova arXiv preprint arXiv:2209.15639, 2022 | 141 | 2022 |
TokenLearner: Adaptive space-time tokenization for videos M Ryoo, AJ Piergiovanni, A Arnab, M Dehghani, A Angelova Advances in neural information processing systems 34, 12786-12797, 2021 | 134 | 2021 |
AssembleNet: Searching for multi-stream neural connectivity in video architectures MS Ryoo, AJ Piergiovanni, M Tan, A Angelova arXiv preprint arXiv:1905.13209, 2019 | 114 | 2019 |
PaLI-X: On scaling up a multilingual vision and language model X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ... arXiv preprint arXiv:2305.18565, 2023 | 110 | 2023 |
TokenLearner: What can 8 learned tokens do for images and videos? MS Ryoo, AJ Piergiovanni, A Arnab, M Dehghani, A Angelova arXiv preprint arXiv:2106.11297, 2021 | 110 | 2021 |
Learning latent super-events to detect multiple activities in videos AJ Piergiovanni, MS Ryoo Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2018 | 106 | 2018 |
Temporal Gaussian mixture layer for videos AJ Piergiovanni, M Ryoo International Conference on Machine Learning, 5152-5161, 2019 | 102 | 2019 |
Fine-grained activity recognition in baseball videos AJ Piergiovanni, MS Ryoo Proceedings of the IEEE conference on computer vision and pattern …, 2018 | 90 | 2018 |
Evolving space-time neural architectures for videos AJ Piergiovanni, A Angelova, A Toshev, MS Ryoo Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019 | 82 | 2019 |
4D-Net for learned multi-modal alignment AJ Piergiovanni, V Casser, MS Ryoo, A Angelova Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 70 | 2021 |
Learning latent subevents in activity videos using temporal attention filters A Piergiovanni, C Fan, M Ryoo Proceedings of the AAAI Conference on Artificial Intelligence 31 (1), 2017 | 62 | 2017 |
Rethinking video ViTs: Sparse video tubes for joint image and video learning AJ Piergiovanni, W Kuo, A Angelova Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 57 | 2023 |
AttentionNAS: Spatiotemporal attention cell search for video classification X Wang, X Xiong, M Neumann, AJ Piergiovanni, MS Ryoo, A Angelova, ... Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 53 | 2020 |
Tiny video networks AJ Piergiovanni, A Angelova, MS Ryoo Applied AI Letters 3 (1), e38, 2022 | 51 | 2022 |
AssembleNet++: Assembling modality representations via attention connections MS Ryoo, AJ Piergiovanni, J Kangaspunta, A Angelova Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 48 | 2020 |
Learning real-world robot policies by dreaming AJ Piergiovanni, A Wu, MS Ryoo 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2019 | 41 | 2019 |
AViD dataset: Anonymized videos from diverse countries AJ Piergiovanni, M Ryoo Advances in Neural Information Processing Systems 33, 16711-16721, 2020 | 40 | 2020 |