On the opportunities and risks of foundation models R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ... arXiv preprint arXiv:2108.07258, 2021 | 3236 | 2021 |
SST: Single-stream temporal action proposals S Buch, V Escorcia, C Shen, B Ghanem, JC Niebles Proceedings of the IEEE conference on Computer Vision and Pattern …, 2017 | 502 | 2017 |
End-to-end, single-stream temporal action detection in untrimmed videos S Buch, V Escorcia, B Ghanem, L Fei-Fei, JC Niebles Proceedings of the British Machine Vision Conference (BMVC), 2017 | 267 | 2017 |
iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes B Shen*, F Xia*, C Li*, R Martín-Martín*, L Fan, G Wang, S Buch, ... arXiv preprint arXiv:2012.02924, 2020 | 137 | 2020 |
Revisiting the "Video" in Video-Language Understanding S Buch, C Eyzaguirre, A Gaidon, J Wu, L Fei-Fei, JC Niebles Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 120 | 2022 |
BEHAVIOR: Benchmark for everyday household activities in virtual, interactive, and ecological environments S Srivastava, C Li, M Lingelbach, R Martín-Martín, F Xia, KE Vainio, Z Lian, ... Conference on Robot Learning, 477-490, 2022 | 117 | 2022 |
Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos DA Huang*, S Buch*, L Dery, A Garg, L Fei-Fei, JC Niebles Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2018 | 101 | 2018 |
On the opportunities and risks of foundation models R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ... arXiv preprint arXiv:2108.07258, 2021 | 77 | 2021 |
RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition L Fan*, S Buch*, G Wang, R Cao, Y Zhu, JC Niebles, L Fei-Fei Proceedings of the European Conference on Computer Vision (ECCV), 2020 | 74 | 2020 |
On the opportunities and risks of foundation models R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ... arXiv preprint arXiv:2108.07258, 2023 | 67 | 2023 |
The ActivityNet large-scale activity recognition challenge 2018 summary B Ghanem, JC Niebles, C Snoek, FC Heilbron, H Alwassel, V Escorcia, ... arXiv preprint arXiv:1808.03766, 2018 | 66 | 2018 |
ActivityNet challenge 2017 summary B Ghanem, JC Niebles, C Snoek, FC Heilbron, H Alwassel, R Krishna, ... arXiv preprint arXiv:1710.08011, 2017 | 59 | 2017 |
On the opportunities and risks of foundation models R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ... arXiv preprint arXiv:2108.07258 | 55 | 2022 |
End-to-end joint semantic segmentation of actors and actions in video J Ji, S Buch, A Soto, JC Niebles Proceedings of the European Conference on Computer Vision (ECCV), 702-717, 2018 | 50 | 2018 |
System and method for leveraging end-to-end driving models for improving driving task modules SD Buch, AD Gaidon US Patent 10,866,588, 2020 | 21 | 2020 |
Neural event semantics for grounded language understanding S Buch, L Fei-Fei, ND Goodman Transactions of the Association for Computational Linguistics 9, 875-890, 2021 | 8 | 2021 |
Streaming dense video captioning X Zhou, A Arnab, S Buch, S Yan, A Myers, X Xiong, A Nagrani, C Schmid Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 4 | 2024 |
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering J Min, S Buch, A Nagrani, M Cho, C Schmid Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 1 | 2024 |
Efficient Event Understanding in Videos and Language SD Buch Stanford University, 2022 | | 2022 |
Language identification and accent variation detection in spoken language recordings S Buch, J Gauthier, A Tsang | | |