Speed/accuracy Trade-offs for Modern Convolutional Object Detectors | J Huang, V Rathod, C Sun, M Zhu, A Korattikara, A Fathi, I Fischer, ... | CVPR 2017 | 3455 | 2017 |
OpenImages: A public dataset for large-scale multi-label and multi-class image classification | I Krasin, T Duerig, N Alldrin, A Veit, S Abu-El-Haija, S Belongie, D Cai, ... | github.com/openimages | 3257* | 2016 |
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era | C Sun, A Shrivastava, S Singh, A Gupta | ICCV 2017 | 2845 | 2017 |
ViViT: A Video Vision Transformer | A Arnab, M Dehghani, G Heigold, C Sun, M Lučić, C Schmid | ICCV 2021 | 1993 | 2021 |
Rethinking Spatiotemporal Feature Learning For Video Understanding | S Xie, C Sun, J Huang, Z Tu, K Murphy | ECCV 2018 | 1786* | 2018 |
The iNaturalist species classification and detection dataset | G Van Horn, O Mac Aodha, Y Song, Y Cui, C Sun, A Shepard, H Adam, ... | CVPR 2018 | 1562* | 2018 |
VideoBERT: A Joint Model for Video and Language Representation Learning | C Sun, A Myers, C Vondrick, K Murphy, C Schmid | ICCV 2019 | 1330 | 2019 |
What makes for good views for contrastive learning | Y Tian, C Sun, B Poole, D Krishnan, C Schmid, P Isola | NeurIPS 2020 | 1249 | 2020 |
AVA: A video dataset of spatio-temporally localized atomic visual actions | C Gu, C Sun, DA Ross, C Vondrick, C Pantofaru, Y Li, ... | CVPR 2018 | 1107 | 2018 |
TALL: Temporal Activity Localization via Language Query | J Gao, C Sun, Z Yang, R Nevatia | ICCV 2017 | 740 | 2017 |
VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation | J Gao*, C Sun*, H Zhao, Y Shen, D Anguelov, C Li, C Schmid | CVPR 2020 | 713 | 2020 |
Multi-modal Transformer for Video Retrieval | V Gabeur, C Sun, K Alahari, C Schmid | ECCV 2020 | 634 | 2020 |
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning | Y Cui, Y Song, C Sun, A Howard, S Belongie | CVPR 2018 | 561 | 2018 |
TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals | J Gao, Z Yang, C Sun, K Chen, R Nevatia | ICCV 2017 | 532 | 2017 |
Attention Bottlenecks for Multimodal Fusion | A Nagrani, S Yang, A Arnab, A Jansen, C Schmid, C Sun | NeurIPS 2021 | 508 | 2021 |
TNT: Target-driveN Trajectory Prediction | H Zhao, J Gao, T Lan, C Sun, B Sapp, B Varadarajan, Y Shen, Y Shen, ... | CoRL 2020 | 461 | 2020 |
DenseTNT: End-to-end Trajectory Prediction from Dense Goal Sets | J Gu, C Sun, H Zhao | ICCV 2021 | 350 | 2021 |
Composing Text and Image for Image Retrieval - An Empirical Odyssey | N Vo, L Jiang, C Sun, K Murphy, LJ Li, L Fei-Fei, J Hays | CVPR 2019 | 349 | 2019 |
Multiview transformers for video recognition | S Yan, X Xiong, A Arnab, Z Lu, M Zhang, C Sun, C Schmid | CVPR 2022 | 245 | 2022 |
Actor-Centric Relation Network | C Sun, A Shrivastava, C Vondrick, K Murphy, R Sukthankar, C Schmid | ECCV 2018 | 236 | 2018 |