Counterfactual multi-agent policy gradients. J Foerster, G Farquhar, T Afouras, N Nardelli, S Whiteson. AAAI Conference on Artificial Intelligence, 2017. Cited by 2241.
Deep audio-visual speech recognition. T Afouras, JS Chung, A Senior, O Vinyals, A Zisserman. IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (12), 8717–…, 2018. Cited by 845.
Stabilising experience replay for deep multi-agent reinforcement learning. J Foerster, N Nardelli, G Farquhar, T Afouras, PHS Torr, P Kohli, et al. International Conference on Machine Learning (ICML), 2017. Cited by 748.
LRS3-TED: a large-scale dataset for visual speech recognition. T Afouras, JS Chung, A Zisserman. arXiv preprint arXiv:1809.00496, 2018. Cited by 420.
The conversation: Deep audio-visual speech enhancement. T Afouras, JS Chung, A Zisserman. INTERSPEECH, 2018. Cited by 409.
Self-Supervised Learning of Audio-Visual Objects from Video. T Afouras, A Owens, JS Chung, A Zisserman. European Conference on Computer Vision (ECCV), 2020. Cited by 258.
Localizing visual sounds the hard way. A Vedaldi, H Chen, W Xie, T Afouras, A Nagrani, A Zisserman. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. Cited by 188*.
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues. S Albanie, G Varol, L Momeni, T Afouras, JS Chung, N Fox, A Zisserman. European Conference on Computer Vision (ECCV), 2020. Cited by 180.
Spot the conversation: speaker diarisation in the wild. JS Chung, J Huh, A Nagrani, T Afouras, A Zisserman. INTERSPEECH, 2020. Cited by 158.
ASR is all you need: Cross-modal distillation for lip reading. T Afouras, JS Chung, A Zisserman. ICASSP 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020. Cited by 146.
Deep lip reading: a comparison of models and an online application. T Afouras, JS Chung, A Zisserman. INTERSPEECH, 2018. Cited by 130.
My lips are concealed: Audio-visual speech enhancement through obstructions. T Afouras, JS Chung, A Zisserman. arXiv preprint arXiv:1907.04975, 2019. Cited by 97.
Sub-word Level Lip Reading With Visual Attention. KR Prajwal, T Afouras, A Zisserman. arXiv preprint arXiv:2110.07603, 2021. Cited by 89.
Watch, read and lookup: learning to spot signs from multiple supervisors. L Momeni, G Varol, S Albanie, T Afouras, A Zisserman. Proceedings of the Asian Conference on Computer Vision (ACCV), 2020. Cited by 57.
Self-supervised object detection from audio-visual correspondence. T Afouras, YM Asano, F Fagan, A Vedaldi, F Metze. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Cited by 53.
Read and attend: Temporal localisation in sign language videos. G Varol, L Momeni, S Albanie, T Afouras, A Zisserman. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. Cited by 51.
Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives. K Grauman, A Westbury, L Torresani, K Kitani, J Malik, T Afouras, et al. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Cited by 49.
Seeing wake words: Audio-visual keyword spotting. L Momeni, T Afouras, T Stafylakis, S Albanie, A Zisserman. arXiv preprint arXiv:2009.01225, 2020. Cited by 49.
BBC-Oxford British Sign Language Dataset. S Albanie, G Varol, L Momeni, H Bull, T Afouras, H Chowdhury, N Fox, et al. arXiv preprint arXiv:2111.03635, 2021. Cited by 43.
Audio-visual synchronisation in the wild. H Chen, W Xie, T Afouras, A Nagrani, A Vedaldi, A Zisserman. arXiv preprint arXiv:2112.04432, 2021. Cited by 37.