David Harwath: Academic Profile

Cited by

	All	Since 2019
Citations	2421	2139
h-index	25	25
i10-index	37	33

Citations per year

Year	Citations
2013	14
2014	19
2015	30
2016	41
2017	69
2018	100
2019	180
2020	244
2021	320
2022	381
2023	571
2024	438

Public access

Available: 5 articles
Not available: 0 articles

Based on funder open-access mandates

Co-authors

James Glass: MIT Computer Science and Artificial Intelligence Laboratory (verified email at mit.edu)
Rogerio Feris: Research Manager, MIT-IBM Watson AI Lab (verified email at us.ibm.com)
Puyuan Peng: Research Intern, Meta; PhD student, The University of Texas at Austin (verified email at utexas.edu)
Hilde Kuehne: University of Bonn, MIT-IBM Watson Lab (verified email at uni-bonn.de)
Samuel Thomas: IBM Research AI (verified email at us.ibm.com)
Andrew Rouditchenko: PhD Student at MIT CSAIL (verified email at mit.edu)
Antonio Torralba: Professor of Computer Science, MIT (verified email at csail.mit.edu)
Brian Kingsbury: Distinguished Research Staff Member and Manager, IBM T. J. Watson Research Center, Yorktown Heights (verified email at us.ibm.com)
Angie Boggust: Massachusetts Institute of Technology (verified email at mit.edu)
Michael Alan Picheny: NYU - Courant CS and CDS (verified email at nyu.edu)
Wei-Ning Hsu: Facebook AI Research (FAIR) (verified email at csail.mit.edu)
Alexander H. Liu: Massachusetts Institute of Technology (verified email at mit.edu)
Layne Berry: PhD Student, University of Texas at Austin (verified email at utexas.edu)
Nina Shvetsova: University of Bonn (verified email at uni-frankfurt.de)
Rameswar Panda: Research Scientist, MIT-IBM Watson AI Lab (verified email at ibm.com)
Brian Chen: Columbia University (verified email at columbia.edu)
Galen Chuang: UC Berkeley (verified email at berkeley.edu)
Eunsol Choi: The University of Texas at Austin (verified email at utexas.edu)
Adrià Recasens: Research Scientist, DeepMind (verified email at google.com)
Dídac Surís: PhD student, Columbia University (verified email at columbia.edu)

David Harwath

The University of Texas at Austin

Verified email at utexas.edu

Speech and Language Processing, Computer Vision, Natural Language Processing, Artificial Intelligence, Machine Learning


Title	Authors	Venue	Cited by	Year
Unsupervised learning of spoken language with visual context	D Harwath, A Torralba, J Glass	Advances in neural information processing systems 29, 2016	291	2016
Jointly discovering visual objects and spoken words from raw sensory input	D Harwath, A Recasens, D Surís, G Chuang, A Torralba, J Glass	Proceedings of the European conference on computer vision (ECCV), 649-665, 2018	234	2018
Deep multimodal semantic embeddings for speech and images	D Harwath, J Glass	2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU …, 2015	180	2015
AVLnet: Learning audio-visual language representations from instructional videos	A Rouditchenko, A Boggust, D Harwath, B Chen, D Joshi, S Thomas, ...	arXiv preprint arXiv:2006.09199, 2020	141	2020
Everything at once - multi-modal fusion transformer for video retrieval	N Shvetsova, B Chen, A Rouditchenko, S Thomas, B Kingsbury, RS Feris, ...	Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	136	2022
A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition	A Jansen, E Dupoux, S Goldwater, M Johnson, S Khudanpur, K Church, ...	2013 IEEE International Conference on Acoustics, Speech and Signal …, 2013	120	2013
Learning word-like units from joint audio-visual analysis	D Harwath, JR Glass	arXiv preprint arXiv:1701.07481, 2017	118	2017
Learning hierarchical discrete linguistic units from visually-grounded speech	D Harwath, WN Hsu, J Glass	arXiv preprint arXiv:1911.09602, 2019	99	2019
Contrastive audio-visual masked autoencoder	Y Gong, A Rouditchenko, AH Liu, D Harwath, L Karlinsky, H Kuehne, ...	arXiv preprint arXiv:2210.07839, 2022	95	2022
MAE-AST: Masked autoencoding audio spectrogram transformer	A Baade, P Peng, D Harwath	arXiv preprint arXiv:2203.16691, 2022	84	2022
Multimodal clustering networks for self-supervised learning from unlabeled videos	B Chen, A Rouditchenko, K Duarte, H Kuehne, S Thomas, A Boggust, ...	Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021	78	2021
Text-free image-to-speech synthesis using learned segmental units	WN Hsu, D Harwath, C Song, J Glass	arXiv preprint arXiv:2012.15454, 2020	68	2020
Vision as an interlingua: Learning multilingual semantic embeddings of untranscribed speech	D Harwath, G Chuang, J Glass	2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018	66	2018
Spoken moments: Learning joint audio-visual representations from video descriptions	M Monfort, SY Jin, A Liu, D Harwath, R Feris, J Glass, A Oliva	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	64	2021
Towards visually grounded sub-word speech unit discovery	D Harwath, J Glass	ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019	43	2019
Why is Winoground hard? Investigating failures in visuolinguistic compositionality	A Diwan, L Berry, E Choi, D Harwath, K Mahowald	arXiv preprint arXiv:2211.00768, 2022	39	2022
Word discovery in visually grounded, self-supervised speech models	P Peng, D Harwath	arXiv preprint arXiv:2203.15081, 2022	38	2022
Learning modality-invariant representations for speech and images	K Leidal, D Harwath, J Glass	2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2017	33	2017
Look, Listen, and Decode: Multimodal Speech Recognition with Images	F Sun, D Harwath, J Glass	IEEE Workshop on Spoken Language Technology, 2016	32	2016
Prompting the hidden talent of web-scale speech models for zero-shot task generalization	P Peng, B Yan, S Watanabe, D Harwath	arXiv preprint arXiv:2305.11095, 2023	31	2023
