"Sheldon speaking, Bonjour!" Leveraging Multilingual Tracks for (Weakly) Supervised Speaker...

H Bredin, G Gelly - Proceedings of the 24th ACM international …, 2016 - dl.acm.org

While successful on broadcast news, meetings or telephone conversation, state-of-the-art
speaker diarization techniques tend to perform poorly on TV series or movies. In this paper …

Cited by 39 - Related articles - All 2 versions

Going beyond the sentence: Contextual machine translation of dialogue

R Bawden - 2018 - theses.hal.science

While huge progress has been made in machine translation (MT) in recent years, the
majority of MT systems still rely on the assumption that sentences can be translated in …

Cited by 14 - Related articles - All 4 versions

[PDF] Text-based speaker identification for multi-participant open-domain dialogue systems

IV Serban, J Pineau - NIPS Workshop on Machine Learning for …, 2015 - blueanalysis.com

Understanding the interactive structure of dialogues, such as turn taking behaviour and
change of speakers, is a critical prerequisite for dialogue systems which aim to understand …

Cited by 10 - Related articles

Partners in crime: Multi-view sequential inference for movie understanding

N Papasarantopoulos, L Frermann… - Proceedings of the …, 2019 - aclanthology.org

Multi-view learning algorithms are powerful representation learning tools, often exploited in
the context of multimodal problems. However, for problems requiring inference at the token …

Cited by 5 - Related articles - All 2 versions

Structured prediction for speaker identification in TV series

E Knyazeva, G Wisniewski, H Bredin… - Annual Conference of the …, 2015 - hal.science

Though radio and TV broadcast are highly structured documents, state-of-the-art speaker
identification algorithms do not take advantage of this information to improve prediction …

Cited by 6 - Related articles - All 7 versions

Unsupervised person clustering in videos with cross-modal communication

C Miao, J Feng, Y Ding, Y Yang… - … and Image Processing …, 2016 - ieeexplore.ieee.org

In the existing person identification solutions, multi-modal learning is able to gain a plausible
person identification accuracy in TV Content since supervised information is applied to train …

Cited by 2 - Related articles

[CITATION][C] Nicolas PÉCHEUX

S Schools - Learning, 2012