Improving speaker diarization of TV series using talking-face detection and clustering
While successful on broadcast news, meetings or telephone conversation, state-of-the-art
speaker diarization techniques tend to perform poorly on TV series or movies. In this paper …
Going beyond the sentence: Contextual machine translation of dialogue
R Bawden - 2018 - theses.hal.science
While huge progress has been made in machine translation (MT) in recent years, the
majority of MT systems still rely on the assumption that sentences can be translated in …
Text-based speaker identification for multi-participant open-domain dialogue systems
Understanding the interactive structure of dialogues, such as turn taking behaviour and
change of speakers, is a critical prerequisite for dialogue systems which aim to understand …
Partners in crime: Multi-view sequential inference for movie understanding
N Papasarantopoulos, L Frermann… - Proceedings of the …, 2019 - aclanthology.org
Multi-view learning algorithms are powerful representation learning tools, often exploited in
the context of multimodal problems. However, for problems requiring inference at the token …
Structured prediction for speaker identification in TV series
E Knyazeva, G Wisniewski, H Bredin… - Annual Conference of the …, 2015 - hal.science
Though radio and TV broadcast are highly structured documents, state-of-the-art speaker
identification algorithms do not take advantage of this information to improve prediction …
Unsupervised person clustering in videos with cross-modal communication
In the existing person identification solutions, multi-modal learning is able to gain a plausible
person identification accuracy in TV Content since supervised information is applied to train …