A presentation of the REPERE challenge

A Plaquet, H Bredin - arXiv preprint arXiv:2310.13025, 2023 - arxiv.org

Since its introduction in 2019, the whole end-to-end neural diarization (EEND) line of work
has been addressing speaker diarization as a frame-wise multi-label classification problem …

Cited by 100 · Related articles · All 10 versions

[PDF] hal.science

pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe

H Bredin - 24th INTERSPEECH Conference (INTERSPEECH …, 2023 - hal.science

pyannote.audio is an open-source toolkit written in Python for speaker diarization. Version
2.1 introduces a major overhaul of pyannote.audio default speaker diarization pipeline …

Cited by 121 · Related articles · All 18 versions

Optimization of RNN-based speech activity detection

G Gelly, JL Gauvain - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org

Speech activity detection (SAD) is an essential component of automatic speech recognition
systems impacting the overall system performance. This paper investigates an optimization …

Cited by 121 · Related articles · All 3 versions

[PDF] academia.edu

[PDF] A study on automatic speech recognition

S Benkerzaz, Y Elmir, A Dennai - Journal of Information Technology …, 2019 - academia.edu

Speech is an easy and usable technique of communication between humans, but nowadays
humans are not limited to connecting to each other but even to the different machines in our …

Cited by 59 · Related articles · All 2 versions

[PDF] mmai.io

[PDF] pyannote.audio speaker diarization pipeline at VoxSRC 2023

S Baroudi, H Bredin, A Plaquet, T Pellegrini - The VoxCeleb Speaker …, 2023 - mmai.io

This technical report describes the submission of team pyannote to the VoxSRC 2023
speaker diarization challenge. It relies on 3 stages: local end-to-end neural speaker …

Cited by 9 · Related articles · All 3 versions

[PDF] asafvarol.com

A study on automatic speech recognition systems

H Ibrahim, A Varol - … on Digital Forensics and Security (ISDFS), 2020 - ieeexplore.ieee.org

Speech recognition is a technique that enables machines to automatically identify the
human voice through speech signals. In other words, it helps create a communication link …

Cited by 31 · Related articles · All 3 versions

[PDF] isca-archive.org

[PDF] The first official REPERE evaluation

O Galibert, J Kahn - First Workshop on Speech, Language and …, 2013 - isca-archive.org

The REPERE Challenge aims to support research on people recognition in multimodal
conditions. Following a 2012 dryrun [1], the first official evaluation of systems has been …

Cited by 66 · Related articles · All 7 versions

[PDF] arxiv.org

Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges

V Mingote, A Ortega, A Miguel, E Lleida - arXiv preprint arXiv:2409.05659, 2024 - arxiv.org

Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …

Fabiole, a speech database for forensic speaker comparison

M Ajili, JF Bonastre, J Kahn, S Rossato… - Proceedings of the …, 2016 - aclanthology.org

A speech database has been collected for use to highlight the importance of “speaker factor”
in forensic voice comparison. FABIOLE has been created during the FABIOLE project …

Cited by 39 · Related articles · All 5 versions

[PDF] hal.science

Unsupervised speaker identification in TV broadcast based on written names

J Poignant, L Besacier, G Quénot - IEEE/ACM Transactions on …, 2014 - ieeexplore.ieee.org

Identifying speakers in TV broadcast in an unsupervised way (i.e., without biometric models)
is a solution for avoiding costly annotations. Existing methods usually use pronounced …

Cited by 50 · Related articles · All 12 versions