Investigation of segmentation in i-vector based speaker diarization of telephone speech
The goal of this paper is to evaluate the contribution of speaker change detection (SCD) to
the performance of a speaker diarization system in the telephone domain. We compare the …
Feature adaptation using linear spectro-temporal transform for robust speech recognition
Spectral information represents short-term speech information within a frame of a few tens of
milliseconds, while temporal information captures the evolution of speech statistics over …
Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs.
A novel approach to live captioning through re-speaking is introduced in this paper. We
describe our concept of respeaking using only one re-speaker with enhanced re-speaker …
Experiments with segmentation in an online speaker diarization system
In offline speaker diarization systems, particularly those aimed at telephone speech, the
accuracy of the initial segmentation of a conversation is often a secondary concern …
Captioning of live TV programs through speech recognition and re-speaking
In this paper we introduce our complete solution for captioning of live TV programs used by
the Czech Television, the public service broadcaster in the Czech Republic. Live captioning …
'DID THE SPEAKER CHANGE?': TEMPORAL TRACKING FOR OVERLAPPING SPEAKER
SINMS SCENARIOS - 2022 - core.ac.uk
Diarization systems are an essential part of many speech processing applications, such as
speaker indexing, improving automatic speech recognition (ASR) performance and making …
Robust adaptation techniques dealing with small amount of data
The most serious problem in adaptation is the lack of adaptation data. This work
focuses on the feature Maximum Likelihood Linear Regression (fMLLR) adaptation where …
Bottleneck ANN: Dealing with small amount of data in shift-MLLR adaptation
The aim of this work is to propose a refinement of the shift-MLLR (shift Maximum Likelihood
Linear Regression) adaptation of an acoustic model in the case of a limited amount of …
Convolutional neural network for refinement of speaker adaptation transformation
The aim of this work is to propose a refinement of the shift-MLLR (shift Maximum Likelihood
Linear Regression) adaptation of an acoustic model in the case of a limited amount of …
High-dimensional spaces and modeling in the speaker recognition task
L Machlica - 2013 - otik.uk.zcu.cz
Automatic speaker recognition has made significant progress in the last two decades.
Huge speech corpora containing thousands of speakers recorded on several channels are …