Data-selective transfer learning for multi-domain speech recognition
Negative transfer in training of acoustic models for automatic speech recognition has been
reported in several contexts such as domain change or speaker characteristics. This paper …
Automatic genre and show identification of broadcast media
Huge amounts of digital videos are being produced and broadcast every day, leading to
giant media archives. Effective techniques are needed to make such data accessible further …
Latent Dirichlet allocation based organisation of broadcast media archives for deep neural network adaptation
This paper presents a new method for the discovery of latent domains in diverse speech
data, for the use of adaptation of Deep Neural Networks (DNNs) for Automatic Speech …
Unsupervised domain discovery using latent Dirichlet allocation for acoustic modelling in speech recognition
Speech recognition systems are often highly domain dependent, a fact widely reported in
the literature. However the concept of domain is complex and not bound to clear criteria …
Group feature selection for audio-based video genre classification
G Sageder, M Zaharieva, C Breiteneder - … 2016, Miami, FL, USA, January 4 …, 2016 - Springer
The performance of video genre classification approaches strongly depends on the selected
feature set. Feature selection requires expert knowledge and is commonly driven by the …
GenSpecVidOnt: a reference ontology for knowledge based video analytics with multimodal genre detection
Video analytics refers to the process of automatically analysing a video for spatial and
temporal events. Effective video analytics require exploitation of genre-related information …
webASR 2 - Improved Cloud Based Speech Technology.
This paper presents the most recent developments of the webASR service (www.webasr.org),
the world's first web-based fully functioning automatic speech recognition platform for …
Color-independent classification of animation video
R Zumer, S Ratté - International Journal of Multimedia Information …, 2018 - Springer
This paper presents a method for the classification of animated video that does not rely on
hue or saturation information, and aims to achieve a high level of performance in the context …
Acoustic adaptation to dynamic background conditions with asynchronous transformations
This paper proposes a framework for performing adaptation to complex and non-stationary
background conditions in Automatic Speech Recognition (ASR) by means of asynchronous …
Emotion recognition from the speech signal by effective combination of generative and discriminative models
In this paper, we propose an effective way for combining the discriminative and generative
models for emotion recognition from speech signal. Finding an efficient feature extraction …