Background-tracking acoustic features for genre identification of broadcast shows

M Doulaty, O Saz, T Hain - arXiv preprint arXiv:1509.02409, 2015 - arxiv.org

Negative transfer in training of acoustic models for automatic speech recognition has been
reported in several contexts such as domain change or speaker characteristics. This paper …

Cited by 21 Related articles All 11 versions

Automatic genre and show identification of broadcast media

M Doulaty, O Saz, RWM Ng, T Hain - arXiv preprint arXiv:1606.03333, 2016 - arxiv.org

Huge amounts of digital videos are being produced and broadcast every day, leading to
giant media archives. Effective techniques are needed to make such data accessible further …

Cited by 16 Related articles All 9 versions

Latent dirichlet allocation based organisation of broadcast media archives for deep neural network adaptation

M Doulaty, O Saz, RWM Ng… - 2015 IEEE Workshop on …, 2015 - ieeexplore.ieee.org

This paper presents a new method for the discovery of latent domains in diverse speech
data, for the use of adaptation of Deep Neural Networks (DNNs) for Automatic Speech …

Cited by 14 Related articles All 6 versions

Unsupervised domain discovery using latent dirichlet allocation for acoustic modelling in speech recognition

M Doulaty, O Saz, T Hain - arXiv preprint arXiv:1509.02412, 2015 - arxiv.org

Speech recognition systems are often highly domain dependent, a fact widely reported in
the literature. However the concept of domain is complex and not bound to clear criteria …

Cited by 16 Related articles All 11 versions

Group feature selection for audio-based video genre classification

G Sageder, M Zaharieva, C Breiteneder - … 2016, Miami, FL, USA, January 4 …, 2016 - Springer

The performance of video genre classification approaches strongly depends on the selected
feature set. Feature selection requires expert knowledge and is commonly driven by the …

Cited by 11 Related articles All 6 versions

GenSpecVidOnt: a reference ontology for knowledge based video analytics with multimodal genre detection

MU Sreeja, BC Kovoor - Multimedia Tools and Applications, 2023 - Springer

Video analytics refers to the process of automatically analysing a video for spatial and
temporal events. Effective video analytics require exploitation of genre-related information …

[PDF] webASR 2 - Improved Cloud Based Speech Technology

T Hain, J Christian, O Saz, S Deena, M Hasan… - …, 2016 - academia.edu

This paper presents the most recent developments of the webASR service (www.webasr.org),
the world's first web-based fully functioning automatic speech recognition platform for …

Cited by 5 Related articles All 8 versions

Color-independent classification of animation video

R Zumer, S Ratté - International Journal of Multimedia Information …, 2018 - Springer

This paper presents a method for the classification of animated video that does not rely on
hue or saturation information, and aims to achieve a high level of performance in the context …

Cited by 3 Related articles All 4 versions

[HTML] Acoustic adaptation to dynamic background conditions with asynchronous transformations

O Saz, T Hain - Computer Speech & Language, 2017 - Elsevier

This paper proposes a framework for performing adaptation to complex and non-stationary
background conditions in Automatic Speech Recognition (ASR) by means of asynchronous …

Cited by 2 Related articles All 9 versions

Emotion recognition from the speech signal by effective combination of generative and discriminative models

E Loweimi, M Doulaty, J Barker, T Hain - 2015 - eprints.whiterose.ac.uk

In this paper, we propose an effective way for combining the discriminative and generative
models for emotion recognition from speech signal. Finding an efficient feature extraction …

Cited by 1 Related articles All 5 versions