Speaker recognition by machines and humans: A tutorial review

JHL Hansen, T Hasan - IEEE Signal processing magazine, 2015 - ieeexplore.ieee.org

Identifying a person by his or her voice is an important human trait most take for granted in
natural human-to-human interaction/communication. Speaking to someone over the …

Cited by 798 Related articles All 6 versions

[PDF] drwuz.com

Spoofing and countermeasures for speaker verification: A survey

Z Wu, N Evans, T Kinnunen, J Yamagishi, F Alegre… - speech …, 2015 - Elsevier

While biometric authentication has advanced significantly in recent years, evidence shows
the technology can be susceptible to malicious spoofing attacks. The research community …

Cited by 736 Related articles All 13 versions

[PDF] hal.science

An overview of text-independent speaker recognition: From features to supervectors

T Kinnunen, H Li - Speech communication, 2010 - Elsevier

This paper gives an overview of automatic speaker recognition technology, with an
emphasis on text-independent recognition. Speaker recognition has been studied actively …

Cited by 2020 Related articles All 26 versions

[PDF] univ-avignon.fr

An overview of automatic speaker diarization systems

SE Tranter, DA Reynolds - IEEE Transactions on audio, speech …, 2006 - ieeexplore.ieee.org

Audio diarization is the process of annotating an input audio channel with information that
attributes (possibly overlapping) temporal regions of signal energy to their specific sources …

Cited by 822 Related articles All 12 versions

[PDF] nsf.gov

You can hear but you cannot steal: Defending against voice impersonation attacks on smartphones

S Chen, K Ren, S Piao, C Wang… - 2017 IEEE 37th …, 2017 - ieeexplore.ieee.org

Voice, as a convenient and efficient way of information delivery, has a significant advantage
over the conventional keyboard-based input methods, especially on small mobile devices …

Cited by 153 Related articles All 17 versions

[PDF] upf.edu

Jitter and shimmer measurements for speaker recognition

M Farrús, J Hernando, P Ejarque - 8th Annual Conference of the …, 2007 - repositori.upf.edu

Jitter and shimmer are measures of the cycle-to-cycle variations of fundamental frequency
and amplitude, respectively, which have been largely used for the description of …

Cited by 305 Related articles All 10 versions

[PDF] psu.edu

Modeling prosodic feature sequences for speaker recognition

E Shriberg, L Ferrer, S Kajarekar, A Venkataraman… - Speech …, 2005 - Elsevier

We describe a novel approach to modeling idiosyncratic prosodic behavior for automatic
speaker recognition. The approach computes various duration, pitch, and energy features …

Cited by 311 Related articles All 11 versions

[PDF] tuni.fi

A convolutional neural network approach for acoustic scene classification

M Valenti, S Squartini, A Diment… - … Joint Conference on …, 2017 - ieeexplore.ieee.org

This paper presents a novel application of convolutional neural networks (CNNs) for the task
of acoustic scene classification (ASC). We here propose the use of a CNN trained to classify …

Cited by 125 Related articles All 7 versions

[PDF] iiit.ac.in

Extraction and representation of prosodic features for language and speaker recognition

L Mary, B Yegnanarayana - Speech communication, 2008 - Elsevier

In this paper, we propose a new approach for extracting and representing prosodic features
directly from the speech signal. We hypothesize that prosody is linked to linguistic units such …

Cited by 245 Related articles All 15 versions

[PDF] mit.edu

Forensic speaker recognition

JP Campbell, W Shen, WM Campbell… - IEEE Signal …, 2009 - ieeexplore.ieee.org

Looking at the different points highlighted in this article, we affirm that forensic applications
of speaker recognition should still be taken under a necessary need for caution …

Cited by 239 Related articles All 11 versions