An overview of voice conversion systems
SH Mohammadi, A Kain - Speech Communication, 2017 - Elsevier
Voice transformation (VT) aims to change one or more aspects of a speech signal while
preserving linguistic information. A subset of VT, Voice conversion (VC) specifically aims to …
Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors
Deepfakes present an emerging threat in cyberspace. Recent developments in machine
learning make deepfakes highly believable, and very difficult to differentiate between what is …
Utilizing AlexNet deep transfer learning for ear recognition
Transfer Learning is an efficient approach of solving classification problem with little amount
of data. In this paper, we applied Transfer Learning to the well-known AlexNet Convolution …
Generation and detection of manipulated multimodal audiovisual content: Advances, trends and open challenges
Generative deep learning techniques have invaded the public discourse recently. Despite
the advantages, the applications to disinformation are concerning as the counter-measures …
Voice conversion using deep neural networks with speaker-independent pre-training
SH Mohammadi, A Kain - 2014 IEEE Spoken Language …, 2014 - ieeexplore.ieee.org
In this study, we trained a deep autoencoder to build compact representations of short-term
spectra of multiple speakers. Using this compact representation as mapping features, we …
Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance.
Developing a voice conversion (VC) system for a particular speaker typically requires
considerable data from both the source and target speakers. This paper aims to effectuate …
GPU-based parallel optimization of immune convolutional neural network and embedded system
T Gong, T Fan, J Guo, Z Cai - Engineering Applications of Artificial …, 2017 - Elsevier
Up to now, the image recognition system has been utilized more and more widely in the
security monitoring, the industrial intelligent monitoring, the unmanned vehicle, and even the …
Towards low-resource stargan voice conversion using weight adaptive instance normalization
Many-to-many voice conversion with non-parallel training data has seen significant progress
in recent years. It is challenging because of lacking of ground truth parallel data. StarGAN …
The protection of megascience projects from deepfake technologies threats: information law aspects
EI Galyashina, VD Nikishin - Journal of Physics: Conference …, 2022 - iopscience.iop.org
The paper examines the potential threats of the malicious use of deepfake technology to
destabilize and discredit megascience projects in the global information space. The …
Deep transfer learning for human identification based on footprint: A comparative study
MMA Abuqadumah, MAM Ali… - … of Engineering and …, 2019 - researchgate.net
Identifying people based on their footprint has not yet gained enough attention from the
researchers. Therefore, in this paper, an investigation of human identification conducted …