PhaVoRIT: A phase vocoder for real-time interactive time-stretching

[PDF] Explicit consistency constraints for STFT spectrograms and their application to phase reconstruction.

J Le Roux, N Ono, S Sagayama - SAPA@ INTERSPEECH, 2008 - Citeseer

As many acoustic signal processing methods, for example for source separation or noise
canceling, operate in the magnitude spectrogram domain, the problem of reconstructing a …

Cited by 124 Related articles All 14 versions

[BOOK] Designing audio effect plugins in C++: for AAX, AU, and VST3 with DSP theory

W Pirkle - 2019 - taylorfrancis.com

Designing Audio Effect Plugins in C++ presents everything you need to know about digital
signal processing in an accessible way. Not just another theory-heavy digital signal …

Cited by 84 Related articles All 6 versions

Augmentation invariant discrete representation for generative spoken language modeling

I Gat, F Kreuk, TA Nguyen, A Lee, J Copet… - arXiv preprint arXiv …, 2022 - arxiv.org

Generative Spoken Language Modeling research focuses on optimizing speech Language
Models (LMs) using raw audio recordings without accessing any textual supervision. Such …

Cited by 10 Related articles All 9 versions

Speech time-scale modification with GANs

E Cohen, F Kreuk, J Keshet - IEEE Signal Processing Letters, 2022 - ieeexplore.ieee.org

While listening to spoken content, it is often desired to vary the speech rate while preserving
the speaker's timbre and pitch. To date, advanced signal processing techniques are used to …

Cited by 11 Related articles All 2 versions

Audio pitch shifting using the constant-Q transform

C Schörkhuber, A Klapuri, A Sontacchi - Journal of the Audio Engineering …, 2013 - aes.org

Pitch shifting of polyphonic music is usually performed by manipulating the time–frequency
representation of the input signal such that frequency is scaled by a constant and time …

Cited by 58 Related articles All 4 versions

NAST: Noise Aware Speech Tokenization for Speech Language Models

S Messica, Y Adi - arXiv preprint arXiv:2406.11037, 2024 - arxiv.org

Speech tokenization is the task of representing speech signals as a sequence of discrete
units. Such representations can be later used for various downstream tasks including …

Cited by 4 Related articles

[PDF] PVSOLA: A phase vocoder with synchronized overlap-add

A Moinet, T Dutoit - Proceedings of the International Conference on Digital …, 2011 - dafx.de

In this paper we present an original method mixing temporal and spectral processing to
reduce the phasiness in the phase vocoder. Phasiness is an inherent artifact of the phase …

Cited by 31 Related articles All 7 versions

Deep learning-based single-ended quality prediction for time-scale modified audio

T Roberts, A Nicolson, KK Paliwal - Journal of the Audio Engineering …, 2021 - aes.org

Objective evaluation of audio processed with Time-Scale Modification (TSM) has recently
seen improvement with a labeled time-scaled audio dataset used to train an objective …

Cited by 5 Related articles All 3 versions

Apparatus, method and computer program for manipulating an audio signal comprising a transient event

F Nagel, A Walther, G Fuchs, J Lecomte… - US Patent …, 2016 - Google Patents

An apparatus for manipulating an audio signal comprising a
transient event has a transient signal replacer configured to replace a transient signal …

Cited by 24 Related articles All 5 versions

Modular and adaptive control of sound processing

D Van Nort - 2010 - escholarship.mcgill.ca

This dissertation presents research on the creation of systems for the control
of sound synthesis and processing. The work on the design of instruments of …

Cited by 20 Related articles All 8 versions