Authors
Takuya Koumura, Hiroki Terashima, Shigeto Furukawa
Publication date
2020/1/1
Journal
Acoustical Science and Technology
Volume
41
Issue
1
Pages
337-340
Publisher
ACOUSTICAL SOCIETY OF JAPAN
Description
2. Methods
2.1. Auditory model
Both the texture representation and the content representation are calculated with the model of the auditory system proposed by McDermott and Simoncelli [1]. The model consists of bandpass filter banks and static nonlinear functions. First, a bandpass filter bank divides the input sound into frequency bands. Then, the amplitude envelope of each filtered waveform is calculated with the Hilbert transform and nonlinearly compressed. Finally, the amplitude envelopes are fed to a second bandpass filter bank. The first filter bank and the nonlinear compression model the function of the cochlea, and the amplitude envelopes are assumed to be represented in the auditory nerve. The second filter bank models the modulation filter bank that has been suggested to exist in the auditory system.
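The three stages described above (cochlear filter bank, compressed Hilbert envelope, modulation filter bank) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the model of [1]: it uses ideal brick-wall FFT filters in place of the filter shapes of the original model, and the function names, the band edges passed in by the caller, and the compression exponent of 0.3 are all hypothetical choices made for the example.

```python
import numpy as np


def bandpass_bank(x, fs, edges):
    """Split a 1-D signal into bands with ideal (brick-wall) FFT filters.

    Assumption: brick-wall filters stand in for the model's real filter shapes.
    Returns an array of shape (n_bands, n_samples).
    """
    n = x.shape[-1]
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    bands = []
    for lo, hi in edges:
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(np.fft.irfft(spectrum * mask, n=n))
    return np.stack(bands)


def hilbert_envelope(x):
    """Amplitude envelope: magnitude of the analytic signal along the last axis."""
    n = x.shape[-1]
    spectrum = np.fft.fft(x, axis=-1)
    # Keep DC (and Nyquist), double positive frequencies, zero negative ones.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * h, axis=-1))


def auditory_model(x, fs, cochlear_edges, modulation_edges, exponent=0.3):
    """Run the three stages; the exponent value is an assumed example setting."""
    # Stage 1: cochlear filter bank, shape (n_cochlear, n_samples).
    subbands = bandpass_bank(x, fs, cochlear_edges)
    # Stage 2: Hilbert envelope followed by static power-law compression.
    envelopes = hilbert_envelope(subbands) ** exponent
    # Stage 3: modulation filter bank applied to each envelope,
    # shape (n_cochlear, n_modulation, n_samples).
    modulation = np.stack(
        [bandpass_bank(env, fs, modulation_edges) for env in envelopes]
    )
    return subbands, envelopes, modulation
```

For instance, passing a 1 kHz tone that is amplitude-modulated at 8 Hz should concentrate energy in whichever cochlear band contains 1 kHz and in whichever modulation band contains 8 Hz.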
2.2. Representation of sound texture
We employed the representation of sound texture proposed by McDermott and Simoncelli [1], that is, statistics of the auditory-model representations computed by marginalizing over time: the variances of the cochlear filter outputs; the means, variances, and skewnesses of the amplitude envelopes; the correlation coefficients between the amplitude envelopes of different frequency bands; and the powers of the modulation filter outputs.
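The statistics listed above can be sketched as time averages over arrays of filter outputs. The snippet below is only an illustration: the function name, the dictionary keys, and the array layouts (time on the last axis) are assumptions made for the example, and it uses plain textbook definitions of each statistic, which may differ from the normalizations used in [1].

```python
import numpy as np


def texture_statistics(subbands, envelopes, modulation):
    """Time-averaged statistics of auditory-model outputs (illustrative sketch).

    Assumed shapes (time is the last axis):
      subbands   (n_cochlear, n_samples)   cochlear filter outputs
      envelopes  (n_cochlear, n_samples)   compressed amplitude envelopes
      modulation (n_cochlear, n_modulation, n_samples)  modulation filter outputs
    """
    mean = envelopes.mean(axis=1)
    centered = envelopes - mean[:, None]
    variance = (centered ** 2).mean(axis=1)
    # Skewness: third central moment divided by the cubed standard deviation.
    skewness = (centered ** 3).mean(axis=1) / variance ** 1.5
    return {
        "subband_variance": subbands.var(axis=1),
        "envelope_mean": mean,
        "envelope_variance": variance,
        "envelope_skewness": skewness,
        # Rows are treated as variables; the off-diagonal entries are the
        # correlations between envelopes of different frequency bands.
        "envelope_correlation": np.corrcoef(envelopes),
        # Power taken here as the mean squared modulation filter output.
        "modulation_power": (modulation ** 2).mean(axis=2),
    }
```

Collecting the values in a dictionary keeps each statistic addressable by name; a flat feature vector could be built by concatenating the entries if a single descriptor per sound is needed.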