Zhihao Du
Alibaba
Verified email at alibaba-inc.com
Title
Cited by
Year
M2MeT: The ICASSP 2022 multi-channel multi-party meeting transcription challenge
F Yu, S Zhang, Y Fu, L Xie, S Zheng, Z Du, W Huang, P Guo, Z Yan, B Ma, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 72; Year: 2022
Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge
F Yu, S Zhang, P Guo, Y Fu, Z Du, S Zheng, W Huang, L Xie, ZH Tan, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 25; Year: 2022
A joint framework of denoising autoencoder and generative vocoder for monaural speech enhancement
Z Du, X Zhang, J Han
IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 1493-1505, 2020
Cited by: 22; Year: 2020
FunASR: A fundamental end-to-end speech recognition toolkit
Z Gao, Z Li, J Wang, H Luo, X Shi, M Chen, Y Li, L Zuo, Z Du, Z Xiao, ...
arXiv preprint arXiv:2305.11013, 2023
Cited by: 20; Year: 2023
LauraGPT: Listen, attend, understand, and regenerate audio with GPT
Q Chen, Y Chu, Z Gao, Z Li, K Hu, X Zhou, J Xu, Z Ma, W Wang, S Zheng, ...
arXiv preprint arXiv:2310.04673, 2023
Cited by: 19; Year: 2023
FunCodec: A fundamental, reproducible and integrable open-source toolkit for neural speech codec
Z Du, S Zhang, K Hu, S Zheng
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 18; Year: 2024
Acoustic scene classification by implicitly identifying distinct sound events
H Song, J Han, S Deng, Z Du
arXiv preprint arXiv:1904.05204, 2019
Cited by: 18; Year: 2019
Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement
Z Du, M Lei, J Han, S Zhang
Interspeech, 3271-3275, 2020
Cited by: 13; Year: 2020
MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario
F Yu, S Zhang, P Guo, Y Liang, Z Du, Y Lin, L Xie
2022 IEEE Spoken Language Technology Workshop (SLT), 144-151, 2023
Cited by: 11; Year: 2023
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
Z Du, S Zhang, S Zheng, Z Yan
EMNLP 2022, 2022
Cited by: 10; Year: 2022
A comparative study on speaker-attributed automatic speech recognition in multi-party meetings
F Yu, Z Du, S Zhang, Y Lin, L Xie
arXiv preprint arXiv:2203.16834, 2022
Cited by: 10; Year: 2022
LauraGPT: Listen, attend, understand, and regenerate audio with GPT
J Wang, Z Du, Q Chen, Y Chu, Z Gao, Z Li, K Hu, X Zhou, J Xu, Z Ma, ...
Cited by: 9; Year: 2023
An efficient joint training framework for robust small-footprint keyword spotting
Y Gu, Z Du, H Zhang, X Zhang
International Conference on Neural Information Processing, 12-23, 2020
Cited by: 8*; Year: 2020
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity
Z Ma, G Yang, Y Yang, Z Gao, J Wang, Z Du, F Yu, Q Chen, S Zheng, ...
arXiv preprint arXiv:2402.08846, 2024
Cited by: 5; Year: 2024
PAN: Phoneme-aware network for monaural speech enhancement
Z Du, M Lei, J Han, S Zhang
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 5; Year: 2020
A comparative study on multichannel speaker-attributed automatic speech recognition in multi-party meetings
M Shi, J Zhang, Z Du, F Yu, Q Chen, S Zhang, LR Dai
2023 Asia Pacific Signal and Information Processing Association Annual …, 2023
Cited by: 4; Year: 2023
TOLD: A novel two-stage overlap-aware framework for speaker diarization
J Wang, Z Du, S Zhang
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 4; Year: 2023
CASA-ASR: Context-aware speaker-attributed ASR
M Shi, Z Du, Q Chen, F Yu, Y Li, S Zhang, J Zhang, LR Dai
arXiv preprint arXiv:2305.12459, 2023
Cited by: 3; Year: 2023
Capturing temporal dependencies through future prediction for CNN-based audio classifiers
H Song, J Han, S Deng, Z Du
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 3; Year: 2021
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR
Y Liang, M Shi, F Yu, Y Li, S Zhang, Z Du, Q Chen, L Xie, Y Qian, J Wu, ...
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by: 2; Year: 2023
Articles 1–20