ASR is all you need: Cross-modal distillation for lip reading
The goal of this work is to train strong models for visual speech recognition without requiring
human annotated ground truth data. We achieve this by distilling from an Automatic Speech …
Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Text recognition methods are developing rapidly. Some advanced techniques, e.g.,
powerful modules, language models, and un- and semi-supervised learning schemes …
Distilling the Knowledge of BERT for CTC-based ASR
Connectionist temporal classification (CTC)-based models are attractive because of their
fast inference in automatic speech recognition (ASR). Language model (LM) integration …
Distilling attention weights for CTC-based ASR systems
We present a novel training approach for connectionist temporal classification (CTC)-based
automatic speech recognition (ASR) systems. CTC models are promising for building both a …
Swing distillation: A privacy-preserving knowledge distillation framework
Knowledge distillation (KD) has been widely used for model compression and knowledge
transfer. Typically, a big teacher model trained on sufficient data transfers knowledge to a …
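The teacher-to-student transfer described in this snippet is commonly implemented as a temperature-softened cross-distribution loss (Hinton-style distillation). Below is a minimal, framework-free sketch of that loss; the function names and the choice of temperature `T = 2.0` are illustrative assumptions, not details from the paper above.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T produces softer distributions,
    # exposing more of the teacher's "dark knowledge" over non-target classes.
    m = max(x / T for x in logits)
    exps = [math.exp(x / T - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across T.
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # student's softened predictions
    return (T * T) * sum(pi * math.log(pi / qi)
                         for pi, qi in zip(p, q) if pi > 0)
```

In practice this term is mixed with the ordinary supervised loss (here, CTC for the ASR papers above); when student and teacher logits agree, the loss is zero.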
Improving knowledge distillation of CTC-trained acoustic models with alignment-consistent ensemble and target delay
Knowledge distillation (KD) has been widely used to improve the performance of a simpler
student model by imitating the outputs or intermediate representations of a more complex …
Audio-visual deep learning
T Afouras - 2021 - ora.ox.ac.uk
Human perception and learning are inherently multimodal: we interface with the world
through multiple sensory streams, including vision, audition, touch, olfaction and taste. By …