A comprehensive survey on model compression and acceleration

T Choudhary, V Mishra, A Goswami… - Artificial Intelligence …, 2020 - Springer
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable
improvement in computer vision, natural language processing, stock prediction, forecasting …

Deep learning in mobile and wireless networking: A survey

C Zhang, P Patras, H Haddadi - IEEE Communications Surveys …, 2019 - ieeexplore.ieee.org
The rapid uptake of mobile devices and the rising popularity of mobile applications and
services pose unprecedented demands on mobile and wireless networking infrastructure …

Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces

A Coucke, A Saade, A Ball, T Bluche, A Caulier… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper presents the machine learning architecture of the Snips Voice Platform, a
software solution to perform Spoken Language Understanding on microprocessors typical of …

Semi-orthogonal low-rank matrix factorization for deep neural networks.

D Povey, G Cheng, Y Wang, K Li, H Xu… - Interspeech, 2018 - academia.edu
Time Delay Neural Networks (TDNNs), also known as one-dimensional
Convolutional Neural Networks (1-d CNNs), are an efficient and well-performing neural …

Sequence-level knowledge distillation

Y Kim, AM Rush - arXiv preprint arXiv:1606.07947, 2016 - arxiv.org
Neural machine translation (NMT) offers a novel alternative formulation of translation that is
potentially simpler than statistical approaches. However, to reach competitive performance …

Compression of deep learning models for text: A survey

M Gupta, P Agrawal - ACM Transactions on Knowledge Discovery from …, 2022 - dl.acm.org
In recent years, the fields of natural language processing (NLP) and information retrieval (IR)
have made tremendous progress thanks to deep learning models like Recurrent Neural …

Compression of neural machine translation models via pruning

A See, MT Luong, CD Manning - arXiv preprint arXiv:1606.09274, 2016 - arxiv.org
Neural Machine Translation (NMT), like many other deep learning domains, typically suffers
from over-parameterization, resulting in large storage sizes. This paper examines three …

Recent progresses in deep learning based acoustic models

D Yu, J Li - IEEE/CAA Journal of Automatica Sinica, 2017 - ieeexplore.ieee.org
In this paper, we summarize recent progresses made in deep learning based acoustic
models and the motivation and insights behind the surveyed techniques. We first discuss …

Personalized speech recognition on mobile devices

I McGraw, R Prabhavalkar, R Alvarez… - … , Speech and Signal …, 2016 - ieeexplore.ieee.org
We describe a large vocabulary speech recognition system that is accurate, has low latency,
and yet has a small enough memory and computational footprint to run faster than real-time …

Deep learning on mobile and embedded devices: State-of-the-art, challenges, and future directions

Y Chen, B Zheng, Z Zhang, Q Wang, C Shen… - ACM Computing …, 2020 - dl.acm.org
Recent years have witnessed an exponential increase in the use of mobile and embedded
devices. With the great success of deep learning in many fields, there is an emerging trend …