A survey of techniques for optimizing transformer inference
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …
Neural architecture search for transformers: A survey
Transformer-based Deep Neural Network architectures have gained tremendous interest
due to their effectiveness in various applications across Natural Language Processing (NLP) …
DPHuBERT: Joint distillation and pruning of self-supervised speech models
Self-supervised learning (SSL) has achieved notable success in many speech processing
tasks, but the large model size and heavy computational cost hinder the deployment …
DinoSR: Self-distillation and online clustering for self-supervised speech representation learning
In this paper, we introduce self-distillation and online clustering for self-supervised speech
representation learning (DinoSR), which combines masked language modeling, self …
FitHuBERT: Going thinner and deeper for knowledge distillation of speech self-supervised learning
Large-scale speech self-supervised learning (SSL) has emerged as a main field of speech
processing; however, the problem of computational cost arising from its vast size makes a …
SUPERB @ SLT 2022: Challenge on generalization and efficiency of self-supervised speech representation learning
We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised
speech representation for better performance, generalization, and efficiency. The challenge …
PreNAS: Preferred one-shot learning towards efficient neural architecture search
The wide application of pre-trained models is driving the trend of once-for-all training in one-
shot neural architecture search (NAS). However, training within a huge sample space …
Reducing barriers to self-supervised learning: HuBERT pre-training with academic compute
Self-supervised learning (SSL) has led to great strides in speech processing. However, the
resources needed to train these models have become prohibitively large as they continue to …
SpeechCLIP: Integrating speech with pre-trained vision and language model
Data-driven speech processing models usually perform well with a large amount of text
supervision, but collecting transcribed speech data is costly. Therefore, we propose Speech …
Structured pruning of self-supervised pre-trained models for speech recognition and understanding
Self-supervised speech representation learning (SSL) has been shown to be effective in various
downstream tasks, but SSL models are usually large and slow. Model compression …