The efficiency spectrum of large language models: An algorithmic survey

T Ding, T Chen, H Zhu, J Jiang, Y Zhong… - arXiv preprint arXiv …, 2023 - researchgate.net
The rapid growth of Large Language Models (LLMs) has been a driving force in
transforming various domains, reshaping the artificial general intelligence landscape …

CDFI: Compression-driven network design for frame interpolation

T Ding, L Liang, Z Zhu… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
DNN-based frame interpolation--that generates the intermediate frames given two
consecutive frames--typically relies on heavy model architectures with a huge number of …

OTOv2: Automatic, generic, user-friendly

T Chen, L Liang, T Ding, Z Zhu, I Zharkov - arXiv preprint arXiv …, 2023 - arxiv.org
The existing model compression methods via structured pruning typically require
complicated multi-stage procedures. Each individual stage necessitates numerous …

LoRAShear: Efficient large language model structured pruning and knowledge recovery

T Chen, T Ding, B Yadav, I Zharkov, L Liang - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have transformed the landscape of artificial intelligence,
while their enormous size presents significant challenges in terms of computational costs …

ST-MFNet Mini: Knowledge distillation-driven frame interpolation

C Morris, D Danier, F Zhang… - … on Image Processing …, 2023 - ieeexplore.ieee.org
Currently, one of the major challenges in deep learning-based video frame interpolation
(VFI) is the large model size and high computational complexity associated with many high …

An adaptive half-space projection method for stochastic optimization problems with group sparse regularization

Y Dai, T Chen, G Wang, DP Robinson - Transactions on machine …, 2023 - par.nsf.gov
Optimization problems with group sparse regularization are ubiquitous in various popular
downstream applications, such as feature selection and compression for Deep Neural …

Sparsity-guided network design for frame interpolation

T Ding, L Liang, Z Zhu, T Chen, I Zharkov - arXiv preprint arXiv …, 2022 - arxiv.org
DNN-based frame interpolation, which generates intermediate frames from two consecutive
frames, is often dependent on model architectures with a large number of features …

Mapping YOLOv4-tiny on FPGA-based DNN accelerator by using dynamic fixed-point method

P Li, C Che - 2021 12th International Symposium on Parallel …, 2021 - ieeexplore.ieee.org
In the past few decades, with the large-scale application of deep learning technology, the
neural network inference speed problem is becoming more and more severe, especially in …

Implicit compressibility of overparametrized neural networks trained with heavy-tailed SGD

Y Wan, M Barsbey, A Zaidi, U Simsekli - arXiv preprint arXiv:2306.08125, 2023 - arxiv.org
Neural network compression has been an increasingly important subject, not only due to its
practical relevance, but also due to its theoretical implications, as there is an explicit …

OTOv3: Automatic Architecture-Agnostic Neural Network Training and Compression from Structured Pruning to Erasing Operators

T Chen, T Ding, Z Zhu, Z Chen, HT Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Compressing a predefined deep neural network (DNN) into a compact sub-network with
competitive performance is crucial in the efficient machine learning realm. This topic spans …