Enabling resource-efficient AIoT system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The emerging field of artificial intelligence of things (AIoT, AI+IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

SpAtten: Efficient sparse attention architecture with cascade token and head pruning

H Wang, Z Zhang, S Han - 2021 IEEE International Symposium …, 2021 - ieeexplore.ieee.org
The attention mechanism is becoming increasingly popular in Natural Language Processing
(NLP) applications, showing superior performance to convolutional and recurrent …
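
The cascade token pruning named in the title can be sketched in a few lines: each key token's importance is accumulated from the attention probabilities it receives, and the least-attended tokens are dropped before later layers see them. The PyTorch-style function below is an illustrative sketch of that idea, not the SpAtten accelerator design; prune_tokens, keep_ratio, and cum_scores are hypothetical names.

```python
# Illustrative sketch of cascade token pruning driven by cumulative attention
# scores (hypothetical helper; not the SpAtten hardware implementation).
import torch

def prune_tokens(x, attn_probs, keep_ratio=0.5, cum_scores=None):
    """x: (batch, seq, dim) tokens; attn_probs: (batch, heads, seq, seq)."""
    # Importance of each key token = attention it receives, summed over
    # heads and query positions.
    scores = attn_probs.sum(dim=(1, 2))                  # (batch, seq)
    if cum_scores is not None:
        scores = scores + cum_scores                     # accumulate across layers
    keep = max(1, int(x.size(1) * keep_ratio))
    idx = scores.topk(keep, dim=1).indices.sort(dim=1).values
    x_kept = x.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
    return x_kept, scores.gather(1, idx)

# Pass the returned scores to the next layer as cum_scores so pruning
# decisions compound across layers ("cascade" pruning).
```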

An overview of energy-efficient hardware accelerators for on-device deep-neural-network training

J Lee, HJ Yoo - IEEE Open Journal of the Solid-State Circuits …, 2021 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have been widely used in various artificial intelligence (AI)
applications due to their overwhelming performance. Furthermore, recently, several …

EXACT: Scalable graph neural networks training via extreme activation compression

Z Liu, K Zhou, F Yang, L Li, R Chen… - … Conference on Learning …, 2021 - openreview.net
Training Graph Neural Networks (GNNs) on large graphs is a fundamental challenge due to
high memory usage, which is dominated by activations (e.g., node embeddings) …
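
A common way to realize activation compression during training is a custom autograd function that stores a low-precision copy of the activation for the backward pass. The sketch below uses plain per-tensor int8 quantization and an illustrative QuantizedLinear name; EXACT's scheme additionally uses random projection and finer-grained quantization, which are omitted here.

```python
# Minimal sketch: keep an int8 copy of the input activation for backward.
import torch

class QuantizedLinear(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, weight):
        # Per-tensor symmetric int8 quantization of the saved activation.
        scale = x.abs().max().clamp_min(1e-8) / 127.0
        ctx.save_for_backward((x / scale).round().to(torch.int8), weight)
        ctx.scale = scale
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x_q, weight = ctx.saved_tensors
        x_hat = x_q.float() * ctx.scale          # lossy reconstruction
        grad_x = grad_out @ weight
        grad_w = grad_out.t() @ x_hat            # weight grad uses compressed copy
        return grad_x, grad_w

w = torch.randn(8, 16, requires_grad=True)
x = torch.randn(32, 16, requires_grad=True)
QuantizedLinear.apply(x, w).sum().backward()     # gradients flow as usual
```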

Rep-Net: Efficient on-device learning via feature reprogramming

L Yang, AS Rakin, D Fan - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
Transfer learning, where the goal is to transfer well-trained deep learning models from a
primary source task to a new task, is a crucial learning scheme for on-device machine …
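
The on-device recipe the title points to can be sketched as a frozen pretrained backbone plus a small trainable path that rewrites its features before a new head, so gradients and optimizer state touch only a few parameters. The module below is an illustrative stand-in, not the Rep-Net architecture; ReprogrammedModel and its layer sizes are assumptions.

```python
# Hypothetical sketch: freeze the backbone, train only a small
# feature-reprogramming path and a new classification head.
import torch
import torch.nn as nn

class ReprogrammedModel(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False               # no backbone updates
        self.reprogram = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():                     # no backbone activations kept
            feats = self.backbone(x)
        return self.head(feats + self.reprogram(feats))
```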

Back Razor: Memory-efficient transfer learning by self-sparsified backpropagation

Z Jiang, X Chen, X Huang, X Du… - Advances in neural …, 2022 - proceedings.neurips.cc
Transfer learning from models trained on large datasets to customized downstream tasks
has been widely used, as the pre-trained model can greatly boost the generalizability …
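
Self-sparsified backpropagation can be sketched as follows: the forward pass uses the dense activation, but only a top-k pruned copy is saved and later used to form the weight gradient. The autograd function below is a simplified illustration under that reading, not Back Razor's implementation; SparseSaveLinear and keep_ratio are hypothetical names.

```python
# Simplified sketch: save only the top-k entries of the activation for backward.
import torch

class SparseSaveLinear(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, weight, keep_ratio=0.1):
        k = max(1, int(x.numel() * keep_ratio))
        thresh = x.abs().flatten().topk(k).values.min()
        x_sparse = torch.where(x.abs() >= thresh, x, torch.zeros_like(x))
        ctx.save_for_backward(x_sparse.to_sparse(), weight)  # sparse storage
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x_sparse, weight = ctx.saved_tensors
        grad_x = grad_out @ weight
        grad_w = grad_out.t() @ x_sparse.to_dense()          # uses pruned copy
        return grad_x, grad_w, None                          # no grad for keep_ratio
```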

AC-GC: Lossy activation compression with guaranteed convergence

RD Evans, T Aamodt - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Parallel hardware devices (e.g., graphics processing units) have limited high-bandwidth
memory capacity. This negatively impacts the training of deep neural networks (DNNs) by …

COMET: A novel memory-efficient deep learning training framework by using error-bounded lossy compression

S Jin, C Zhang, X Jiang, Y Feng, H Guan, G Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Training wide and deep neural networks (DNNs) requires large amounts of storage resources
such as memory because the intermediate activation data must be saved in memory …
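
Unlike fixed-bit-width quantization, an error-bounded lossy compressor lets the user cap the pointwise reconstruction error. A minimal sketch, assuming a simple uniform quantizer rather than the dedicated compressor COMET builds on:

```python
# Sketch of error-bounded compression: |x - decompress(compress(x))| <= error_bound.
import torch

def compress(x: torch.Tensor, error_bound: float):
    step = 2.0 * error_bound
    return torch.round(x / step).to(torch.int32), step   # integer codes + step

def decompress(q: torch.Tensor, step: float):
    return q.float() * step                              # error <= step / 2

x = torch.randn(1024)
q, step = compress(x, error_bound=1e-2)
assert (x - decompress(q, step)).abs().max() <= 1e-2 + 1e-6
```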

Fine-tuning language models over slow networks using activation quantization with guarantees

J Wang, B Yuan, L Rimanic, Y He… - Advances in …, 2022 - proceedings.neurips.cc
Communication compression is a crucial technique for modern distributed learning systems
to alleviate their communication bottlenecks over slower networks. Despite recent intensive …
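
Here the compressed objects are the activations exchanged between pipeline stages rather than gradients. One simplified reading, quantizing the change of an activation relative to a cached reconstruction so the transmitted payload stays small, is sketched below; this is an assumption for illustration, not necessarily the paper's exact guarantee-preserving scheme, and DeltaQuantizer, encode, and decode are hypothetical names.

```python
# Hypothetical sketch: int8-quantize the change of an activation w.r.t. a
# cached reconstruction before sending it over a slow link; sender and
# receiver each keep their own cache so both track the same values.
import torch

class DeltaQuantizer:
    def __init__(self):
        self.cache = None                                 # last reconstruction

    def encode(self, x: torch.Tensor):
        ref = self.cache if self.cache is not None else torch.zeros_like(x)
        delta = x - ref
        scale = delta.abs().max().clamp_min(1e-8) / 127.0
        q = (delta / scale).round().to(torch.int8)        # ~4x smaller payload
        self.cache = ref + q.float() * scale              # mirror the receiver
        return q, scale

    def decode(self, q: torch.Tensor, scale: torch.Tensor):
        ref = self.cache if self.cache is not None else torch.zeros(q.shape)
        out = ref + q.float() * scale
        self.cache = out
        return out

sender, receiver = DeltaQuantizer(), DeltaQuantizer()
act = torch.randn(4, 64)
payload = sender.encode(act)                              # transmit this
recovered = receiver.decode(*payload)                     # close to act
```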

Recent developments in low-power AI accelerators: A survey

C Åleskog, H Grahn, A Borg - Algorithms, 2022 - mdpi.com
As machine learning and AI continue to rapidly develop, and with the ever-closer end of
Moore's law, new avenues and novel ideas in architecture design are being created and …