A manifesto for future generation cloud computing: Research directions for the next decade

R Buyya, SN Srirama, G Casale, R Calheiros… - ACM computing …, 2018 - dl.acm.org
The Cloud computing paradigm has revolutionised the computer science horizon during the
past decade and has enabled the emergence of computing as the fifth utility. It has captured …

ProtTrans: Toward understanding the language of life through self-supervised learning

A Elnaggar, M Heinzinger, C Dallago… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Computational biology and bioinformatics provide vast data gold-mines from protein
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …

Large batch training of convolutional networks

Y You, I Gitman, B Ginsburg - arXiv preprint arXiv:1708.03888, 2017 - arxiv.org
The most natural way to speed up the training of large networks is to use data-parallelism on
multiple GPUs. To scale Stochastic Gradient (SG) based methods to more processors, one …

Drawing early-bird tickets: Towards more efficient training of deep networks

H You, C Li, P Xu, Y Fu, Y Wang, X Chen… - arXiv preprint arXiv …, 2019 - arxiv.org
(Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical
subnetworks) for dense, randomly initialized networks that can be trained alone to achieve …

The next generation of deep learning hardware: Analog computing

W Haensch, T Gokmen, R Puri - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org
Initially developed for gaming and 3-D rendering, graphics processing units (GPUs) were
recognized to be a good fit to accelerate deep learning training. Its simple mathematical …

TicTac: Accelerating distributed deep learning with communication scheduling

SH Hashemi, S Abdu Jyothi… - … of Machine Learning …, 2019 - proceedings.mlsys.org
State-of-the-art deep learning systems rely on iterative distributed training to tackle the
increasing complexity of models and input data. In this work, we identify an opportunity for …

AdaComp: Adaptive residual gradient compression for data-parallel distributed training

CY Chen, J Choi, D Brand, A Agrawal… - Proceedings of the …, 2018 - ojs.aaai.org
Highly distributed training of Deep Neural Networks (DNNs) on future compute platforms
(offering 100s of TeraOps/s of computational capacity) is expected to be severely …

Anatomy of high-performance deep learning convolutions on SIMD architectures

E Georganas, S Avancha, K Banerjee… - … Conference for High …, 2018 - ieeexplore.ieee.org
Convolution layers are prevalent in many classes of deep neural networks, including
Convolutional Neural Networks (CNNs) which provide state-of-the-art results for tasks like …

E2-Train: Training state-of-the-art CNNs with over 80% energy savings

Y Wang, Z Jiang, X Chen, P Xu… - Advances in Neural …, 2019 - proceedings.neurips.cc
Convolutional neural networks (CNNs) have been increasingly deployed to edge devices.
Hence, many efforts have been made towards efficient CNN inference on resource …