A manifesto for future generation cloud computing: Research directions for the next decade
The Cloud computing paradigm has revolutionised the computer science horizon during the
past decade and has enabled the emergence of computing as the fifth utility. It has captured …
ProtTrans: Toward understanding the language of life through self-supervised learning
A Elnaggar, M Heinzinger, C Dallago… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Computational biology and bioinformatics provide vast data gold-mines from protein
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …
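How an NLP-style language model consumes protein sequences can be made concrete with a short sketch. The code below assumes the Hugging Face transformers package and the publicly released Rostlab/prot_bert checkpoint, and mean-pools residue embeddings into a per-protein vector; it is an illustration, not the authors' pipeline.
```python
# Minimal sketch: per-protein embeddings from a pretrained protein language model.
# Assumes the Hugging Face `transformers` package and the Rostlab/prot_bert checkpoint.
import re
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "Rostlab/prot_bert"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, do_lower_case=False)
model = AutoModel.from_pretrained(MODEL_ID).eval()

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
# ProtBert-style models expect residues separated by spaces,
# with rare amino acids mapped to X.
spaced = " ".join(re.sub(r"[UZOB]", "X", sequence))

inputs = tokenizer(spaced, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_dim)

# Mean-pool residue embeddings (dropping special tokens) into one protein vector.
per_protein = hidden[0, 1:-1].mean(dim=0)
print(per_protein.shape)
```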
Large batch training of convolutional networks
The most natural way to speed up the training of large networks is to use data-parallelism on
multiple GPUs. To scale Stochastic Gradient (SG) based methods to more processors, one …
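A key ingredient proposed in this line of work for scaling SGD to large batches is a layer-wise adaptive learning rate. The NumPy sketch below illustrates one such rule, a LARS-style trust ratio that scales each layer's step by the ratio of its weight norm to its gradient norm; it is a simplified illustration, not the paper's exact algorithm or hyperparameters.
```python
# NumPy sketch of a LARS-style layer-wise learning-rate rule for large-batch SGD:
# each layer's step is scaled by ||w|| / ||g||, so layers whose gradients are small
# relative to their weights are not under-trained. Illustrative only.
import numpy as np

def lars_step(weights, grads, base_lr=0.1, weight_decay=1e-4,
              trust_coef=0.001, eps=1e-8):
    """Apply one layer-wise adaptive SGD step in place."""
    for w, g in zip(weights, grads):
        g = g + weight_decay * w                      # add L2 regularization
        w_norm, g_norm = np.linalg.norm(w), np.linalg.norm(g)
        trust_ratio = trust_coef * w_norm / (g_norm + eps) if w_norm > 0 else 1.0
        w -= base_lr * trust_ratio * g                # per-layer scaled update

# Toy usage: two "layers" with very different gradient scales.
weights = [np.random.randn(64, 32), np.random.randn(10, 64)]
grads = [0.01 * np.random.randn(64, 32), 5.0 * np.random.randn(10, 64)]
lars_step(weights, grads)
```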
Drawing early-bird tickets: Towards more efficient training of deep networks
(Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical
subnetworks) for dense, randomly initialized networks that can be trained alone to achieve …
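The "ticket" idea can be illustrated with simple global magnitude pruning: keep only the largest-magnitude weights and mask out the rest, yielding a small subnetwork that can then be retrained on its own. The toy NumPy sketch below shows that masking step only; the paper's early-bird detection criterion is not reproduced.
```python
# Toy sketch of drawing a "ticket": keep only the largest-magnitude weights of a
# (partially) trained network, producing a sparse subnetwork mask for retraining.
import numpy as np

def draw_ticket(weights, keep_ratio=0.2):
    """Return binary masks keeping the top `keep_ratio` weights by global magnitude."""
    flat = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(flat, 1.0 - keep_ratio)   # global magnitude cutoff
    return [(np.abs(w) >= threshold).astype(w.dtype) for w in weights]

weights = [np.random.randn(128, 784), np.random.randn(10, 128)]
masks = draw_ticket(weights, keep_ratio=0.2)
sparsity = 1.0 - sum(m.sum() for m in masks) / sum(m.size for m in masks)
print(f"pruned {sparsity:.0%} of weights")            # roughly 80% pruned
```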
The next generation of deep learning hardware: Analog computing
Initially developed for gaming and 3-D rendering, graphics processing units (GPUs) were
recognized to be a good fit to accelerate deep learning training. Its simple mathematical …
TicTac: Accelerating distributed deep learning with communication scheduling
SH Hashemi, S Abdu Jyothi… - … of Machine Learning …, 2019 - proceedings.mlsys.org
State-of-the-art deep learning systems rely on iterative distributed training to tackle the
increasing complexity of models and input data. In this work, we identify an opportunity for …
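The intuition behind communication scheduling can be shown with a toy timing model: the next iteration's forward pass consumes parameters from the input layer onward, while backpropagation emits gradients in the opposite order, so reordering transfers improves overlap. The per-layer times and single-link model below are assumptions for illustration, not the system described in the paper.
```python
# Toy simulation of why transfer ordering matters in data-parallel training.
def iteration_time(send_order, comm=(4, 4, 4, 4), fwd=(3, 3, 3, 3)):
    """Time until the next forward pass finishes, given the parameter-transfer order."""
    n = len(comm)
    arrival, t = [0.0] * n, 0.0
    for layer in send_order:            # one shared link, transfers serialized
        t += comm[layer]
        arrival[layer] = t
    done = 0.0
    for layer in range(n):              # forward pass waits for each layer's parameters
        done = max(done, arrival[layer]) + fwd[layer]
    return done

layers = list(range(4))
print("backprop order:", iteration_time(reversed(layers)))  # 28.0: poor overlap
print("forward order :", iteration_time(layers))            # 19.0: better overlap
```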
AdaComp: Adaptive residual gradient compression for data-parallel distributed training
Highly distributed training of Deep Neural Networks (DNNs) on future compute platforms
(offering 100s of TeraOps/s of computational capacity) is expected to be severely …
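AdaComp belongs to the family of residual (error-feedback) gradient compression schemes. The sketch below shows the generic pattern, sending only the largest-magnitude gradient entries and carrying the unsent remainder into the next step; the adaptive selection rule that gives AdaComp its name is not reproduced.
```python
# Sketch of residual gradient compression for data-parallel training: transmit only
# the top-k gradient entries and fold the unsent remainder into the next step.
import numpy as np

class ResidualCompressor:
    def __init__(self, shape, keep_ratio=0.01):
        self.residual = np.zeros(shape)
        self.keep_ratio = keep_ratio

    def compress(self, grad):
        """Return (indices, values) of the entries actually transmitted."""
        acc = grad + self.residual                      # add carried-over error
        k = max(1, int(self.keep_ratio * acc.size))
        idx = np.argpartition(np.abs(acc).ravel(), -k)[-k:]
        values = acc.ravel()[idx]
        self.residual = acc.copy()
        self.residual.ravel()[idx] = 0.0                # sent entries leave the residual
        return idx, values

comp = ResidualCompressor(shape=(1024,), keep_ratio=0.01)
for _ in range(3):
    idx, vals = comp.compress(np.random.randn(1024))
    print(len(idx), "of 1024 entries sent")
```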
Anatomy of high-performance deep learning convolutions on SIMD architectures
Convolution layers are prevalent in many classes of deep neural networks, including
Convolutional Neural Networks (CNNs) which provide state-of-the-art results for tasks like …
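What a convolution layer computes, and which loop nest SIMD-oriented implementations must block and vectorize, can be seen from a naive direct convolution. The NumPy sketch below is illustrative only and is not the paper's optimized kernel.
```python
# Naive direct convolution, showing the loop nest (output channels x output pixels x
# input channels x kernel window) that high-performance SIMD kernels block and reorder.
import numpy as np

def conv2d_direct(x, w):
    """x: (C_in, H, W) input, w: (C_out, C_in, KH, KW) filters -> (C_out, H', W')."""
    c_in, h, width = x.shape
    c_out, _, kh, kw = w.shape
    out = np.zeros((c_out, h - kh + 1, width - kw + 1))
    for co in range(c_out):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                # innermost reduction over channels and kernel window is the
                # natural vectorization target
                out[co, i, j] = np.sum(x[:, i:i + kh, j:j + kw] * w[co])
    return out

x = np.random.randn(3, 8, 8)
w = np.random.randn(4, 3, 3, 3)
print(conv2d_direct(x, w).shape)  # (4, 6, 6)
```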
E2-Train: Training state-of-the-art CNNs with over 80% energy savings
Convolutional neural networks (CNNs) have been increasingly deployed to edge devices.
Hence, many efforts have been made towards efficient CNN inference on resource …