Generative pretraining from pixels

M Chen, A Radford, R Child, J Wu… - International …, 2020 - proceedings.mlr.press
Inspired by progress in unsupervised representation learning for natural language, we
examine whether similar models can learn useful representations for images. We train a …

Analysis of Large-Scale Multi-Tenant GPU clusters for DNN training workloads

M Jeon, S Venkataraman, A Phanishayee… - 2019 USENIX Annual …, 2019 - usenix.org
With widespread advances in machine learning, a number of large enterprises are
beginning to incorporate machine learning models across a number of products. These …

To understand deep learning we need to understand kernel learning

M Belkin, S Ma, S Mandal - International Conference on …, 2018 - proceedings.mlr.press
Generalization performance of classifiers in deep learning has recently become a subject of
intense study. Deep models, which are typically heavily over-parametrized, tend to fit the …

Zico: Efficient GPU memory sharing for concurrent DNN training

G Lim, J Ahn, W Xiao, Y Kwon, M Jeon - 2021 USENIX Annual Technical …, 2021 - usenix.org
GPUs are the workhorse in modern server infrastructure fueling advances in a number of
compute-intensive workloads such as deep neural network (DNN) training. Several recent …

Shallow neural network with kernel approximation for prediction problems in highly demanding data networks

M Lopez-Martin, B Carro… - Expert Systems with …, 2019 - Elsevier
Intrusion detection and network traffic classification are two of the main research
applications of machine learning to highly demanding data networks, e.g., IoT/sensors …

Multi-tenant GPU clusters for deep learning workloads: Analysis and implications

M Jeon, S Venkataraman, J Qian… - Technical report …, 2018 - microsoft.com
With widespread advances in machine learning, a number of large enterprises are
beginning to incorporate machine learning models across a number of products. These …

Diving into the shallows: a computational perspective on large-scale shallow learning

S Ma, M Belkin - Advances in neural information processing …, 2017 - proceedings.neurips.cc
Remarkable recent success of deep neural networks has not been easy to analyze
theoretically. It has been particularly hard to disentangle relative significance of architecture …

Gaussian quadrature for kernel features

T Dao, CM De Sa, C Ré - Advances in neural information …, 2017 - proceedings.neurips.cc
Kernel methods have recently attracted resurgent interest, showing performance competitive
with deep neural networks in tasks such as speech recognition. The random Fourier features …

Quantum kitchen sinks: An algorithm for machine learning on near-term quantum computers

CM Wilson, JS Otterbach, N Tezak, RS Smith… - arXiv preprint arXiv …, 2018 - arxiv.org
Noisy intermediate-scale quantum computing devices are an exciting platform for the
exploration of the power of near-term quantum applications. Performing nontrivial tasks in …

Low-precision random Fourier features for memory-constrained kernel approximation

J Zhang, A May, T Dao, C Ré - The 22nd International …, 2019 - proceedings.mlr.press
We investigate how to train kernel approximation methods that generalize well under a
memory budget. Building on recent theoretical work, we define a measure of kernel …