Sustainable AI: Environmental implications, challenges and opportunities
CJ Wu, R Raghavendra, U Gupta… - Proceedings of …, 2022 - proceedings.mlsys.org
This paper explores the environmental impact of the super-linear growth trends for AI from a
holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the …
Transformers in vision: A survey
Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …
GPT3.int8(): 8-bit matrix multiplication for transformers at scale
Large language models have been widely adopted but require significant GPU memory for
inference. We develop a procedure for Int8 matrix multiplication for feed-forward and …
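A minimal sketch of the kind of Int8 matrix multiply the snippet describes, using symmetric row- and column-wise absmax quantization with int32 accumulation; the paper's mixed-precision handling of outlier features is omitted, so treat this as the plain-quantization baseline:

```python
import torch

def quantize_absmax(x: torch.Tensor, dim: int):
    """Symmetric absmax quantization to int8 along one dimension."""
    scale = x.abs().amax(dim=dim, keepdim=True).clamp(min=1e-8) / 127.0
    q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def int8_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Approximate a @ b: quantize a row-wise and b column-wise,
    accumulate in int32, then dequantize back to float."""
    qa, sa = quantize_absmax(a, dim=1)   # per-row scales, shape (m, 1)
    qb, sb = quantize_absmax(b, dim=0)   # per-column scales, shape (1, n)
    acc = qa.to(torch.int32) @ qb.to(torch.int32)  # int32 accumulation
    return acc.to(a.dtype) * (sa * sb)   # dequantize with outer product of scales

a, b = torch.randn(64, 128), torch.randn(128, 32)
print(f"mean abs error: {(int8_matmul(a, b) - a @ b).abs().mean():.4f}")
```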
Tip-Adapter: Training-free CLIP-Adapter for better vision-language modeling
Contrastive Vision-Language Pre-training, known as CLIP, has provided a new paradigm for
learning visual representations by using large-scale contrastive image-text pairs. It shows …
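As a rough illustration of the training-free idea, a common reading of Tip-Adapter caches the few-shot CLIP image features as keys and their one-hot labels as values, then blends cache affinities into the zero-shot logits. In the sketch below, alpha, beta, and the 100x logit scale are assumed defaults rather than values taken from this snippet:

```python
import torch
import torch.nn.functional as F

def tip_adapter_logits(f, cache_keys, cache_values, clip_weights,
                       alpha: float = 1.0, beta: float = 5.5):
    """Training-free cache-model logits in the style of Tip-Adapter.

    f            : (b, d)  L2-normalized CLIP features for test images
    cache_keys   : (n, d)  L2-normalized CLIP features of the few-shot set
    cache_values : (n, c)  one-hot labels of the few-shot set
    clip_weights : (c, d)  L2-normalized CLIP text embeddings per class
    """
    zero_shot = 100.0 * f @ clip_weights.t()          # zero-shot CLIP logits
    affinity = torch.exp(-beta * (1.0 - f @ cache_keys.t()))  # key affinities
    return zero_shot + alpha * affinity @ cache_values        # blended logits

b, n, d, c = 4, 16, 512, 10
f = F.normalize(torch.randn(b, d), dim=-1)
keys = F.normalize(torch.randn(n, d), dim=-1)
values = F.one_hot(torch.randint(0, c, (n,)), c).float()
text = F.normalize(torch.randn(c, d), dim=-1)
print(tip_adapter_logits(f, keys, values, text).shape)  # torch.Size([4, 10])
```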
ResMLP: Feedforward networks for image classification with data-efficient training
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image
classification. It is a simple residual network that alternates (i) a linear layer in which image …
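The alternation the abstract describes maps onto a short block: a linear layer that mixes information across patches, then a per-patch MLP, each wrapped in a residual connection. A minimal sketch (the paper's LayerScale-style residual scaling is omitted):

```python
import torch
from torch import nn

class Affine(nn.Module):
    """Element-wise learnable scale and shift; ResMLP uses this in place
    of normalization layers."""
    def __init__(self, dim):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return self.alpha * x + self.beta

class ResMLPBlock(nn.Module):
    """One residual block: (i) a linear layer mixing information across
    patches, (ii) a two-layer MLP applied to each patch independently."""
    def __init__(self, dim, num_patches, expansion=4):
        super().__init__()
        self.aff1 = Affine(dim)
        self.cross_patch = nn.Linear(num_patches, num_patches)
        self.aff2 = Affine(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, expansion * dim),
            nn.GELU(),
            nn.Linear(expansion * dim, dim),
        )

    def forward(self, x):                      # x: (batch, patches, dim)
        y = self.cross_patch(self.aff1(x).transpose(1, 2)).transpose(1, 2)
        x = x + y                              # cross-patch interaction
        return x + self.mlp(self.aff2(x))      # per-patch MLP

x = torch.randn(2, 196, 384)
print(ResMLPBlock(384, 196)(x).shape)  # torch.Size([2, 196, 384])
```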
Medical transformer: Gated axial-attention for medical image segmentation
JMJ Valanarasu, P Oza, I Hacihaliloglu… - Medical image computing …, 2021 - Springer
Over the past decade, deep convolutional neural networks have been widely adopted for
medical image segmentation and shown to achieve adequate performance. However, due …
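The title's gated axial-attention factorizes 2D self-attention into height- and width-axis passes and places learnable gates on its positional terms. The sketch below compresses that into one simplified single-head axial pass with a gated positional bias, so everything beyond the gating idea itself is an assumption:

```python
import torch
from torch import nn

class GatedAxialAttention(nn.Module):
    """Simplified single-head self-attention along one spatial axis, with
    a learnable gate on a positional bias (a rough stand-in for the gates
    the paper places on its positional terms)."""
    def __init__(self, dim, axis_len):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.pos_bias = nn.Parameter(torch.zeros(axis_len, axis_len))
        self.gate = nn.Parameter(torch.zeros(1))   # gate starts near closed
        self.scale = dim ** -0.5

    def forward(self, x):              # x: (batch, axis_len, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn + torch.sigmoid(self.gate) * self.pos_bias
        return torch.softmax(attn, dim=-1) @ v

# Attend along the width axis of a (batch, H, W, dim) feature map by
# folding the height axis into the batch dimension.
b, h, w, d = 2, 8, 8, 32
x = torch.randn(b, h, w, d)
attn_w = GatedAxialAttention(d, axis_len=w)
out = attn_w(x.reshape(b * h, w, d)).reshape(b, h, w, d)
print(out.shape)  # torch.Size([2, 8, 8, 32])
```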
Is space-time attention all you need for video understanding?
Training. We train our model for 15 epochs with an initial learning rate of 0.005, which is
divided by 10 at epochs 11 and 14. During training, we first resize the shorter side of the …
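The quoted schedule maps directly onto a standard step scheduler. A minimal sketch; only the learning-rate numbers come from the text, while the model, optimizer choice, and momentum value are placeholders:

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import MultiStepLR

model = nn.Linear(10, 2)  # stand-in; the snippet specifies only the schedule
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
# lr 0.005 for epochs 1-10, 0.0005 for epochs 11-13, 0.00005 for 14-15,
# i.e. divided by 10 entering epochs 11 and 14.
scheduler = MultiStepLR(optimizer, milestones=[10, 13], gamma=0.1)

for epoch in range(15):
    # ... one training epoch over the video clips would run here ...
    scheduler.step()
```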
Differentially private fine-tuning of language models
We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-
scale pre-trained language models, which achieve the state-of-the-art privacy versus utility …
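The abstract promises simpler and sparser algorithms, but the snippet does not spell them out. As background, here is the standard DP-SGD step (per-example gradient clipping plus Gaussian noise) that such methods build on; the learning rate, clip norm, and noise multiplier are placeholder values:

```python
import torch
from torch import nn

def dp_sgd_step(model, loss_fn, xs, ys, lr=0.1, clip=1.0, noise_mult=1.0):
    """One step of vanilla DP-SGD: clip each per-example gradient to norm
    `clip`, sum, add Gaussian noise scaled by noise_mult * clip, average."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):                      # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
        factor = (clip / (norm + 1e-6)).clamp(max=1.0)
        for s, p in zip(summed, params):
            s += p.grad * factor                  # accumulate clipped grad
    n = len(xs)
    with torch.no_grad():
        for s, p in zip(summed, params):
            s += noise_mult * clip * torch.randn_like(s)  # Gaussian noise
            p -= lr * s / n                       # noisy averaged update

model = nn.Linear(4, 2)
xs, ys = torch.randn(8, 4), torch.randint(0, 2, (8,))
dp_sgd_step(model, nn.CrossEntropyLoss(), xs, ys)
```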
wav2vec 2.0: A framework for self-supervised learning of speech representations
We show for the first time that learning powerful representations from speech audio alone
followed by fine-tuning on transcribed speech can outperform the best semi-supervised …
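The snippet states the result rather than the mechanism; for context, wav2vec 2.0's pre-training objective is a contrastive task that picks the true quantized latent out of a set of distractors. A minimal sketch of that InfoNCE-style loss, where the temperature of 0.1 and the shapes are assumptions and the paper's quantization and diversity loss are omitted:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(context, targets, distractors, temp=0.1):
    """Identify the true quantized latent among distractors using cosine
    similarity between the context output and the candidates.

    context     : (b, t, d)    transformer outputs at masked positions
    targets     : (b, t, d)    true quantized latents for those positions
    distractors : (b, t, k, d) negatives sampled from other positions
    """
    cands = torch.cat([targets.unsqueeze(2), distractors], dim=2)  # (b,t,1+k,d)
    sims = F.cosine_similarity(context.unsqueeze(2), cands, dim=-1) / temp
    # The true target sits at index 0 of the candidate axis.
    labels = torch.zeros(sims.shape[:2], dtype=torch.long)
    return F.cross_entropy(sims.flatten(0, 1), labels.flatten())

b, t, k, d = 2, 5, 10, 64
loss = contrastive_loss(torch.randn(b, t, d), torch.randn(b, t, d),
                        torch.randn(b, t, k, d))
print(loss.item())
```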
Linformer: Self-attention with linear complexity
Large transformer models have shown extraordinary success in achieving state-of-the-art
results in many natural language processing applications. However, training and deploying …
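The linear complexity in the title comes from projecting the length-n key and value sequences down to a fixed length k before attention. A minimal single-head sketch under that reading; head splitting and the paper's parameter-sharing options are omitted:

```python
import torch
from torch import nn

class LinformerSelfAttention(nn.Module):
    """Single-head self-attention with Linformer's low-rank trick: learned
    projections compress length-n key/value sequences to length k, so the
    attention map is n x k instead of n x n."""
    def __init__(self, dim, seq_len, k=64):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.proj_k = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)
        self.proj_v = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)
        self.scale = dim ** -0.5

    def forward(self, x):                       # x: (batch, n, dim)
        q = self.to_q(x)
        k = self.proj_k @ self.to_k(x)          # (batch, k, dim)
        v = self.proj_v @ self.to_v(x)          # (batch, k, dim)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v                         # O(n * k) attention

x = torch.randn(2, 512, 128)
print(LinformerSelfAttention(128, seq_len=512, k=64)(x).shape)
```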