A survey of architectural approaches for data compression in cache and main memory systems
As the number of cores on a chip increases and key applications become even more data-
intensive, memory systems in modern processors have to deal with increasingly large …
Approximate computing: A survey
As one of the most promising energy-efficient computing paradigms, approximate computing
has gained a lot of research attention in the past few years. This paper presents a survey of …
Approximate communication: Techniques for reducing communication bottlenecks in large-scale parallel systems
F Betzel, K Khatamifard, H Suresh, DJ Lilja… - ACM Computing …, 2018 - dl.acm.org
Approximate computing has gained research attention recently as a way to increase energy
efficiency and/or performance by exploiting some applications' intrinsic error resiliency …
GPUWattch: Enabling energy optimizations in GPGPUs
General-purpose GPUs (GPGPUs) are becoming prevalent in mainstream computing, and
performance per watt has emerged as a more crucial evaluation metric than peak …
Compressing DMA engine: Leveraging activation sparsity for training deep neural networks
Popular deep learning frameworks require users to fine-tune their memory usage so that the
training data of a deep neural network (DNN) fits within the GPU physical memory. Prior …
Linearly compressed pages: A low-complexity, low-latency main memory compression framework
Data compression is a promising approach for meeting the increasing memory capacity
demands expected in future systems. Unfortunately, existing compression algorithms do not …
Warped-compression: Enabling power efficient GPUs through register compression
This paper presents Warped-Compression, a warp-level register compression scheme for
reducing GPU power consumption. This work is motivated by the observation that the …
People, penguins and petri dishes: Adapting object counting models to new visual domains and object types without forgetting
In this paper we propose a technique to adapt a convolutional neural network (CNN) based
object counter to additional visual domains and object types while still preserving the …
A framework for memory oversubscription management in graphics processing units
C Li, R Ausavarungnirun, CJ Rossbach… - Proceedings of the …, 2019 - dl.acm.org
Modern discrete GPUs support unified memory and demand paging. Automatic
management of data movement between CPU memory and GPU memory dramatically …
A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps
Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent
execution of thousands of threads. Unfortunately, different bottlenecks during execution and …