A survey of architectural approaches for data compression in cache and main memory systems

S Mittal, JS Vetter - IEEE Transactions on Parallel and …, 2015 - ieeexplore.ieee.org
As the number of cores on a chip increases and key applications become even more
data-intensive, memory systems in modern processors have to deal with increasingly large …

Approximate computing: A survey

Q Xu, T Mytkowicz, NS Kim - IEEE Design & Test, 2015 - ieeexplore.ieee.org
As one of the most promising energy-efficient computing paradigms, approximate computing
has gained a lot of research attention in the past few years. This paper presents a survey of …

Approximate communication: Techniques for reducing communication bottlenecks in large-scale parallel systems

F Betzel, K Khatamifard, H Suresh, DJ Lilja… - ACM Computing …, 2018 - dl.acm.org
Approximate computing has gained research attention recently as a way to increase energy
efficiency and/or performance by exploiting some applications' intrinsic error resiliency …

GPUWattch: Enabling energy optimizations in GPGPUs

J Leng, T Hetherington, A ElTantawy, S Gilani… - ACM SIGARCH …, 2013 - dl.acm.org
General-purpose GPUs (GPGPUs) are becoming prevalent in mainstream computing, and
performance per watt has emerged as a more crucial evaluation metric than peak …

Compressing DMA engine: Leveraging activation sparsity for training deep neural networks

M Rhu, M O'Connor, N Chatterjee… - … Symposium on High …, 2018 - ieeexplore.ieee.org
Popular deep learning frameworks require users to fine-tune their memory usage so that the
training data of a deep neural network (DNN) fits within the GPU physical memory. Prior …

Linearly compressed pages: A low-complexity, low-latency main memory compression framework

G Pekhimenko, V Seshadri, Y Kim, H Xin… - Proceedings of the 46th …, 2013 - dl.acm.org
Data compression is a promising approach for meeting the increasing memory capacity
demands expected in future systems. Unfortunately, existing compression algorithms do not …

Warped-compression: Enabling power efficient GPUs through register compression

S Lee, K Kim, G Koo, H Jeon, WW Ro… - ACM SIGARCH …, 2015 - dl.acm.org
This paper presents Warped-Compression, a warp-level register compression scheme for
reducing GPU power consumption. This work is motivated by the observation that the …

People, penguins and petri dishes: Adapting object counting models to new visual domains and object types without forgetting

M Marsden, K McGuinness, S Little… - Proceedings of the …, 2018 - openaccess.thecvf.com
In this paper we propose a technique to adapt a convolutional neural network (CNN) based
object counter to additional visual domains and object types while still preserving the …

A framework for memory oversubscription management in graphics processing units

C Li, R Ausavarungnirun, CJ Rossbach… - Proceedings of the …, 2019 - dl.acm.org
Modern discrete GPUs support unified memory and demand paging. Automatic
management of data movement between CPU memory and GPU memory dramatically …

A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps

N Vijaykumar, G Pekhimenko, A Jog… - ACM SIGARCH …, 2015 - dl.acm.org
Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent
execution of thousands of threads. Unfortunately, different bottlenecks during execution and …