LATTE-CC: Latency tolerance aware adaptive cache compression management for energy efficient GPUs

A Arunkumar, SY Lee… - … Symposium on High …, 2018 - ieeexplore.ieee.org
General-purpose GPU applications are significantly constrained by the efficiency of the
memory subsystem and the availability of data cache capacity on GPUs. Cache …

A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps

N Vijaykumar, G Pekhimenko, A Jog… - ACM SIGARCH …, 2015 - dl.acm.org
Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent
execution of thousands of threads. Unfortunately, different bottlenecks during execution and …

Compressed L1 data cache and L2 cache in GPGPUs

E Atoofian - 2016 IEEE 27th International Conference on …, 2016 - ieeexplore.ieee.org
General-Purpose Graphics Processing Units (GPGPUs) exploit several levels of caches to
hide latency of memory and provide data for thousands of simultaneously executing threads …

Exploiting adaptive data compression to improve performance and energy-efficiency of compute workloads in multi-GPU systems

MK Tavana, Y Sun, NB Agostini… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Graphics Processing Unit (GPU) performance has relied heavily on our ability to scale the
number of transistors on chip, in order to satisfy the ever-increasing demands for more …

E²MC: Entropy Encoding Based Memory Compression for GPUs

S Lal, J Lucas, B Juurlink - 2017 IEEE International Parallel and …, 2017 - ieeexplore.ieee.org
Modern Graphics Processing Units (GPUs) provide much higher off-chip memory bandwidth
than CPUs, but many GPU applications are still limited by memory bandwidth. Unfortunately …

SLC: Memory access granularity aware selective lossy compression for GPUs

S Lal, J Lucas, B Juurlink - 2019 Design, Automation & Test in …, 2019 - ieeexplore.ieee.org
Memory compression is a promising approach for reducing memory bandwidth
requirements and increasing performance, however, memory compression techniques often …

Toggle-aware compression for GPUs

G Pekhimenko, E Bolotin, M O'Connor… - IEEE Computer …, 2015 - ieeexplore.ieee.org
Memory bandwidth compression can be an effective way to achieve higher system
performance and energy efficiency in modern data-intensive applications by exploiting …

A case for toggle-aware compression for GPU systems

G Pekhimenko, E Bolotin, N Vijaykumar… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Data compression can be an effective method to achieve higher system performance and
energy efficiency in modern data-intensive applications by exploiting redundancy and data …

IACM: Integrated adaptive cache management for high-performance and energy-efficient GPGPU computing

KY Kim, J Park, W Baek - 2016 IEEE 34th International …, 2016 - ieeexplore.ieee.org
Hardware caches are widely employed in GPGPUs to achieve higher performance and
energy efficiency. Incorporating hardware caches in GPGPUs, however, does not …

A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps

N Vijaykumar, G Pekhimenko, A Jog, S Ghose… - arXiv preprint arXiv …, 2016 - arxiv.org
Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent
execution of thousands of threads. Unfortunately, different bottlenecks during execution and …