Compressed L1 data cache and L2 cache in GPGPUs

E Atoofian - 2016 IEEE 27th International Conference on …, 2016 - ieeexplore.ieee.org
General-Purpose Graphics Processing Units (GPGPUs) exploit several levels of caches to
hide latency of memory and provide data for thousands of simultaneously executing threads …

Reducing static and dynamic power of L1 data caches in GPGPUs

E Atoofian - 2014 IEEE International Parallel & Distributed …, 2014 - ieeexplore.ieee.org
With the widespread adoption of GPGPUs for general purpose computing domain, the size
of GPGPUs has grown quickly, making power consumption a major bottleneck. L1 data …

Many-thread aware compression in GPGPUs

E Atoofian - 2016 Intl IEEE Conferences on Ubiquitous …, 2016 - ieeexplore.ieee.org
Compression is a promising technique to increase effective capacity of caches. Due to
latency overhead of decompression, most of previous studies applied compression to lower …

Dual dictionary compression for the last level cache

A Lahiry, D Kaeli - 2017 IEEE International Conference on …, 2017 - ieeexplore.ieee.org
The performance of GPUs is rapidly improving as the top GPU vendors keep pushing the
boundaries of process technologies. While larger die sizes help improve performance given …

Approximate cache in GPGPUs

E Atoofian - ACM Transactions on Embedded Computing Systems …, 2020 - dl.acm.org
There is a growing number of application domains ranging from multimedia to machine
learning where a certain level of inexactness can be tolerated. For these applications …

LATTE-CC: Latency tolerance aware adaptive cache compression management for energy efficient GPUs

A Arunkumar, SY Lee… - … Symposium on High …, 2018 - ieeexplore.ieee.org
General-purpose GPU applications are significantly constrained by the efficiency of the
memory subsystem and the availability of data cache capacity on GPUs. Cache …

Data-type specific cache compression in GPGPUs

E Atoofian, S Rea - The Journal of Supercomputing, 2018 - Springer
In this paper, we evaluate compressibility of L1 data caches and L2 cache in
general-purpose graphics processing units (GPGPUs). Our proposed scheme is geared toward …

Coordinated static and dynamic cache bypassing for GPUs

X Xie, Y Liang, Y Wang, G Sun… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
The massive parallel architecture enables graphics processing units (GPUs) to boost
performance for a wide range of applications. Initially, GPUs only employ scratchpad …

ID-cache: instruction and memory divergence based cache management for GPUs

A Arunkumar, SY Lee, CJ Wu - 2016 IEEE international …, 2016 - ieeexplore.ieee.org
Modern graphic processing units (GPUs) are not only able to perform graphics rendering,
but also perform general purpose parallel computations (GPGPUs). It has been shown that …

Ctrl-C: Instruction-aware control loop based adaptive cache bypassing for GPUs

SY Lee, CJ Wu - 2016 IEEE 34th International Conference on …, 2016 - ieeexplore.ieee.org
The performance of general-purpose graphics processing units (GPGPUs) is often limited by
the efficiency of the memory subsystems, particularly the L1 data caches. Because of the …