Compressed L1 data cache and L2 cache in GPGPUs
E Atoofian - 2016 IEEE 27th International Conference on …, 2016 - ieeexplore.ieee.org
General-Purpose Graphics Processing Units (GPGPUs) exploit several levels of caches to
hide latency of memory and provide data for thousands of simultaneously executing threads …
Reducing static and dynamic power of L1 data caches in GPGPUs
E Atoofian - 2014 IEEE International Parallel & Distributed …, 2014 - ieeexplore.ieee.org
With the widespread adoption of GPGPUs for general purpose computing domain, the size
of GPGPUs has grown quickly, making power consumption a major bottleneck. L1 data …
Many-thread aware compression in GPGPUs
E Atoofian - 2016 Intl IEEE Conferences on Ubiquitous …, 2016 - ieeexplore.ieee.org
Compression is a promising technique to increase effective capacity of caches. Due to
latency overhead of decompression, most of previous studies applied compression to lower …
Dual dictionary compression for the last level cache
The performance of GPUs is rapidly improving as the top GPU vendors keep pushing the
boundaries of process technologies. While larger die sizes help improve performance given …
Approximate cache in GPGPUs
E Atoofian - ACM Transactions on Embedded Computing Systems …, 2020 - dl.acm.org
There is a growing number of application domains ranging from multimedia to machine
learning where a certain level of inexactness can be tolerated. For these applications …
Latte-CC: Latency tolerance aware adaptive cache compression management for energy efficient GPUs
A Arunkumar, SY Lee… - … Symposium on High …, 2018 - ieeexplore.ieee.org
General-purpose GPU applications are significantly constrained by the efficiency of the
memory subsystem and the availability of data cache capacity on GPUs. Cache …
Data-type specific cache compression in GPGPUs
E Atoofian, S Rea - The Journal of Supercomputing, 2018 - Springer
In this paper, we evaluate compressibility of L1 data caches and L2 cache in general-
purpose graphics processing units (GPGPUs). Our proposed scheme is geared toward …
Coordinated static and dynamic cache bypassing for GPUs
The massive parallel architecture enables graphics processing units (GPUs) to boost
performance for a wide range of applications. Initially, GPUs only employ scratchpad …
ID-cache: instruction and memory divergence based cache management for GPUs
A Arunkumar, SY Lee, CJ Wu - 2016 IEEE international …, 2016 - ieeexplore.ieee.org
Modern graphic processing units (GPUs) are not only able to perform graphics rendering,
but also perform general purpose parallel computations (GPGPUs). It has been shown that …
Ctrl-C: Instruction-aware control loop based adaptive cache bypassing for GPUs
SY Lee, CJ Wu - 2016 IEEE 34th International Conference on …, 2016 - ieeexplore.ieee.org
The performance of general-purpose graphics processing units (GPGPUs) is often limited by
the efficiency of the memory subsystems, particularly the L1 data caches. Because of the …