A survey of architectural approaches for data compression in cache and main memory systems
As the number of cores on a chip increases and key applications become even more
data-intensive, memory systems in modern processors have to deal with increasingly large …
Approximate communication: Techniques for reducing communication bottlenecks in large-scale parallel systems
F Betzel, K Khatamifard, H Suresh, DJ Lilja… - ACM Computing …, 2018 - dl.acm.org
Approximate computing has gained research attention recently as a way to increase energy
efficiency and/or performance by exploiting some applications' intrinsic error resiliency …
Base-delta-immediate compression: Practical data compression for on-chip caches
Cache compression is a promising technique to increase on-chip cache capacity and to
decrease on-chip and off-chip bandwidth usage. Unfortunately, directly applying well-known …
Design and evaluation of a hierarchical on-chip interconnect for next-generation CMPs
Performance and power consumption of an on-chip interconnect that forms the backbone of
chip multiprocessors (CMPs), are directly influenced by the underlying network topology …
APPROX-NoC: A data approximation framework for network-on-chip architectures
The trend of unsustainable power consumption and large memory bandwidth demands in
massively parallel multicore systems, with the advent of the big data era, has brought upon …
Characterizing and mitigating the impact of process variations on phase change based memory systems
W Zhang, T Li - Proceedings of the 42nd Annual IEEE/ACM …, 2009 - dl.acm.org
Dynamic Random Access Memory (DRAM) has been used in main memory design for
decades. However, DRAM consumes an increasing power budget and faces difficulties in …
Decoupled compressed cache: Exploiting spatial locality for energy-optimized compressed caching
S Sardashti, DA Wood - Proceedings of the 46th Annual IEEE/ACM …, 2013 - dl.acm.org
In multicore processor systems, last-level caches (LLCs) play a crucial role in reducing
system energy by i) filtering out expensive accesses to main memory and ii) reducing the …
A case for toggle-aware compression for GPU systems
Data compression can be an effective method to achieve higher system performance and
energy efficiency in modern data-intensive applications by exploiting redundancy and data …
A case for dynamic frequency tuning in on-chip networks
Performance and power are the first order design metrics for Network-on-Chips (NoCs) that
have become the de-facto standard in providing scalable communication backbones for …
Whole packet forwarding: Efficient design of fully adaptive routing algorithms for networks-on-chip
Routing algorithms for networks-on-chip (NoCs) typically only have a small number of virtual
channels (VCs) at their disposal. Limited VCs pose several challenges to the design of fully …