System, method, and computer program product for improving memory systems
MS Smith - US Patent 9,432,298, 2016 - Google Patents
H01L25/18—Assemblies consisting of a plurality of individual semiconductor or other solid
state devices; Multistep manufacturing processes thereof the devices being of types …
Transparent offloading and mapping (TOM) enabling programmer-transparent near-data processing in GPU systems
Main memory bandwidth is a critical bottleneck for modern GPU systems due to limited off-
chip pin bandwidth. 3D-stacked memory architectures provide a promising opportunity to …
[BOOK][B] Memory systems: cache, DRAM, disk
B Jacob, D Wang, S Ng - 2010 - books.google.com
Is your memory hierarchy stopping your microprocessor from performing at the high level it
should be? Memory Systems: Cache, DRAM, Disk shows you how to resolve this problem …
A case for exploiting subarray-level parallelism (SALP) in DRAM
Modern DRAMs have multiple banks to serve multiple memory requests in parallel.
However, when two requests go to the same bank, they have to be served serially …
Self-optimizing memory controllers: A reinforcement learning approach
Efficiently utilizing off-chip DRAM bandwidth is a critical issue in designing cost-effective,
high-performance chip multiprocessors (CMPs). Conventional memory controllers deliver …
Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems
O Mutlu, T Moscibroda - ACM SIGARCH Computer Architecture News, 2008 - dl.acm.org
In a chip-multiprocessor (CMP) system, the DRAM system is shared among cores. In a
shared DRAM system, requests from a thread can not only delay requests from other threads …
ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers
Modern chip multiprocessor (CMP) systems employ multiple memory controllers to control
access to main memory. The scheduling algorithm employed by these memory controllers …
Data reorganization in memory using 3D-stacked DRAM
In this paper we focus on common data reorganization operations such as shuffle,
pack/unpack, swap, transpose, and layout transformations. Although these operations …
Stall-time fair memory access scheduling for chip multiprocessors
O Mutlu, T Moscibroda - 40th Annual IEEE/ACM International …, 2007 - ieeexplore.ieee.org
DRAM memory is a major resource shared among cores in a chip multiprocessor (CMP)
system. Memory requests from different threads can interfere with each other. Existing …
ChargeCache: Reducing DRAM latency by exploiting row access locality
DRAM latency continues to be a critical bottleneck for system performance. In this work, we
develop a low-cost mechanism, called ChargeCache, that enables faster access to recently …