SIMDRAM: A framework for bit-serial SIMD processing using DRAM
N Hajinazar, GF Oliveira, S Gregorio… - Proceedings of the 26th …, 2021 - dl.acm.org
Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic
operations, addition). However, in order to enable full adoption of processing-using-DRAM …
DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …
A survey on deep learning hardware accelerators for heterogeneous HPC platforms
Recent trends in deep learning (DL) imposed hardware accelerators as the most viable
solution for several classes of high-performance computing (HPC) applications such as …
Casper: Accelerating stencil computations using near-cache processing
Stencil computations are commonly used in a wide variety of scientific applications, ranging
from large-scale weather prediction to solving partial differential equations. Stencil …
pLUTo: Enabling massively parallel computation in DRAM via lookup tables
Data movement between the main memory and the processor is a key contributor to
execution time and energy consumption in memory-intensive applications. This data …
MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data …
Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a
DRAM array's massive internal parallelism to execute very-wide (e.g., 16,384-262,144-bit …
Survey on memory management techniques in heterogeneous computing systems
A Hazarika, S Poddar… - IET Computers & Digital …, 2020 - Wiley Online Library
A major issue faced by data scientists today is how to scale up their processing infrastructure
to meet the challenge of big data and high‐performance computing (HPC) workloads. With …
Polynesia: Enabling High-Performance and Energy-Efficient Hybrid Transactional/Analytical Databases with Hardware/Software Co-Design
A growth in data volume, combined with increasing demand for real-time analysis (using the
most recent data), has resulted in the emergence of database systems that concurrently …
ALP: Alleviating CPU-memory data movement overheads in memory-centric systems
Partitioning applications between near-data processing (NDP) and host CPU cores causes
inter-segment data movement overhead, which is caused by moving data generated by one …
Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis
We experimentally analyze the computational capability of commercial off-the-shelf (COTS)
DRAM chips and the robustness of these capabilities under various timing delays between …