A survey of performance tuning techniques and tools for parallel applications
D Mustafa - IEEE Access, 2022 - ieeexplore.ieee.org
Automatic parallelization of sequential programs combined with auto-tuning is an alternative
to manual parallelization. With wider research directions and the increased number of …
The Landscape of Compute-near-memory and Compute-in-memory: A Research and Commercial Overview
In today's data-centric world, where data fuels numerous application domains, with machine
learning at the forefront, handling the enormous volume of data efficiently in terms of time …
Bring memristive in-memory computing into general-purpose machine learning: A perspective
H Zhou, J Chen, J Li, L Yang, Y Li, X Miao - APL Machine Learning, 2023 - pubs.aip.org
In-memory computing (IMC) using emerging nonvolatile devices has received considerable
attention due to its great potential for accelerating artificial neural networks and machine …
C4CAM: A Compiler for CAM-based In-memory Accelerators
Machine learning and data analytics applications increasingly suffer from the high latency
and energy consumption of conventional von Neumann architectures. Recently, several in …
Special Session - Non-Volatile Memories: Challenges and Opportunities for Embedded System Architectures with Focus on Machine Learning Applications
This paper explores the challenges and opportunities of integrating non-volatile memories
(NVMs) into embedded systems for machine learning. NVMs offer advantages such as …
CIM-MLC: A multi-level compilation stack for computing-in-memory accelerators
In recent years, various computing-in-memory (CIM) processors have been presented,
showing superior performance over traditional architectures. To unleash the potential of …
SongC: A compiler for hybrid near-memory and in-memory many-core architecture
Building hybrid systems that incorporate various processing-in-memory (PIM) devices and
processing-near-memory (PNM) technologies can offer complementary advantages in both …
XLA-NDP: Efficient Scheduling and Code Generation for Deep Learning Model Training on Near-Data Processing Memory
J Park, H Sung - IEEE Computer Architecture Letters, 2023 - ieeexplore.ieee.org
Deep learning (DL) model training must address the memory bottleneck to continue scaling.
Processing-in-memory approaches can be a viable solution as they move computations …
CINM (Cinnamon): A compilation infrastructure for heterogeneous compute in-memory and compute near-memory paradigms
The rise of data-intensive applications exposed the limitations of conventional processor-
centric von-Neumann architectures that struggle to meet the off-chip memory bandwidth …
Smoothing Disruption Across the Stack: Tales of Memory, Heterogeneity, & Compilers
Multiple research vectors represent possible paths to improved energy and performance
metrics at the application-level. There are active efforts with respect to emerging logic …