Trim: Enhancing processor-memory interfaces with scalable tensor reduction in memory

C Giannoula, I Fernandez, J Gómez-Luna… - ACM SIGMETRICS …, 2022 - dl.acm.org

Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …

Cited by 45 - Related articles - All 10 versions

[PDF] arxiv.org

Smartsage: training large-scale graph neural networks using in-storage processing architectures

Y Lee, J Chung, M Rhu - Proceedings of the 49th Annual International …, 2022 - dl.acm.org

Graph neural networks (GNNs) can extract features by learning both the representation of
each object (i.e., graph nodes) and the relationship across different objects (i.e., the edges …

Cited by 47 - Related articles - All 7 versions

Evaluating machine learning workloads on memory-centric computing systems

J Gómez-Luna, Y Guo, S Brocard… - … Analysis of Systems …, 2023 - ieeexplore.ieee.org

Training machine learning (ML) algorithms is a computationally intensive process, which is
frequently memory-bound due to repeatedly accessing large training datasets. As a result …

Cited by 31 - Related articles - All 3 versions

[PDF] washington.edu

RAMBDA: RDMA-driven Acceleration Framework for Memory-intensive µs-scale Datacenter Applications

Y Yuan, J Huang, Y Sun, T Wang… - … Symposium on High …, 2023 - ieeexplore.ieee.org

Responding to the "datacenter tax" and "killer microseconds" problems for memory-intensive
datacenter applications, diverse solutions including Smart NIC-based ones have been …

Cited by 26 - Related articles - All 6 versions

[PDF] cmu.edu

Dimm-link: Enabling efficient inter-dimm communication for near-memory processing

Z Zhou, C Li, F Yang, G Sun - 2023 IEEE International …, 2023 - ieeexplore.ieee.org

DIMM-based near-memory processing architectures (DIMM-NMP) have received growing
interest from both academia and industry. They have the advantages of large memory …

Cited by 18 - Related articles - All 3 versions

[PDF] arxiv.org

Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology

B Hyun, T Kim, D Lee, M Rhu - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

Processing-in-memory (PIM) has been explored for decades by computer architects, yet it
has never seen the light of day in real-world products due to its high design overheads and …

Cited by 19 - Related articles - All 5 versions

[PDF] arxiv.org

Training personalized recommendation systems from (GPU) scratch: Look forward not backwards

Y Kwon, M Rhu - Proceedings of the 49th Annual International …, 2022 - dl.acm.org

Personalized recommendation models (RecSys) are one of the most popular machine
learning workloads serviced by hyperscalers. A critical challenge of training RecSys is its …

Cited by 29 - Related articles - All 7 versions

[PDF] arxiv.org

Grow: A row-stationary sparse-dense gemm accelerator for memory-efficient graph convolutional neural networks

R Hwang, M Kang, J Lee, D Kam… - … Symposium on High …, 2023 - ieeexplore.ieee.org

Graph convolutional neural networks (GCNs) have emerged as a key technology in various
application domains where the input data is relational. A unique property of GCNs is that its …

Cited by 31 - Related articles - All 6 versions

[PDF] arxiv.org

Accelerating weather prediction using near-memory reconfigurable fabric

G Singh, D Diamantopoulos, J Gómez-Luna… - ACM Transactions on …, 2022 - dl.acm.org

Ongoing climate change calls for fast and accurate weather and climate modeling. However,
when solving large-scale weather prediction simulations, state-of-the-art CPU and GPU …

Cited by 30 - Related articles - All 8 versions

[PDF] arxiv.org

Mp-rec: Hardware-software co-design to enable multi-path recommendation

S Hsia, U Gupta, B Acun, N Ardalani, P Zhong… - Proceedings of the 28th …, 2023 - dl.acm.org

Deep learning recommendation systems serve personalized content under diverse tail-
latency targets and input-query loads. In order to do so, state-of-the-art recommendation …

Cited by 15 - Related articles - All 4 versions