OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures

M Shahrad, J Balkind, D Wentzlaff - … of the 52nd annual IEEE/ACM …, 2019 - dl.acm.org

Serverless computing is a rapidly growing cloud application model, popularized by
Amazon's Lambda platform. Serverless cloud services provide fine-grained provisioning of …

Cited by 227 · Related articles · All 9 versions

Crash consistency in encrypted non-volatile main memory systems

S Liu, A Kolli, J Ren, S Khan - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

Non-Volatile Main Memory (NVMM) systems provide high performance by directly
manipulating persistent data in-memory, but require crash consistency support to recover …

Cited by 107 · Related articles · All 6 versions

Adapt-NoC: A flexible network-on-chip design for heterogeneous manycore architectures

H Zheng, K Wang, A Louri - 2021 IEEE international symposium …, 2021 - ieeexplore.ieee.org

The increased computational capability in heterogeneous manycore architectures facilitates
the concurrent execution of many applications. This requires, among other things, a flexible …

Cited by 40 · Related articles · All 4 versions

Density tradeoffs of non-volatile memory as a replacement for SRAM based last level cache

K Korgaonkar, I Bhati, H Liu, J Gaur… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org

Increasing the capacity of the Last Level Cache (LLC) can help scale the memory wall. Due
to prohibitive area and leakage power, however, growing conventional SRAM LLC already …

Cited by 68 · Related articles · All 6 versions

Opportunistic computing in GPU architectures

A Pattnaik, X Tang, O Kayiran, A Jog, A Mishra… - Proceedings of the 46th …, 2019 - dl.acm.org

Data transfer overhead between computing cores and memory hierarchy has been a
persistent issue for von Neumann architectures and the problem has only become more …

Cited by 52 · Related articles · All 13 versions

Job scheduling for large-scale machine learning clusters

H Wang, Z Liu, H Shen - … of the 16th International Conference on …, 2020 - dl.acm.org

With the rapid proliferation of Machine Learning (ML) and Deep learning (DL) applications
running on modern platforms, it is crucial to satisfy application performance requirements …

Cited by 28 · Related articles · All 2 versions

AMPT-GA: automatic mixed precision floating point tuning for GPU applications

PV Kotipalli, R Singh, P Wood, I Laguna… - Proceedings of the ACM …, 2019 - dl.acm.org

Mixed precision computations improve high performance computing throughput for
applications that can tolerate decreased mathematical precision in their computations …

Cited by 39 · Related articles · All 4 versions

Machine learning feature based job scheduling for distributed machine learning clusters

H Wang, Z Liu, H Shen - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org

With the rapid proliferation of Machine Learning (ML) and Deep learning (DL) applications
running on modern platforms, it is crucial to satisfy application performance requirements …

Cited by 10 · Related articles · All 3 versions

Morpheus: Extending the last level cache capacity in GPU systems using idle GPU core resources

S Darabi, M Sadrosadati, N Akbarzadeh… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org

Graphics Processing Units (GPUs) are widely-used accelerators for data-parallel
applications. In many GPU applications, GPU memory bandwidth bottlenecks performance …

Cited by 10 · Related articles · All 6 versions

The advances, challenges and future possibilities of millimeter-wave chip-to-chip interconnections for multi-chip systems

A Ganguly, MM Ahmed, R Singh Narde… - Journal of Low Power …, 2018 - mdpi.com

With aggressive scaling of device geometries, density of manufacturing faults is expected to
increase. Therefore, yield of complex Multi-Processor Systems-on-Chips (MP-SoCs) will …

Cited by 37 · Related articles · All 5 versions