A survey of cache bypassing techniques

S Mittal - Journal of Low Power Electronics and Applications, 2016 - mdpi.com
With increasing core-count, the cache demand of modern processors has also increased.
However, due to strict area/power budgets and presence of poor data-locality workloads …

A survey of architectural approaches for improving GPGPU performance, programmability and heterogeneity

M Khairy, AG Wassal, M Zahran - Journal of Parallel and Distributed …, 2019 - Elsevier
With the skyrocketing advances of process technology, the increased need to process huge
amount of data, and the pivotal need for power efficiency, the usage of Graphics Processing …

Securing GPU via region-based bounds checking

J Lee, Y Kim, J Cao, E Kim, J Lee, H Kim - Proceedings of the 49th …, 2022 - dl.acm.org
Graphics processing units (GPUs) have become essential general-purpose computing
platforms to accelerate a wide range of workloads, such as deep learning, scientific, and …

FineReg: Fine-grained register file management for augmenting GPU throughput

Y Oh, MK Yoon, WJ Song… - 2018 51st Annual IEEE …, 2018 - ieeexplore.ieee.org
Graphics processing units (GPUs) include a large amount of hardware resources for parallel
thread executions. However, the resources are not fully utilized during runtime, and …

G-Safe: Safe GPU Sharing in Multi-Tenant Environments

M Pavlidakis, G Vasiliadis, S Mavridis… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern GPU applications, such as machine learning (ML) frameworks, can only partially
utilize beefy GPUs, leading to GPU underutilization in cloud environments. Sharing GPUs …

Architectural techniques for improving the power consumption of NoC-based CMPs: A case study of cache and network layer

E Ofori-Attah, W Bhebhe… - Journal of Low Power …, 2017 - mdpi.com
The disparity between memory and CPU has been ameliorated by the introduction of
Network-on-Chip-based Chip-Multiprocessors (NoC-based CMPS). However, power …

SMQoS: Improving utilization and energy efficiency with QoS awareness on GPUs

Q Sun, Y Liu, H Yang, Z Luan… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Meeting the Quality of Service (QoS) requirement under task consolidation on the GPU is
extremely challenging. Previous work mostly relies on static task or resource scheduling and …

Adaptive cooperation of prefetching and warp scheduling on GPUs

Y Oh, K Kim, MK Yoon, JH Park, Y Park… - IEEE Transactions …, 2018 - ieeexplore.ieee.org
This paper proposes a new architecture, called Adaptive PREfetching and Scheduling
(APRES), which improves cache efficiency of GPUs. APRES relies on the observation that …

WASP: Selective data prefetching by monitoring runtime warp progress on GPUs

Y Oh, MK Yoon, JH Park, Y Park… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
This paper proposes a new data prefetching technique for Graphics Processing Units
(GPUs) called Warp Aware Selective Prefetching (WASP). The main idea of WASP is to …

A distributed architecture and design challenges of an astray pilgrim tracking system

MAR Abdeen - 2018 IEEE 16th Intl Conf on Dependable …, 2018 - ieeexplore.ieee.org
In this paper we present a distributed architecture to address the problems of managing,
tracking, and predicting astray person in large crowds. Our case study considers the …