Effisha: A software framework for enabling efficient preemptive scheduling of GPU

N Capodieci, R Cavicchioli, M Bertogna… - 2018 IEEE Real …, 2018 - ieeexplore.ieee.org

Modern automotive-grade embedded computing platforms feature high-performance
Graphics Processing Units (GPUs) to support the massively parallel processing power …

Cited by 95 Related articles All 9 versions

Graft: Efficient inference serving for hybrid deep learning with SLO guarantees via DNN re-alignment

J Wu, L Wang, Q Jin, F Liu - IEEE Transactions on Parallel and …, 2023 - ieeexplore.ieee.org

Deep neural networks (DNNs) have been widely adopted for various mobile inference tasks,
yet their ever-increasing computational demands are hindering their deployment on …

Cited by 11 Related articles All 6 versions

Flep: Enabling flexible and efficient preemption on GPUs

B Wu, X Liu, X Zhou, C Jiang - ACM SIGPLAN Notices, 2017 - dl.acm.org

GPUs are widely adopted in HPC and cloud computing platforms to accelerate general-
purpose workloads. However, modern GPUs do not support flexible preemption, leading to …

Cited by 90 Related articles All 4 versions

The HPC-DAG task model for heterogeneous real-time systems

Z Houssam-Eddine, N Capodieci… - IEEE Transactions …, 2020 - ieeexplore.ieee.org

Recent commercial hardware platforms for embedded real-time systems feature
heterogeneous processing units and computing accelerators on the same System-on-Chip …

Cited by 49 Related articles All 6 versions

Secure and timely GPU execution in cyber-physical systems

J Wang, Y Wang, N Zhang - Proceedings of the 2023 ACM SIGSAC …, 2023 - dl.acm.org

Graphics Processing Units (GPUs) are increasingly deployed on Cyber-physical Systems
(CPSs), frequently used to perform real-time safety-critical functions, such as object …

Cited by 7 Related articles All 3 versions

Adaptive optimization for OpenCL programs on embedded heterogeneous systems

B Taylor, VS Marco, Z Wang - ACM SIGPLAN Notices, 2017 - dl.acm.org

Heterogeneous multi-core architectures consisting of CPUs and GPUs are commonplace in
today's embedded systems. These architectures offer potential for energy efficient computing …

Cited by 71 Related articles All 8 versions

Astraea: towards QoS-aware and resource-efficient multi-stage GPU services

W Zhang, Q Chen, K Fu, N Zheng, Z Huang… - Proceedings of the 27th …, 2022 - dl.acm.org

Multi-stage user-facing applications on GPUs are widely used nowadays, and are often
implemented to be microservices. Prior research works are not applicable to ensuring QoS …

Cited by 23 Related articles All 3 versions

Miriam: Exploiting elastic kernels for real-time multi-DNN inference on edge GPU

Z Zhao, N Ling, N Guan, G Xing - … of the 21st ACM Conference on …, 2023 - dl.acm.org

Many applications, such as autonomous driving and augmented reality, require the
concurrent running of multiple deep neural networks (DNNs) that pose different levels of real …

Cited by 11 Related articles All 4 versions

Kalmia: A heterogeneous QoS-aware scheduling framework for DNN tasks on edge servers

Z Fu, J Ren, D Zhang, Y Zhou… - IEEE INFOCOM 2022 …, 2022 - ieeexplore.ieee.org

Motivated by the popularity of edge intelligence, DNN services have been widely deployed
at the edge, posing significant performance pressure on edge servers. How to improve the …

Cited by 13 Related articles All 2 versions

NURA: A framework for supporting non-uniform resource accesses in GPUs

S Darabi, N Mahani, H Baxishi… - Proceedings of the …, 2022 - dl.acm.org

Multi-application execution in Graphics Processing Units (GPUs), a promising way to utilize
GPU resources, is still challenging. Some pieces of prior work (e.g., spatial multitasking) have …

Cited by 14 Related articles All 5 versions