Deadline-based scheduling for GPU with preemption support
Modern automotive-grade embedded computing platforms feature high-performance
Graphics Processing Units (GPUs) to support the massively parallel processing power …
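As orientation for this entry, the sketch below illustrates the general idea of earliest-deadline-first (EDF) dispatching with preemption on job release. The Job fields and the EDFScheduler class are hypothetical names used only for illustration; this is a host-side, generic sketch and does not reproduce the paper's GPU-specific preemption mechanism.

```python
import heapq
from dataclasses import dataclass, field

# Generic EDF-with-preemption illustration, not the paper's GPU scheduler.
@dataclass(order=True)
class Job:
    deadline: float                              # absolute deadline; the only comparison key
    name: str = field(compare=False)             # identifier, excluded from priority ordering
    remaining_ms: float = field(compare=False)   # remaining execution-time estimate

class EDFScheduler:
    """Earliest-deadline-first dispatcher with preemption: a newly released job
    whose deadline precedes the running job's displaces it; the preempted job
    returns to the ready queue."""

    def __init__(self) -> None:
        self.ready: list[Job] = []      # min-heap keyed by absolute deadline
        self.running: Job | None = None

    def release(self, job: Job) -> None:
        if self.running is not None and job.deadline < self.running.deadline:
            heapq.heappush(self.ready, self.running)   # preempt the current job
            self.running = job
        else:
            heapq.heappush(self.ready, job)

    def dispatch(self) -> Job | None:
        # Select the earliest-deadline ready job when the resource is idle.
        if self.running is None and self.ready:
            self.running = heapq.heappop(self.ready)
        return self.running

    def complete(self) -> None:
        self.running = None             # free the resource for the next dispatch

# Example: a later-released job with an earlier deadline preempts the running one.
sched = EDFScheduler()
sched.release(Job(deadline=100.0, name="rendering", remaining_ms=40.0))
sched.dispatch()                                       # "rendering" starts
sched.release(Job(deadline=30.0, name="object-detection", remaining_ms=10.0))
print(sched.running.name)                              # -> object-detection
```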
Graft: Efficient inference serving for hybrid deep learning with SLO guarantees via DNN re-alignment
Deep neural networks (DNNs) have been widely adopted for various mobile inference tasks,
yet their ever-increasing computational demands are hindering their deployment on …
Flep: Enabling flexible and efficient preemption on GPUs
GPUs are widely adopted in HPC and cloud computing platforms to accelerate general-
purpose workloads. However, modern GPUs do not support flexible preemption, leading to …
The HPC-DAG task model for heterogeneous real-time systems
Z Houssam-Eddine, N Capodieci… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
Recent commercial hardware platforms for embedded real-time systems feature
heterogeneous processing units and computing accelerators on the same System-on-Chip …
Secure and timely GPU execution in cyber-physical systems
Graphics Processing Units (GPUs) are increasingly deployed on Cyber-Physical Systems
(CPSs), frequently used to perform real-time safety-critical functions, such as object …
Adaptive optimization for OpenCL programs on embedded heterogeneous systems
Heterogeneous multi-core architectures consisting of CPUs and GPUs are commonplace in
today's embedded systems. These architectures offer potential for energy efficient computing …
Astraea: towards QoS-aware and resource-efficient multi-stage GPU services
Multi-stage user-facing applications on GPUs are widely used nowadays, and are often
implemented as microservices. Prior research works are not applicable to ensuring QoS …
Miriam: Exploiting elastic kernels for real-time multi-DNN inference on edge GPU
Many applications, such as autonomous driving and augmented reality, require the
concurrent running of multiple deep neural networks (DNNs) that pose different levels of real …
Kalmia: A heterogeneous QoS-aware scheduling framework for DNN tasks on edge servers
Motivated by the popularity of edge intelligence, DNN services have been widely deployed
at the edge, posing significant performance pressure on edge servers. How to improve the …
NURA: A framework for supporting non-uniform resource accesses in GPUs
Multi-application execution in Graphics Processing Units (GPUs), a promising way to utilize
GPU resources, is still challenging. Some pieces of prior work (e.g., spatial multitasking) have …