A survey on optimized implementation of deep learning models on the NVIDIA Jetson platform
S Mittal - Journal of Systems Architecture, 2019 - Elsevier
Abstract Design of hardware accelerators for neural network (NN) applications involves
walking a tight rope amidst the constraints of low-power, high accuracy and throughput …
Retrospective on VLSI value scaling and lithography
ML Rieger - Journal of Micro/Nanolithography, MEMS, and …, 2019 - spiedigitallibrary.org
In recent decades, the rate of shrinking integrated-circuit components has slowed as
challenges accumulate. Yet, in part by virtue of an accelerating rate of cleverness, the end …
A survey of deep learning on CPUs: opportunities and co-optimizations
CPU is a powerful, pervasive, and indispensable platform for running deep learning (DL)
workloads in systems ranging from mobile to extreme-end servers. In this article, we present …
Quarantine: Mitigating transient execution attacks with physical domain isolation
M Hertogh, M Wiesinger, S Österlund… - Proceedings of the 26th …, 2023 - dl.acm.org
Since the Spectre and Meltdown disclosure in 2018, the list of new transient execution
vulnerabilities that abuse the shared nature of microarchitectural resources on CPU cores …
Energy-efficient multicore scheduling for hard real-time systems: A survey
As real-time embedded systems are evolving in scale and complexity, the demand for a
higher performance at a minimum energy consumption has become a necessity …
ReGraph: Scaling graph processing on HBM-enabled FPGAs with heterogeneous pipelines
The use of FPGAs for efficient graph processing has attracted significant interest. Recent
memory subsystem upgrades including the introduction of HBM in FPGAs promise to further …
Network coding in heterogeneous multicore IoT nodes with DAG scheduling of parallel matrix block operations
Random linear network coding (RLNC) has the potential to improve the performance of
current and future Internet of Things (IoT) communication systems, but is computationally …
Towards energy-efficient heterogeneous multicore architectures for edge computing
A Gamatie, G Devic, G Sassatelli, S Bernabovi… - IEEE …, 2019 - ieeexplore.ieee.org
In recent years, the edge computing paradigm has been attracting much attention in the
Internet-of-Things domain. It aims to push the frontier of computing applications, data, and …
It's time to think about an operating system for near data processing architectures
Near Data Processing, in form of processing in-memory (PIM) and in-storage computing
(ISC) was a very active area of research in computer architectures in the '90s [1, 25]. About 3 …
Collaborative heterogeneity-aware OS scheduler for asymmetric multicore processors
Asymmetric multicore processors (AMP) offer multiple types of cores under the same
programming interface. Extracting the full potential of AMPs requires intelligent scheduling …