Real-time multiple object visual tracking for embedded GPU systems
M Fernández-Sanjurjo, M Mucientes… - IEEE Internet of Things …, 2021 - ieeexplore.ieee.org
Real-time visual object tracking provides every object of interest with a unique identity and a
trajectory across video frames. This is a fundamental task of many video analytics …
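As a minimal, hypothetical illustration of the task described in this abstract (not the paper's tracker), the sketch below keeps a per-object identity and trajectory by greedily matching each frame's detections to existing tracks via IoU. All names (`Track`, `iou`, `update_tracks`) and the threshold value are illustrative assumptions.

```python
# Minimal sketch: identities and trajectories maintained by greedy IoU matching.
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

@dataclass
class Track:
    track_id: int
    trajectory: List[Box] = field(default_factory=list)  # one box per frame

def iou(a: Box, b: Box) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def update_tracks(tracks: List[Track], detections: List[Box],
                  next_id: int, iou_thr: float = 0.3) -> int:
    """Extend matched tracks with new detections; start new identities otherwise."""
    unmatched = list(detections)
    for track in tracks:
        if not track.trajectory or not unmatched:
            continue
        best = max(unmatched, key=lambda d: iou(track.trajectory[-1], d))
        if iou(track.trajectory[-1], best) >= iou_thr:
            track.trajectory.append(best)
            unmatched.remove(best)
    for det in unmatched:  # unmatched detections become new identities
        tracks.append(Track(track_id=next_id, trajectory=[det]))
        next_id += 1
    return next_id
```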
SLAPP: Subgraph-level attention-based performance prediction for deep learning models
The intricacy of the Deep Learning (DL) landscape, brimming with a variety of models,
applications, and platforms, poses considerable challenges for the optimal design …
AI-driven performance modeling for AI inference workloads
Deep Learning (DL) is moving towards deploying workloads not only in cloud datacenters,
but also on local devices. Although these are mostly limited to inference tasks, it still …
Performance modeling of computer vision-based CNN on edge GPUs
Convolutional Neural Networks (CNNs) are currently widely used in various fields,
particularly for computer vision applications. Edge platforms have drawn tremendous …
Blackthorn: latency estimation framework for CNNs on embedded Nvidia platforms
With more powerful yet efficient embedded devices and accelerators being available for
Deep Neural Networks (DNN), machine learning is becoming an integral part of edge …
Automatic generation of fast and accurate performance models for deep neural network accelerators
Implementing Deep Neural Networks (DNNs) on resource-constrained edge devices is a
challenging task that requires tailored hardware accelerator architectures and a clear …
Flexi-BOPI: Flexible Granularity Pipeline Inference with Bayesian Optimization for Deep Learning Models on HMPSoC
To achieve high-throughput deep learning (DL) model inference on heterogeneous
multiprocessor systems-on-chip (HMPSoC) platforms, the use of pipelining for the …
Performance Prediction for Deep Learning Models with Pipeline Inference Strategy
For heterogeneous multiprocessor system-on-chips (HMPSoCs), a reasonable pipeline
design can significantly improve the inference performance of deep learning (DL) models …
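To illustrate why pipeline partitioning matters for inference performance, the sketch below (not Flexi-BOPI's or this paper's predictor) shows the usual first-order estimates: single-frame latency is roughly the sum of stage latencies, while steady-state throughput is bounded by the slowest stage. The stage times are made-up numbers, not measurements.

```python
# Minimal sketch: first-order latency/throughput estimates for a pipelined model.
def pipeline_estimates(stage_latencies_ms):
    latency_ms = sum(stage_latencies_ms)      # one frame traverses every stage
    bottleneck_ms = max(stage_latencies_ms)   # slowest stage paces the pipeline
    throughput_fps = 1000.0 / bottleneck_ms
    return latency_ms, throughput_fps

# Example: a hypothetical 3-stage split (e.g., big CPU cluster / little cluster / GPU).
lat, fps = pipeline_estimates([12.0, 8.0, 10.0])
print(f"latency ~ {lat:.0f} ms, throughput ~ {fps:.1f} FPS")
```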
ANNETTE: Accurate neural network execution time estimation with stacked models
M Wess, M Ivanov, C Unger, A Nookala, A Wendt… - IEEE …, 2020 - ieeexplore.ieee.org
With new accelerator hardware for Deep Neural Networks (DNNs), the computing power for
Artificial Intelligence (AI) applications has increased rapidly. However, as DNN algorithms …
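For context on what layer-wise execution-time estimation looks like in general (this is not ANNETTE's stacked-model method), the sketch below predicts per-layer latency from simple features with a linear cost model and sums over the network. The coefficients and layer descriptions are hypothetical placeholders, not values from any paper.

```python
# Minimal sketch: per-layer latency model summed over a network.
def conv_features(layer):
    # MACs and output size as rough predictors of conv-layer latency
    macs = (layer["out_h"] * layer["out_w"] * layer["out_c"]
            * layer["in_c"] * layer["k"] * layer["k"])
    mem = layer["out_h"] * layer["out_w"] * layer["out_c"]
    return macs, mem

def predict_layer_latency_ms(layer, coef_macs=1.2e-8, coef_mem=4.0e-7, bias=0.05):
    macs, mem = conv_features(layer)
    return coef_macs * macs + coef_mem * mem + bias

def predict_network_latency_ms(layers):
    return sum(predict_layer_latency_ms(l) for l in layers)

# Example with two hypothetical conv layers
layers = [
    {"out_h": 112, "out_w": 112, "out_c": 32, "in_c": 3, "k": 3},
    {"out_h": 56, "out_w": 56, "out_c": 64, "in_c": 32, "k": 3},
]
print(f"estimated latency ~ {predict_network_latency_ms(layers):.2f} ms")
```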
Accurate estimation of the CNN inference cost for TinyML devices
T Garbay, K Hachicha, P Dobias… - 2022 IEEE 35th …, 2022 - ieeexplore.ieee.org
Our society will be deeply impacted by neural network inference on embedded devices.
Many of them are based on microcontroller units (MCUs), which are extremely …