MELTing point: Mobile Evaluation of Language Transformers

S Laskaridis, K Katevas, L Minto… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have revolutionized the machine learning landscape, gradually making their
way into everyday tasks and equipping our computers with "sparks of intelligence" …

CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads

I Panopoulos, S Venieris, I Venieris - ACM Transactions on Embedded …, 2024 - dl.acm.org
The relentless expansion of deep learning (DL) applications in recent years has prompted a
pivotal shift towards on-device execution, driven by the urgent need for real-time processing …

Efficient Batched Inference in Conditional Neural Networks

S Selvam, A Nagarajan… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Conditional Neural Networks are networks in which the computations performed vary based
on the input. Many neural networks (NNs) of interest (such as autoregressive transformers …

Inference serving with end-to-end latency SLOs over dynamic edge networks

V Nigade, P Bauszat, H Bal, L Wang - Real-Time Systems, 2024 - Springer
While high accuracy is of paramount importance for deep learning (DL) inference, serving
inference requests on time is equally critical but has not been carefully studied especially …

Online Resource Provisioning and Batch Scheduling for AIoT Inference Serving in an XPU Edge Cloud

R Liu, Y Wu, K Zhao, Z Zhou, X Gao… - … on Emerging Topics …, 2024 - ieeexplore.ieee.org
Driven by the accelerated convergence of artificial intelligence (AI) and the Internet of Things
(IoT), the recent years have witnessed the booming of Artificial Intelligence of Things (AIoT) …

VU Research Portal

VV Nigade - research.vu.nl
Deep learning (DL) technology has shown great promise in enhancing many facets of our
lives by improving the accuracy of everyday modern applications, including but not limited to …