Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving

Y Dai, R Pan, A Iyer, K Li, R Netravali - arXiv preprint arXiv:2312.05385, 2023 - arxiv.org
Machine learning (ML) inference platforms are tasked with balancing two competing goals:
ensuring high throughput given many requests, and delivering low-latency responses to …

Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs

A Chen, F Xu, L Han, Y Dong, L Chen, Z Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
GPUs have become the de facto hardware devices to accelerate Deep Neural Network
(DNN) inference in deep learning (DL) frameworks. However, the conventional sequential …

Cascade: A Platform for Delay-Sensitive Edge Intelligence

W Song, T Garrett, Y Yang, M Liu, E Tremel… - arXiv preprint arXiv …, 2023 - cs.cornell.edu
Interest in intelligent edge computing is surging, driven by improving connectivity and
advances in hardware. This is creating a need: today's cloud platforms optimize for high …