Serverless computing: a survey of opportunities, challenges, and applications

H Shafiei, A Khonsari, P Mousavi - ACM Computing Surveys, 2022 - dl.acm.org
The emerging serverless computing paradigm has attracted attention from both academia
and industry. This paradigm brings benefits such as less operational complexity, a pay-as …

Wisefuse: Workload characterization and DAG transformation for serverless workflows

A Mahgoub, EB Yi, K Shankar, E Minocha… - Proceedings of the …, 2022 - dl.acm.org
We characterize production workloads of serverless DAGs at a major cloud provider. Our
analysis highlights two major factors that limit performance: (a) lack of efficient …

The power of prediction: microservice auto scaling via workload learning

S Luo, H Xu, K Ye, G Xu, L Zhang, G Yang… - Proceedings of the 13th …, 2022 - dl.acm.org
When deploying microservices in production clusters, it is critical to automatically scale
containers to improve cluster utilization and ensure service level agreements (SLA) …

VaBUS: Edge-cloud real-time video analytics via background understanding and subtraction

H Wang, Q Li, H Sun, Z Chen, Y Hao… - IEEE Journal on …, 2022 - ieeexplore.ieee.org
Edge-cloud collaborative video analytics is transforming the way data is being handled,
processed, and transmitted from the ever-growing number of surveillance cameras around …

Optimizing video analytics with declarative model relationships

F Romero, J Hauswald, A Partap, D Kang… - Proceedings of the …, 2022 - dl.acm.org
The availability of vast video collections and the accuracy of ML models has generated
significant interest in video analytics systems. Since naively processing all frames using …

StepConf: SLO-aware dynamic resource configuration for serverless function workflows

Z Wen, Y Wang, F Liu - IEEE INFOCOM 2022-IEEE Conference …, 2022 - ieeexplore.ieee.org
Function-as-a-Service (FaaS) offers a fine-grained resource provision model, enabling
developers to build highly elastic cloud applications. User requests are handled by a series …

Honeycomb: Secure and Efficient GPU Executions via Static Validation

H Mai, J Zhao, H Zheng, Y Zhao, Z Liu, M Gao… - … USENIX Symposium on …, 2023 - usenix.org
Graphics Processing Units (GPUs) unlock emerging use cases like large language models
and autonomous driving. They process a large amount of sensitive data, where security is of …

Scrooge: A cost-effective deep learning inference system

Y Hu, R Ghosh, R Govindan - Proceedings of the ACM Symposium on …, 2021 - dl.acm.org
Advances in deep learning (DL) have prompted the development of cloud-hosted DL-based
media applications that process video and audio streams in real-time. Such applications …

Clover: Toward sustainable AI with carbon-aware machine learning inference service

B Li, S Samsi, V Gadepally, D Tiwari - Proceedings of the International …, 2023 - dl.acm.org
This paper presents a solution to the challenge of mitigating carbon emissions from hosting
large-scale machine learning (ML) inference services. ML inference is critical to modern …

FaST-GShare: Enabling efficient spatio-temporal GPU sharing in serverless computing for deep learning inference

J Gu, Y Zhu, P Wang, M Chadha, M Gerndt - Proceedings of the 52nd …, 2023 - dl.acm.org
Serverless computing (FaaS) has been extensively utilized for deep learning (DL) inference
due to the ease of deployment and pay-per-use benefits. However, existing FaaS platforms …