Performance interference of virtual machines: A survey

W Lin, C Xiong, W Wu, F Shi, K Li, M Xu - ACM Computing Surveys, 2023 - dl.acm.org
The rapid development of cloud computing with virtualization technology has benefited both
academia and industry. For any cloud data center at scale, one of the primary challenges is …

INFaaS: Automated model-less inference serving

F Romero, Q Li, NJ Yadwadkar… - 2021 USENIX Annual …, 2021 - usenix.org
Despite existing work in machine learning inference serving, ease-of-use and cost efficiency
remain challenges at large scales. Developers must manually search through thousands of …

Twig: Multi-agent task management for colocated latency-critical cloud services

R Nishtala, V Petrucci, P Carpenter… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Many of the important services running on data centres are latency-critical, time-varying, and
demand strict user satisfaction. Stringent tail-latency targets for colocated services and …

Warehouse-scale video acceleration: co-design and deployment in the wild

P Ranganathan, D Stodolsky, J Calow… - Proceedings of the 26th …, 2021 - dl.acm.org
Video sharing (e.g., YouTube, Vimeo, Facebook, TikTok) accounts for the majority of internet
traffic, and video processing is also foundational to several other key workloads (video …

Interference-aware scheduling for inference serving

D Mendoza, F Romero, Q Li, NJ Yadwadkar… - Proceedings of the 1st …, 2021 - dl.acm.org
Machine learning inference applications have proliferated through diverse domains such as
healthcare, security, and analytics. Recent work has proposed inference serving systems for …

CuttleSys: Data-driven resource management for interactive services on reconfigurable multicores

N Kulkarni, G Gonzalez-Pumariega… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Multi-tenancy for latency-critical applications leads to resource interference and
unpredictable performance. Core reconfiguration opens up more opportunities for …

INFaaS: A model-less and managed inference serving system

F Romero, Q Li, NJ Yadwadkar, C Kozyrakis - arXiv preprint arXiv …, 2019 - arxiv.org
Despite existing work in machine learning inference serving, ease-of-use and cost efficiency
remain challenges at large scales. Developers must manually search through thousands of …

Workload consolidation in Alibaba clusters: the good, the bad, and the ugly

Y Zhang, Y Yu, W Wang, Q Chen, J Wu… - Proceedings of the 13th …, 2022 - dl.acm.org
Web companies typically run latency-critical long-running services and resource-intensive,
throughput-hungry batch jobs in a shared cluster for improved utilization and reduced cost …

Adaptive performance modeling of data-intensive workloads for resource provisioning in virtualized environment

HM Makrani, H Sayadi, N Nazari… - ACM Transactions on …, 2021 - dl.acm.org
The processing of data-intensive workloads is a challenging and time-consuming task that
often requires massive infrastructure to ensure fast data analysis. The cloud platform is the …

RESTRAIN: A dynamic and cost-efficient resource management scheme for addressing performance interference in NFV-based systems

VR Chintapalli, M Adeppady, BR Tamma - Journal of Network and …, 2022 - Elsevier
Network Functions Virtualization (NFV) replaces conventional middleboxes with their
software counterparts, known as Virtual Network Functions (VNFs), which run on general …