Fast core scheduling with userspace process abstraction
J Lin, Y Chen, S Gao, Y Lu - Proceedings of the ACM SIGOPS 30th …, 2024 - dl.acm.org
We introduce uProcess, a pure userspace process abstraction that enables CPU cores to be
rescheduled among applications at sub-microsecond timescale without trapping into the …
Is Machine Learning Necessary for Cloud Resource Usage Forecasting?
Robust forecasts of future resource usage in cloud computing environments enable high
efficiency in resource management solutions, such as autoscaling and overcommitment …
Lifting the fog of uncertainties: Dynamic resource orchestration for the containerized cloud
The advances in virtualization technologies have sparked a growing transition from virtual
machine (VM)-based to container-based infrastructure for cloud computing. From the …
ComboFunc: Joint Resource Combination and Container Placement for Serverless Function Scaling with Heterogeneous Container
Serverless computing provides developers with a maintenance-free approach to resource
usage, but it also transfers resource management responsibility to the cloud platform …
DLRover-RM: Resource Optimization for Deep Recommendation Models Training in the Cloud
Deep learning recommendation models (DLRM) rely on large embedding tables to manage
categorical sparse features. Expanding such embedding tables can significantly enhance …
Missile: Fine-Grained, Hardware-Level GPU Resource Isolation for Multi-Tenant DNN Inference
Colocating high-priority, latency-sensitive (LS) and low-priority, best-effort (BE) DNN
inference services reduces the total cost of ownership (TCO) of GPU clusters. Limited by …
IOGuard: Software-Based I/O Page Fault Handling with One CPU Core
Y Dong, Z Mi - Proceedings of the 15th Asia-Pacific Symposium on …, 2024 - dl.acm.org
Nowadays, device passthrough I/O virtualization technology has played an essential role in
cloud scenarios like network connection. However, the absence of widespread support for …
APP: Enabling soft real-time execution on densely-populated hybrid memory system
Memory swapping was considered slow and evil, but swapping to Ultra Low-Latency
storage like Optane has become a promising solution to save power and cost, helping …
Do Predictors for Resource Overcommitment Even Predict?
G Christofidi, TD Doudali - Proceedings of the 4th Workshop on Machine …, 2024 - dl.acm.org
Resource overcommitment allows datacenters to improve resource efficiency. In this
approach, the system allocates to the users the amount of resources to be most likely used …
Harpagon: Minimizing DNN Serving Cost via Efficient Dispatching, Scheduling and Splitting
Advances in deep neural networks (DNNs) have significantly contributed to the development
of real-time video processing applications. Efficient scheduling of DNN workloads in cloud …