Efficient scheduling policies for Microsecond-Scale tasks
Datacenter operators today strive to support microsecond-latency applications while also
using their limited CPU resources as efficiently as possible. To achieve this, several recent …
Reducing Datacenter Compute Carbon Footprint by Harnessing the Power of Specialization: Principles, Metrics, Challenges and Opportunities
Computing is an indispensable tool in addressing climate change, but it also contributes to a
significant and steadily increasing carbon footprint, partly due to the exponential growth in …
Syrup: User-defined scheduling across the stack
K Kaffes, JT Humphries, D Mazières… - Proceedings of the ACM …, 2021 - dl.acm.org
Suboptimal scheduling decisions in operating systems, networking stacks, and application
runtimes are often responsible for poor application performance, including higher latency …
Achieving microsecond-scale tail latency efficiently with approximate optimal scheduling
Datacenter applications expect microsecond-scale service times and tightly bound tail
latency, with future workloads expected to be even more demanding. To address this …
BeeBox: Hardening BPF against Transient Execution Attacks
The Berkeley Packet Filter (BPF) has emerged as the de-facto standard for carrying out safe
and performant, user-specified computation(s) in kernel space. However, BPF also …
Making kernel bypass practical for the cloud with Junction
Kernel bypass systems have demonstrated order of magnitude improvements in throughput
and tail latency for network-intensive applications relative to traditional operating systems …
Application-Informed Kernel Synchronization Primitives
Kernel synchronization primitives are the backbone of any OS design. Kernel locks, for
instance, are crucial for both application performance and correctness. However, unlike …
μSwitch: Fast Kernel Context Isolation with Implicit Context Switches
Isolating application components is crucial to limit the exposure of sensitive data and code to
vulnerabilities in the untrusted components. Process-based isolation is the de facto isolation …
RingLeader: Efficiently Offloading Intra-Server Orchestration to NICs
Careful orchestration of requests at a datacenter server is crucial to meet tight tail latency
requirements and ensure high throughput and optimal CPU utilization. Orchestration is multi …
Fast core scheduling with userspace process abstraction
J Lin, Y Chen, S Gao, Y Lu - Proceedings of the ACM SIGOPS 30th …, 2024 - dl.acm.org
We introduce uProcess, a pure userspace process abstraction that enables CPU cores to be
rescheduled among applications at sub-microsecond timescale without trapping into the …