A survey on accelerating technologies for fast network packet processing in Linux environments
The path a packet takes when handled by the Linux Kernel has been well established for a
long time. Its overhead/bottleneck issues are also known. Nonetheless, complexity has …
High-throughput and Flexible Host Networking for Accelerated Computing
Modern network hardware is able to meet the stringent bandwidth demands of applications
like GPU-accelerated AI. However, existing host network stacks offer a hard tradeoff …
Unleashing SmartNIC packet processing performance in P4
SmartNICs are on the rise as a packet processing platform, with the trend towards a uniform
P4 programming model. However, unleashing SmartNIC packet processing performance in …
Towards μs tail latency and terabit ethernet: disaggregating the host network stack
Dedicated, tightly integrated, and static packet processing pipelines in today's most widely
deployed network stacks preclude them from fully exploiting capabilities of modern …
Beehive: A Flexible Network Stack for Direct-Attached Accelerators
Direct-attached accelerators, where application accelerators are directly connected to the
datacenter network via a hardware network stack, offer substantial benefits in terms of …
RingLeader: Efficiently Offloading Intra-Server Orchestration to NICs
Careful orchestration of requests at a datacenter server is crucial to meet tight tail latency
requirements and ensure high throughput and optimal CPU utilization. Orchestration is multi …
Flagger: Cooperative acceleration for large-scale cross-silo federated learning aggregation
Cross-silo federated learning (FL) leverages homomorphic encryption (HE) to obscure the
model updates from the clients. However, HE poses the challenges of complex …
LogNIC: A High-Level Performance Model for SmartNICs
SmartNICs have become an indispensable communication fabric and computing substrate
in today's data centers and enterprise clusters, providing in-network computing capabilities …
STYX: Exploiting SmartNIC Capability to Reduce Datacenter Memory Tax
Memory optimization kernel features, such as memory deduplication, are designed to
improve the overall efficiency of systems like datacenter servers, and they have proven to be …
Deploying user-space TCP at cloud scale with LUNA
TCP remains the workhorse protocol for many modern large-scale data centers.
However, the increasingly demanding performance expectations—led by advancement in …