A survey on accelerating technologies for fast network packet processing in Linux environments

E Freitas, AT de Oliveira Filho, PRX do Carmo… - Computer …, 2022 - Elsevier
The path a packet takes when handled by the Linux Kernel has been well established for a
long time. Its overhead/bottleneck issues are also known. Nonetheless, complexity has …

High-throughput and Flexible Host Networking for Accelerated Computing

A Skiadopoulos, Z Xie, M Zhao, Q Cai… - … USENIX Symposium on …, 2024 - usenix.org
Modern network hardware is able to meet the stringent bandwidth demands of applications
like GPU-accelerated AI. However, existing host network stacks offer a hard tradeoff …

Unleashing SmartNIC packet processing performance in P4

J Xing, Y Qiu, KF Hsu, S Sui, K Manaa… - Proceedings of the …, 2023 - dl.acm.org
SmartNICs are on the rise as a packet processing platform, with the trend towards a uniform
P4 programming model. However, unleashing SmartNIC packet processing performance in …

Towards μs tail latency and terabit ethernet: disaggregating the host network stack

Q Cai, M Vuppalapati, J Hwang, C Kozyrakis… - Proceedings of the …, 2022 - dl.acm.org
Dedicated, tightly integrated, and static packet processing pipelines in today's most widely
deployed network stacks preclude them from fully exploiting capabilities of modern …

Beehive: A Flexible Network Stack for Direct-Attached Accelerators

K Lim, M Giordano, T Stavrinos, I Zhang… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Direct-attached accelerators, where application accelerators are directly connected to the
datacenter network via a hardware network stack, offer substantial benefits in terms of …

RingLeader: Efficiently Offloading Intra-Server Orchestration to NICs

J Lin, A Cardoza, T Khan, Y Ro, BE Stephens… - … USENIX Symposium on …, 2023 - usenix.org
Careful orchestration of requests at a datacenter server is crucial to meet tight tail latency
requirements and ensure high throughput and optimal CPU utilization. Orchestration is multi …

Flagger: Cooperative acceleration for large-scale cross-silo federated learning aggregation

X Pan, Y An, S Liang, B Mao, M Zhang… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Cross-silo federated learning (FL) leverages homomorphic encryption (HE) to obscure the
model updates from the clients. However, HE poses the challenges of complex …

LogNIC: A High-Level Performance Model for SmartNICs

Z Guo, J Lin, Y Bai, D Kim, M Swift, A Akella… - Proceedings of the 56th …, 2023 - dl.acm.org
SmartNICs have become an indispensable communication fabric and computing substrate
in today's data centers and enterprise clusters, providing in-network computing capabilities …

STYX: Exploiting SmartNIC Capability to Reduce Datacenter Memory Tax

H Ji, M Mansi, Y Sun, Y Yuan, J Huang… - 2023 USENIX Annual …, 2023 - usenix.org
Memory optimization kernel features, such as memory deduplication, are designed to
improve the overall efficiency of systems like datacenter servers, and they have proven to be …

Deploying user-space TCP at cloud scale with LUNA

L Zhu, Y Shen, E Xu, B Shi, T Fu, S Ma… - 2023 USENIX Annual …, 2023 - usenix.org
The TCP remains the workhorse protocol for many modern large-scale data centers.
However, the increasingly demanding performance expectations—led by advancement in …