Datacenter RPCs can be general and fast

A Kalia, M Kaminsky, D Andersen - 16th USENIX Symposium on …, 2019 - usenix.org
It is commonly believed that datacenter networking software must sacrifice generality to
attain high performance. The popularity of specialized distributed systems designed …

Offloading distributed applications onto SmartNICs using iPipe

M Liu, T Cui, H Schuh, A Krishnamurthy… - Proceedings of the …, 2019 - dl.acm.org
Emerging Multicore SoC SmartNICs, enclosing rich computing resources (e.g., a multicore
processor, onboard DRAM, accelerators, programmable DMA engines), hold the potential to …

RDMA over commodity Ethernet at scale

C Guo, H Wu, Z Deng, G Soni, J Ye, J Padhye… - Proceedings of the …, 2016 - dl.acm.org
Over the past one and half years, we have been using RDMA over commodity Ethernet
(RoCEv2) to support some of Microsoft's highly-reliable, latency-sensitive services. This …

The Demikernel datapath OS architecture for microsecond-scale datacenter systems

I Zhang, A Raybuck, P Patel, K Olynyk… - Proceedings of the …, 2021 - dl.acm.org
Datacenter systems and I/O devices now run at single-digit microsecond latencies, requiring
ns-scale operating systems. Traditional kernel-based operating systems impose an …

FaSST: Fast, Scalable and Simple Distributed Transactions with Two-Sided (RDMA) Datagram RPCs

A Kalia, M Kaminsky, DG Andersen - 12th USENIX Symposium on …, 2016 - usenix.org
FaSST is an RDMA-based system that provides distributed in-memory transactions with
serializability and durability. Existing RDMA-based transaction processing systems use one …

NetChain: Scale-Free Sub-RTT coordination

X Jin, X Li, H Zhang, N Foster, J Lee, R Soulé… - … USENIX Symposium on …, 2018 - usenix.org
Coordination services are a fundamental building block of modern cloud systems, providing
critical functionalities like configuration management and distributed locking. The major …

Design guidelines for high performance RDMA systems

A Kalia, M Kaminsky, DG Andersen - 2016 USENIX Annual Technical …, 2016 - usenix.org
Modern RDMA hardware offers the potential for exceptional performance, but design
choices including which RDMA operations to use and how to use them significantly affect …

ZygOS: Achieving low tail latency for microsecond-scale networked tasks

G Prekas, M Kogias, E Bugnion - Proceedings of the 26th Symposium on …, 2017 - dl.acm.org
This paper focuses on the efficient scheduling on multicore systems of very fine-grain
networked tasks, which are the typical building block of online data-intensive applications …

Endurable transient inconsistency in Byte-Addressable persistent B+-Tree

D Hwang, WH Kim, Y Won, B Nam - 16th USENIX Conference on File …, 2018 - usenix.org
With the emergence of byte-addressable persistent memory (PM), a cache line, instead of a
page, is expected to be the unit of data transfer between volatile and nonvolatile devices, but …

Sherman: A write-optimized distributed B+ tree index on disaggregated memory

Q Wang, Y Lu, J Shu - Proceedings of the 2022 international conference …, 2022 - dl.acm.org
Memory disaggregation architecture physically separates CPU and memory into
independent components, which are connected via high-speed RDMA networks, greatly …