IX: a protected dataplane operating system for high throughput and low latency

A Belay, G Prekas, A Klimovic, S Grossman… - … USENIX Symposium on …, 2014 - usenix.org
The conventional wisdom is that aggressive networking requirements, such as high packet
rates for small messages and microsecond-scale tail latency, are best addressed outside the …

High performance packet processing with FlexNIC

A Kaufmann, SI Peter, NK Sharma, T Anderson… - Proceedings of the …, 2016 - dl.acm.org
The recent surge of network I/O performance has put enormous pressure on memory and
software I/O processing subsystems. We argue that the primary reason for high memory and …

What every programmer should know about memory

U Drepper - Red Hat, Inc, 2007 - timothya.com
As CPU cores become both faster and more numerous, the limiting factor for most programs
is now, and will be for some time, memory access. Hardware designers have come up with …

Thin servers with smart pipes: Designing SoC accelerators for memcached

K Lim, D Meisner, AG Saidi, P Ranganathan… - ACM SIGARCH …, 2013 - dl.acm.org
Distributed in-memory key-value stores, such as memcached, are central to the scalability of
modern internet services. Current deployments use commodity servers with high-end …

Decoupled direct memory access: Isolating CPU and IO traffic by leveraging a dual-data-port DRAM

D Lee, L Subramanian… - 2015 International …, 2015 - ieeexplore.ieee.org
Memory channel contention is a critical performance bottleneck in modern systems that have
highly parallelized processing units operating on large data sets. The memory channel is …

Architecting to achieve a billion requests per second throughput on a single key-value store server platform

S Li, H Lim, VW Lee, JH Ahn, A Kalia… - Proceedings of the …, 2015 - dl.acm.org
Distributed in-memory key-value stores (KVSs), such as memcached, have become a critical
data serving layer in modern Internet-oriented datacenter infrastructure. Their performance …

NetCAT: Practical cache attacks from the network

M Kurth, B Gras, D Andriesse… - … IEEE Symposium on …, 2020 - ieeexplore.ieee.org
Increased peripheral performance is causing strain on the memory subsystem of modern
processors. For example, available DRAM throughput can no longer sustain the traffic of a …

Reexamining Direct Cache Access to Optimize I/O Intensive Applications for Multi-hundred-gigabit Networks

A Farshin, A Roozbeh, GQ Maguire Jr… - 2020 USENIX Annual …, 2020 - usenix.org
Memory access is the major bottleneck in realizing multi-hundred-gigabit networks with
commodity hardware, hence it is essential to make good use of cache memory that is a …

Deterministic memory hierarchy and virtualization for modern multi-core embedded systems

T Kloda, M Solieri, R Mancuso… - 2019 IEEE Real …, 2019 - ieeexplore.ieee.org
One of the main predictability bottlenecks of modern multi-core embedded systems is
contention for access to shared memory resources. Partitioning and software-driven …

More than capacity: Performance-oriented evolution of Pangu in Alibaba

Q Li, Q Xiang, Y Wang, H Song, R Wen, W Yao… - … USENIX Conference on …, 2023 - usenix.org
This paper presents how the Pangu storage system continuously evolves with hardware
technologies and the business model to provide high-performance, reliable storage services …