Paving the way for NFV acceleration: A taxonomy, survey and future directions

X Fei, F Liu, Q Zhang, H Jin, H Hu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
As a recent innovation, network functions virtualization (NFV)—with its core concept of
replacing hardware middleboxes with software network functions (NFs) implemented in …

Gslice: controlled spatial sharing of GPUs for a scalable inference platform

A Dhakal, SG Kulkarni, KK Ramakrishnan - Proceedings of the 11th ACM …, 2020 - dl.acm.org
The increasing demand for cloud-based inference services requires the use of Graphics
Processing Units (GPUs). It is highly desirable to utilize GPUs efficiently by multiplexing different …

E3: Energy-Efficient microservices on SmartNIC-Accelerated servers

M Liu, S Peter, A Krishnamurthy… - 2019 USENIX Annual …, 2019 - usenix.org
We investigate the use of SmartNIC-accelerated servers to execute microservice-based
applications in the data center. By offloading suitable microservices to the SmartNIC's low …

Transparent GPU sharing in container clouds for deep learning workloads

B Wu, Z Zhang, Z Bai, X Liu, X Jin - 20th USENIX Symposium on …, 2023 - usenix.org
Containers are widely used for resource management in datacenters. A common practice to
support deep learning (DL) training in container clouds is to statically bind GPUs to …

Heimdall: mobile GPU coordination platform for augmented reality applications

J Yi, Y Lee - Proceedings of the 26th Annual International …, 2020 - dl.acm.org
We present Heimdall, a mobile GPU coordination platform for emerging Augmented Reality
(AR) applications. Future AR apps impose an unexplored, challenging workload: i) concurrent …

NICA: An infrastructure for inline acceleration of network applications

H Eran, L Zeno, M Tork, G Malka… - 2019 USENIX Annual …, 2019 - usenix.org
With rising network rates, cloud vendors increasingly deploy FPGA-based SmartNICs
(F-NICs), leveraging their inline processing capabilities to offload hypervisor networking …

Salus: Fine-grained GPU sharing primitives for deep learning applications

P Yu, M Chowdhury - arXiv preprint arXiv:1902.04610, 2019 - arxiv.org
GPU computing is becoming increasingly popular with the proliferation of deep
learning (DL) applications. However, unlike traditional resources such as CPU or the …

PacketMill: toward per-Core 100-Gbps networking

A Farshin, T Barbette, A Roozbeh… - Proceedings of the 26th …, 2021 - dl.acm.org
We present PacketMill, a system for optimizing software packet processing, which (i)
introduces a new model to efficiently manage packet metadata and (ii) employs code …

Fine-grained GPU sharing primitives for deep learning applications

P Yu, M Chowdhury - Proceedings of Machine Learning and …, 2020 - proceedings.mlsys.org
Unlike traditional resources such as CPU or the network, modern GPUs do not natively
support fine-grained sharing primitives. Consequently, implementing common policies such …

P4SC: Towards high-performance service function chain implementation on the P4-capable device

X Chen, D Zhang, X Wang, K Zhu… - 2019 IFIP/IEEE …, 2019 - ieeexplore.ieee.org
Most Service Function Chains (SFCs) in Network Function Virtualization (NFV) are realized
in software or offloaded to the network interface card (NIC) and FPGA. However, the …