Knightshift: Scaling the energy proportionality wall through server-level heterogeneity

U Gupta, S Hsia, V Saraph, X Wang… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org

Neural personalized recommendation is the cornerstone of a wide collection of cloud
services and products, constituting significant compute demand of cloud infrastructure. Thus …

Cited by 210 · Related articles · All 12 versions

[PDF] mit.edu

Tailbench: a benchmark suite and evaluation methodology for latency-critical applications

H Kasture, D Sanchez - 2016 IEEE International Symposium on …, 2016 - ieeexplore.ieee.org

Latency-critical applications, common in datacenters, must achieve small and predictable
tail (e.g., 95th or 99th percentile) latencies. Their strict performance requirements limit …

Cited by 234 · Related articles · All 6 versions

[PDF] mit.edu

Rubik: Fast analytical power management for latency-critical systems

H Kasture, DB Bartolini, N Beckmann… - Proceedings of the 48th …, 2015 - dl.acm.org

Latency-critical workloads (e.g., web search), common in datacenters, require stable tail (e.g.,
95th percentile) latencies of a few milliseconds. Servers running these workloads are kept …

Cited by 198 · Related articles · All 14 versions

[PDF] upc.edu

Twig: Multi-agent task management for colocated latency-critical cloud services

R Nishtala, V Petrucci, P Carpenter… - … Symposium on High …, 2020 - ieeexplore.ieee.org

Many of the important services running on data centres are latency-critical, time-varying, and
demand strict user satisfaction. Stringent tail-latency targets for colocated services and …

Cited by 96 · Related articles · All 8 versions

[PDF] ieee.org

Metrics for sustainable data centers

VD Reddy, B Setz, GSVRK Rao… - IEEE Transactions …, 2017 - ieeexplore.ieee.org

There are a multitude of metrics available to analyze individual key performance indicators
of data centers. In order to predict growth or set effective goals, it is important to choose the …

Cited by 114 · Related articles · All 3 versions

[PDF] researchgate.net

Octopus-man: QoS-driven task management for heterogeneous multicores in warehouse-scale computers

V Petrucci, MA Laurenzano, J Doherty… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org

Heterogeneous multicore architectures have the potential to improve energy efficiency by
integrating power-efficient wimpy cores with high-performing brawny cores. However, it is an …

Cited by 115 · Related articles · All 9 versions

[PDF] upc.edu

Hipster: Hybrid task manager for latency-critical cloud workloads

R Nishtala, P Carpenter, V Petrucci… - … Symposium on High …, 2017 - ieeexplore.ieee.org

In 2013, US data centers accounted for 2.2% of the country's total electricity consumption, a
figure that is projected to increase rapidly over the next decade. Many important workloads …

Cited by 89 · Related articles · All 5 versions

[PDF] nsf.gov

μDPM: Dynamic power management for the microsecond era

CH Chou, LN Bhuyan, D Wong - 2019 IEEE International …, 2019 - ieeexplore.ieee.org

The complex, distributed nature of data centers has spawned the adoption of distributed,
multi-tiered software architectures, consisting of many inter-connected microservices. These …

Cited by 69 · Related articles · All 3 versions

[PDF] acm.org

Breaking the boundaries in heterogeneous-ISA datacenters

A Barbalace, R Lyerly, C Jelesnianski… - ACM SIGARCH …, 2017 - dl.acm.org

Energy efficiency is one of the most important design considerations in running modern
datacenters. Datacenter operating systems rely on software techniques such as execution …

Cited by 86 · Related articles · All 10 versions

[PDF] nsf.gov

KRISP: Enabling kernel-wise right-sizing for spatial partitioned GPU inference servers

M Chow, A Jahanshahi, D Wong - 2023 IEEE International …, 2023 - ieeexplore.ieee.org

Machine learning (ML) inference workloads present significantly different challenges than
ML training workloads. Typically, inference workloads are shorter running and under-utilize …

Cited by 19 · Related articles · All 5 versions