Splitwise: Efficient generative LLM inference using phase splitting
Generative large language model (LLM) applications are growing rapidly, leading to large-
scale deployments of expensive and power-hungry GPUs. Our characterization of LLM …
A survey of techniques for architecting and managing asymmetric multicore processors
S Mittal - ACM Computing Surveys (CSUR), 2016 - dl.acm.org
To meet the needs of a diverse range of workloads, asymmetric multicore processors
(AMPs) have been proposed, which feature cores of different microarchitecture or ISAs …
Scheduling heterogeneous multi-cores through performance impact estimation (PIE)
Single-ISA heterogeneous multi-core processors are typically composed of small (e.g.,
in-order) power-efficient cores and big (e.g., out-of-order) high-performance cores. The …
Self-aware computing systems
This book is the first ever to focus on the emerging field of self-aware computing from an
engineering perspective. It first comprehensively introduces fundamentals for self …
Cooperative-competitive task allocation in edge computing for delay-sensitive social sensing
With the ever-increasing data processing capabilities of edge computing devices and the
growing acceptance of running social sensing applications on such cloud-edge systems …
Jigsaw: Scalable software-defined caches
N Beckmann, D Sanchez - Proceedings of the 22nd …, 2013 - ieeexplore.ieee.org
Shared last-level caches, widely used in chip-multi-processors (CMPs), face two
fundamental limitations. First, the latency and energy of shared caches degrade as the …
Fairness-aware scheduling on single-ISA heterogeneous multi-cores
Single-ISA heterogeneous multi-cores consisting of small (e.g., in-order) and big (e.g.,
out-of-order) cores dramatically improve energy- and power-efficiency by scheduling workloads on …
Inter-cluster thread-to-core mapping and DVFS on heterogeneous multi-cores
Heterogeneous multi-core platforms that contain different types of cores, organized as
clusters, are emerging, e.g., ARM's big.LITTLE architecture. These platforms often need to …
Price theory based power management for heterogeneous multi-cores
T Somu Muthukaruppan, A Pathania, T Mitra - ACM SIGPLAN Notices, 2014 - dl.acm.org
Heterogeneous multi-cores that integrate cores with different power performance
characteristics are promising alternatives to homogeneous systems in energy- and thermally …
Energy-efficient virtual-machine mapping algorithm (EViMA) for workflow tasks with deadlines in a cloud environment
Processing large scientific applications generates a huge amount of data, which makes
running experiments in the cloud computing environment very expensive and energy …