A survey of methods for analyzing and improving GPU energy efficiency
Recent years have witnessed phenomenal growth in the computational capabilities and
applications of GPUs. However, this trend has also led to a dramatic increase in their power …
applications of GPUs. However, this trend has also led to a dramatic increase in their power …
Cloud computing landscape and research challenges regarding trust and reputation
Cloud Computing is an emerging computing paradigm. It shares massively scalable, elastic
resources (eg, data, calculations, and services) transparently among the users over a …
resources (eg, data, calculations, and services) transparently among the users over a …
Deeprecsys: A system for optimizing end-to-end at-scale neural recommendation inference
Neural personalized recommendation is the cornerstone of a wide collection of cloud
services and products, constituting significant compute demand of cloud infrastructure. Thus …
services and products, constituting significant compute demand of cloud infrastructure. Thus …
Scheduling techniques for GPU architectures with processing-in-memory capabilities
Processing data in or near memory (PIM), as opposed to in conventional computational units
in a processor, can greatly alleviate the performance and energy penalties of data transfers …
in a processor, can greatly alleviate the performance and energy penalties of data transfers …
Prophet: Precise qos prediction on non-preemptive accelerators to improve utilization in warehouse-scale computers
Guaranteeing Quality-of-Service (QoS) of latency-sensitive applications while improving
server utilization through application co-location is important yet challenging in modern …
server utilization through application co-location is important yet challenging in modern …
Adaptive cache management for energy-efficient GPU computing
With the SIMT execution model, GPUs can hide memory latency through massive
multithreading for many applications that have regular memory access patterns. To support …
multithreading for many applications that have regular memory access patterns. To support …
Coordinated static and dynamic cache bypassing for GPUs
The massive parallel architecture enables graphics processing units (GPUs) to boost
performance for a wide range of applications. Initially, GPUs only employ scratchpad …
performance for a wide range of applications. Initially, GPUs only employ scratchpad …
Flexible software profiling of gpu architectures
M Stephenson, SK Sastry Hari, Y Lee… - Proceedings of the …, 2015 - dl.acm.org
To aid application characterization and architecture design space exploration, researchers
and engineers have developed a wide range of tools for CPUs, including simulators …
and engineers have developed a wide range of tools for CPUs, including simulators …
Anatomy of gpu memory system for multi-application execution
As GPUs make headway in the computing landscape spanning mobile platforms,
supercomputers, cloud and virtual desktop platforms, supporting concurrent execution of …
supercomputers, cloud and virtual desktop platforms, supporting concurrent execution of …
Priority-based cache allocation in throughput processors
GPUs employ massive multithreading and fast context switching to provide high throughput
and hide memory latency. Multithreading can Increase contention for various system …
and hide memory latency. Multithreading can Increase contention for various system …