Cloud computing landscape and research challenges regarding trust and reputation
Cloud Computing is an emerging computing paradigm. It shares massively scalable, elastic
resources (eg, data, calculations, and services) transparently among the users over a …
resources (eg, data, calculations, and services) transparently among the users over a …
A modern primer on processing in memory
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …
design choice goes directly against at least three key trends in computing that cause …
Processing data where it makes sense: Enabling in-memory computation
Today's systems are overwhelmingly designed to move data to computation. This design
choice goes directly against at least three key trends in systems that cause performance …
choice goes directly against at least three key trends in systems that cause performance …
Transparent offloading and mapping (TOM) enabling programmer-transparent near-data processing in GPU systems
Main memory bandwidth is a critical bottleneck for modern GPU systems due to limited off-
chip pin bandwidth. 3D-stacked memory architectures provide a promising opportunity to …
chip pin bandwidth. 3D-stacked memory architectures provide a promising opportunity to …
Scheduling techniques for GPU architectures with processing-in-memory capabilities
Processing data in or near memory (PIM), as opposed to in conventional computational units
in a processor, can greatly alleviate the performance and energy penalties of data transfers …
in a processor, can greatly alleviate the performance and energy penalties of data transfers …
Figaro: Improving system performance via fine-grained in-dram data relocation and caching
Main memory, composed of DRAM, is a performance bottleneck for many applications, due
to the high DRAM access latency. In-DRAM caches work to mitigate this latency by …
to the high DRAM access latency. In-DRAM caches work to mitigate this latency by …
Neither more nor less: Optimizing thread-level parallelism for GPGPUs
General-purpose graphics processing units (GPG-PUs) are at their best in accelerating
computation by exploiting abundant thread-level parallelism (TLP) offered by many classes …
computation by exploiting abundant thread-level parallelism (TLP) offered by many classes …
[PDF][PDF] Research problems and opportunities in memory systems
O Mutlu, L Subramanian - Supercomputing frontiers and …, 2014 - superfri.susu.ru
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …
computing systems. Recent system design, application, and technology trends that require …
Mosaic: a GPU memory manager with application-transparent support for multiple page sizes
R Ausavarungnirun, J Landgraf, V Miller… - Proceedings of the 50th …, 2017 - dl.acm.org
Contemporary discrete GPUs support rich memory management features such as virtual
memory and demand paging. These features simplify GPU programming by providing a …
memory and demand paging. These features simplify GPU programming by providing a …
Improving GPGPU resource utilization through alternative thread block scheduling
High performance in GPGPU workloads is obtained by maximizing parallelism and fully
utilizing the available resources. The thousands of threads are assigned to each core in …
utilizing the available resources. The thousands of threads are assigned to each core in …