Architectural support for task dependence management with flexible software scheduling

E Castillo, L Alvarez, M Moreto, M Casas… - … Symposium on High …, 2018 - ieeexplore.ieee.org
The growing complexity of multi-core architectures has motivated a wide range of software
mechanisms to improve the orchestration of parallel executions. Task parallelism has …

ATM: approximate task memoization in the runtime system

I Brumar, M Casas, M Moreto… - 2017 IEEE …, 2017 - ieeexplore.ieee.org
Redundant computations appear during the execution of real programs. Multiple factors
contribute to these unnecessary computations, such as repetitive inputs and patterns, calling …

Runtime-guided management of stacked DRAM memories in task parallel programs

L Alvarez, M Casas, J Labarta, E Ayguade… - Proceedings of the …, 2018 - dl.acm.org
Stacked DRAM memories have become a reality in High-Performance Computing (HPC)
architectures. These memories provide much higher bandwidth while consuming less power …

TD-NUCA: runtime driven management of NUCA caches in task dataflow programming models

P Caheny, L Alvarez, M Casas… - … Conference for High …, 2022 - ieeexplore.ieee.org
In high performance processors, the design of on-chip memory hierarchies is crucial for
performance and energy efficiency. Current processors rely on large shared Non-Uniform …

RADAR: Runtime-assisted dead region management for last-level caches

M Manivannan, V Papaefstathiou… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Last-level caches (LLCs) bridge the processor/memory speed gap and reduce energy
consumed per access. Unfortunately, LLCs are poorly utilized because of the relatively large …

ParalOS: A scheduling & memory management framework for heterogeneous VPUs

E Petrongonas, V Leon, G Lentaris… - 2021 24th Euromicro …, 2021 - ieeexplore.ieee.org
Embedded systems are presented today with the challenge of a very rapidly evolving
application diversity followed by increased programming and computational complexity …

Runtime-assisted cache coherence deactivation in task parallel programs

P Caheny, L Alvarez, M Valero… - … Conference for High …, 2018 - ieeexplore.ieee.org
With increasing core counts, the scalability of directory-based cache coherence has become
a challenging problem. To reduce the area and power needs of the directory, recent …

Design-time memory subsystem optimization for low-power multi-core embedded systems

M Strobel, M Radetzki - … multicore/many-core systems-on-chip …, 2019 - ieeexplore.ieee.org
Embedded multi-core systems are increasingly in use. As established single-core design
methodologies are often not applicable out of the box, novel design-time optimization …

Explicit data layout management for autotuning exploration on complex memory topologies

S Perarnau, B Videau, N Denoyelle… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
The memory topology of high-performance computing platforms is becoming more complex.
Future exascale platforms in particular are expected to feature multiple types of memory …

A visual analysis on recognizability and discriminability of onomatopoeia words with DCNN features

W Shimoda, K Yanai - 2015 IEEE International Conference on …, 2015 - ieeexplore.ieee.org
In this paper, we examine the relation between onomatopoeia and images using a large
number of Web images. The objective of this paper is to examine if the images …