Task scheduling techniques for asymmetric multi-core systems

K Chronaki, A Rico, M Casas, M Moretó… - … on Parallel and …, 2016 - ieeexplore.ieee.org
As performance and energy efficiency have become the main challenges for
next-generation high-performance computing, asymmetric multi-core architectures can provide …

MUSA: a multi-level simulation approach for next-generation HPC machines

T Grass, C Allande, A Armejach, A Rico… - SC'16: Proceedings …, 2016 - ieeexplore.ieee.org
The complexity of High Performance Computing (HPC) systems is increasing in the number
of components and their heterogeneity. Interactions between software and hardware involve …

Reducing data movement on large shared memory systems by exploiting computation dependencies

I Sánchez Barrera, M Moretó, E Ayguadé… - Proceedings of the …, 2018 - dl.acm.org
Shared memory systems are becoming increasingly complex as they typically integrate
several storage devices. That brings different access latencies or bandwidth rates …

Architectural support for task dependence management with flexible software scheduling

E Castillo, L Alvarez, M Moreto, M Casas… - … Symposium on High …, 2018 - ieeexplore.ieee.org
The growing complexity of multi-core architectures has motivated a wide range of software
mechanisms to improve the orchestration of parallel executions. Task parallelism has …

General purpose task-dependence management hardware for task-based dataflow programming models

X Tan, J Bosch, M Vidal, C Álvarez… - 2017 IEEE …, 2017 - ieeexplore.ieee.org
Task-based programming models such as OpenMP, Intel TBB and OmpSs offer the
possibility of expressing dependences among tasks to drive their execution at runtime …

CATA: criticality aware task acceleration for multicore processors

E Castillo, M Moreto, M Casas, L Alvarez… - 2016 IEEE …, 2016 - ieeexplore.ieee.org
Managing criticality in task-based programming models opens a wide range of performance
and power optimization opportunities in future manycore systems. Criticality aware task …

ATM: approximate task memoization in the runtime system

I Brumar, M Casas, M Moreto… - 2017 IEEE …, 2017 - ieeexplore.ieee.org
Redundant computations appear during the execution of real programs. Multiple factors
contribute to these unnecessary computations, such as repetitive inputs and patterns, calling …

Reducing cache coherence traffic with a NUMA-aware runtime approach

P Caheny, L Alvarez, S Derradji… - … on Parallel and …, 2017 - ieeexplore.ieee.org
Cache Coherent NUMA (ccNUMA) architectures are a widespread paradigm due to the
benefits they provide for scaling core count and memory capacity. Also, the flat memory …

Runtime-guided management of stacked DRAM memories in task parallel programs

L Alvarez, M Casas, J Labarta, E Ayguade… - Proceedings of the …, 2018 - dl.acm.org
Stacked DRAM memories have become a reality in High-Performance Computing (HPC)
architectures. These memories provide much higher bandwidth while consuming less power …

TD-NUCA: runtime driven management of NUCA caches in task dataflow programming models

P Caheny, L Alvarez, M Casas… - … Conference for High …, 2022 - ieeexplore.ieee.org
In high performance processors, the design of on-chip memory hierarchies is crucial for
performance and energy efficiency. Current processors rely on large shared Non-Uniform …