TD-NUCA: runtime driven management of NUCA caches in task dataflow programming models

P Caheny, L Alvarez, M Casas… - … Conference for High …, 2022 - ieeexplore.ieee.org
In high performance processors, the design of on-chip memory hierarchies is crucial for
performance and energy efficiency. Current processors rely on large shared Non-Uniform …

Fine-grain data classification to filter token coherence traffic

BR Upadhyay, A Ros, M Supriya - Journal of Parallel and Distributed …, 2023 - Elsevier
Snoop-based cache coherence protocols perform well in small-scale systems by enabling
low latency cache-to-cache data transfers in just two-hop coherence transactions. However …

Novel techniques to improve the performance and the energy of vector architectures

A Barredo Ferreira - 2021 - upcommons.upc.edu
The rate of annual data generation grows exponentially. At the same time, there is a high
demand to analyze that information quickly. In the past, every processor generation came …

Efficient classification of private memory blocks

BR Upadhyay, A Ros, J Shah - Journal of Parallel and Distributed …, 2021 - Elsevier
Shared memory architectures are pervasive in the multicore technology era. Still, sequential
and parallel applications use most of the data as private in a multicore system. Recent …

TLB-based block-grain classification of private data

BR Upadhyay, A Ros, NS Murty - 2020 28th Euromicro …, 2020 - ieeexplore.ieee.org
Sequential and parallel applications use most of the data as private in a multi-core system.
Recent proposals made use of this observation to reduce the area of the coherence …

Exploiting data locality in cache-coherent NUMA systems

I Sánchez Barrera - 2022 - upcommons.upc.edu
The end of Dennard scaling has caused a stagnation of the clock frequency in computers.
To overcome this issue, in the last two decades vendors have been integrating larger …

Towards resource-aware computing for task-based runtimes and parallel architectures

D Chasapis - 2019 - upcommons.upc.edu
Current large scale systems show increasing power demands, to the point that it has
become a huge strain on facilities and budgets. The increasing restrictions in terms of power …

Exploiting task-based programming models for resilience

L Jaulmes - 2019 - upcommons.upc.edu
Hardware errors become more common as silicon technologies shrink and become more
vulnerable, especially in memory cells, which are the most exposed to errors. Permanent …

Runtime-assisted optimizations in the on-chip memory hierarchy

V Dimić - 2020 - upcommons.upc.edu
Following Moore's Law, the number of transistors on chip has been increasing
exponentially, which has led to the increasing complexity of modern processors. As a result …

Application of a default shared state cache coherency protocol

M Malewicki, T McGee, MS Woodacre - US Patent 11,687,459, 2023 - Google Patents
Example implementations relate to cache coherency protocols as applied to a memory block
range. Exclusive ownership of a range of blocks of memory in a default shared state may be …