Runtime-assisted cache coherence deactivation in task parallel programs

P Caheny, L Alvarez, M Casas… - … Conference for High …, 2022 - ieeexplore.ieee.org

In high performance processors, the design of on-chip memory hierarchies is crucial for
performance and energy efficiency. Current processors rely on large shared Non-Uniform …

Cited by 6 · Related articles · All 5 versions

Fine-grain data classification to filter token coherence traffic

BR Upadhyay, A Ros, M Supriya - Journal of Parallel and Distributed …, 2023 - Elsevier

Snoop-based cache coherence protocols perform well in small-scale systems by enabling
low latency cache-to-cache data transfers in just two-hop coherence transactions. However …

Novel techniques to improve the performance and the energy of vector architectures

A Barredo Ferreira - 2021 - upcommons.upc.edu

The rate of annual data generation grows exponentially. At the same time, there is a high
demand to analyze that information quickly. In the past, every processor generation came …

Cited by 3 · Related articles · All 3 versions

Efficient classification of private memory blocks

BR Upadhyay, A Ros, J Shah - Journal of Parallel and Distributed …, 2021 - Elsevier

Shared memory architectures are pervasive in the multicore technology era. Still, sequential
and parallel applications use most of the data as private in a multicore system. Recent …

Cited by 1 · Related articles · All 3 versions

TLB-based block-grain classification of private data

BR Upadhyay, A Ros, NS Murty - 2020 28th Euromicro …, 2020 - ieeexplore.ieee.org

Sequential and parallel applications use most of the data as private in a multi-core system.
Recent proposals made use of this observation to reduce the area of the coherence …

Cited by 3 · Related articles · All 4 versions

Exploiting data locality in cache-coherent NUMA systems

I Sánchez Barrera - 2022 - upcommons.upc.edu

The end of Dennard scaling has caused a stagnation of the clock frequency in computers.
To overcome this issue, in the last two decades vendors have been integrating larger …

Cited by 1 · Related articles · All 2 versions

Towards resource-aware computing for task-based runtimes and parallel architectures

D Chasapis - 2019 - upcommons.upc.edu

Current large scale systems show increasing power demands, to the point that it has
become a huge strain on facilities and budgets. The increasing restrictions in terms of power …

Cited by 1 · Related articles · All 6 versions

Exploiting task-based programming models for resilience

L Jaulmes - 2019 - upcommons.upc.edu

Hardware errors become more common as silicon technologies shrink and become more
vulnerable, especially in memory cells, which are the most exposed to errors. Permanent …

Cited by 1 · Related articles · All 7 versions

Runtime-assisted optimizations in the on-chip memory hierarchy

V Dimić - 2020 - upcommons.upc.edu

Following Moore's Law, the number of transistors on chip has been increasing
exponentially, which has led to the increasing complexity of modern processors. As a result …

Application of a default shared state cache coherency protocol

M Malewicki, T McGee, MS Woodacre - US Patent 11,687,459, 2023 - Google Patents

Example implementations relate to cache coherency protocols as applied to a memory block
range. Exclusive ownership of a range of blocks of memory in a default shared state may be …