OuterSPACE: An outer product based sparse matrix multiplication accelerator

S Pal, J Beaumont, DH Park… - … Symposium on High …, 2018 - ieeexplore.ieee.org
Sparse matrices are widely used in graph and data analytics, machine learning, engineering
and scientific applications. This paper describes and analyzes OuterSPACE, an accelerator …

A survey of techniques for reducing interference in real-time applications on multicore platforms

T Lugo, S Lozano, J Fernández, J Carretero - IEEE Access, 2022 - ieeexplore.ieee.org
This survey reviews the scientific literature on techniques for reducing interference in
real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It …

LATR: Lazy translation coherence

MK Kumar, S Maass, S Kashyap, J Veselý… - Proceedings of the …, 2018 - dl.acm.org
We propose LATR (lazy TLB coherence), a software-based TLB shootdown mechanism that
can alleviate the overhead of the synchronous TLB shootdown mechanism in existing …

Stitch: Fusible heterogeneous accelerators enmeshed with many-core architecture for wearables

C Tan, M Karunaratne, T Mitra… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
Wearable devices are now leveraging multi-core processors to cater to the increasing
computational demands of the applications via multi-threading. However, the power …

Exploiting memory allocations in clusterised many‐core architectures

R Garibotti, L Ost, A Butko, R Reis… - IET Computers & …, 2019 - Wiley Online Library
Power‐efficient architectures have become the most important feature required for future
embedded systems. Modern designs, like those released on mobile devices, reveal that …

VIA: A smart scratchpad for vector units with application to sparse matrix computations

J Pavon, IV Valdivieso, A Barredo… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Sparse matrix operations are critical kernels in multiple application domains such as High
Performance Computing, artificial intelligence and big data. Vector processing is widely …

Runtime-aware architectures

M Casas, M Moreto, L Alvarez, E Castillo… - Euro-Par 2015: Parallel …, 2015 - Springer
In the last few years, the traditional ways to keep the increase of hardware performance to
the rate predicted by the Moore's Law have vanished. When uni-cores were the norm …

COMPAD: A heterogeneous cache-scratchpad CPU architecture with data layout compaction for embedded loop-dominated applications

T Marinelli, JIG Pérez, C Tenllado, F Catthoor - Journal of Systems …, 2023 - Elsevier
The growing trend of pervasive computing has consolidated the everlasting need for power
efficient devices. The conventional cache subsystem of general-purpose CPUs, while being …

Runtime-guided management of scratchpad memories in multicore architectures

L Alvarez, M Moretó, M Casas, E Castillo… - 2015 International …, 2015 - ieeexplore.ieee.org
The increasing number of cores and the anticipated level of heterogeneity in upcoming
multicore architectures cause important problems in traditional cache hierarchies. A good …

RADAR: Runtime-assisted dead region management for last-level caches

M Manivannan, V Papaefstathiou… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Last-level caches (LLCs) bridge the processor/memory speed gap and reduce energy
consumed per access. Unfortunately, LLCs are poorly utilized because of the relatively large …