OuterSPACE: An outer product based sparse matrix multiplication accelerator
Sparse matrices are widely used in graph and data analytics, machine learning, engineering
and scientific applications. This paper describes and analyzes OuterSPACE, an accelerator …
A survey of techniques for reducing interference in real-time applications on multicore platforms
T Lugo, S Lozano, J Fernández, J Carretero - IEEE Access, 2022 - ieeexplore.ieee.org
This survey reviews the scientific literature on techniques for reducing interference in real-
time multicore systems, focusing on the approaches proposed between 2015 and 2020. It …
LATR: Lazy translation coherence
We propose LATR (lazy TLB coherence), a software-based TLB shootdown mechanism that
can alleviate the overhead of the synchronous TLB shootdown mechanism in existing …
Stitch: Fusible heterogeneous accelerators enmeshed with many-core architecture for wearables
Wearable devices are now leveraging multi-core processors to cater to the increasing
computational demands of the applications via multi-threading. However, the power …
Exploiting memory allocations in clusterised many‐core architectures
Power‐efficient architectures have become the most important feature required for future
embedded systems. Modern designs, like those released on mobile devices, reveal that …
VIA: A smart scratchpad for vector units with application to sparse matrix computations
J Pavon, IV Valdivieso, A Barredo… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Sparse matrix operations are critical kernels in multiple application domains such as High
Performance Computing, artificial intelligence and big data. Vector processing is widely …
Runtime-aware architectures
In the last few years, the traditional ways to keep the increase of hardware performance to
the rate predicted by the Moore's Law have vanished. When uni-cores were the norm …
COMPAD: A heterogeneous cache-scratchpad CPU architecture with data layout compaction for embedded loop-dominated applications
T Marinelli, JIG Pérez, C Tenllado, F Catthoor - Journal of Systems …, 2023 - Elsevier
The growing trend of pervasive computing has consolidated the everlasting need for power
efficient devices. The conventional cache subsystem of general-purpose CPUs, while being …
Runtime-guided management of scratchpad memories in multicore architectures
The increasing number of cores and the anticipated level of heterogeneity in upcoming
multicore architectures cause important problems in traditional cache hierarchies. A good …
RADAR: Runtime-assisted dead region management for last-level caches
M Manivannan, V Papaefstathiou… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Last-level caches (LLCs) bridge the processor/memory speed gap and reduce energy
consumed per access. Unfortunately, LLCs are poorly utilized because of the relatively large …