Automatic parallelization of a class of irregular loops for distributed memory systems

R Khatchadourian, Y Tang, M Bagherzadeh - Science of Computer …, 2020 - Elsevier

Streaming APIs are becoming more pervasive in mainstream Object-Oriented programming
languages and platforms. For example, the Stream API introduced in Java 8 allows for …

Cited by 56 (8 versions)

Incremental flattening for nested data parallelism

T Henriksen, F Thorøe, M Elsman… - Proceedings of the 24th …, 2019 - dl.acm.org

Compilation techniques for nested-parallel applications that can adapt to hardware and
dataset characteristics are vital for unlocking the power of modern hardware. This paper …

Cited by 52 (7 versions)

Efficient tiled sparse matrix multiplication through matrix signatures

SE Kurt, A Sukumaran-Rajam… - … Conference for High …, 2020 - ieeexplore.ieee.org

Tiling is a key technique to reduce data movement in matrix computations. While tiling is well
understood and widely used for dense matrix/tensor computations, effective tiling of sparse …

Cited by 32 (12 versions)

Automatic annotation of tasks in structured code

P Ramos, G Souza, D Soares, G Araújo… - Proceedings of the 27th …, 2018 - dl.acm.org

This paper describes the design and implementation of a suite of static analyses and code
generation techniques to annotate programs with OpenMP pragmas for task parallelism …

Cited by 13 (4 versions)

Fast multiplication of random dense matrices with sparse matrices

T Liang, R Murray, A Buluç… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

This work focuses on accelerating the multiplication of a dense random matrix with a (fixed)
sparse matrix, which is frequently used in sketching algorithms. We develop a novel scheme …

Fast multiplication of random dense matrices with fixed sparse matrices

T Liang, R Murray, A Buluç, J Demmel - arXiv preprint arXiv:2310.15419, 2023 - arxiv.org

This work focuses on accelerating the multiplication of a dense random matrix with a (fixed)
sparse matrix, which is frequently used in sketching algorithms. We develop a novel scheme …

[PDF] Lecture Notes for the Software Track of the PMPH Course

CE Oancea - Programming Massively Parallel Hardware, 2018 - hjemmesider.diku.dk

We then will turn our attention to legacy-sequential code written in programming languages
such as C. In this context we study dependence analysis, as a tool for reasoning about loop …

Cited by 2

Modeling Data Movement for Sparse Matrix and Tensor Computations

SE Kurt - 2022 - search.proquest.com

Sparse matrix and tensor computations are challenging to optimize. In contrast to dense
matrix/tensor computations, the pattern of data access is typically irregular for sparse …

A functional approach to accelerating Monte Carlo based American option pricing

WM Pawlak, M Elsman, CE Oancea - Proceedings of the 31st …, 2019 - dl.acm.org

We study the feasibility and performance efficiency of expressing a complex financial
numerical algorithm with high-level functional parallel constructs. The algorithm we …

Cited by 2 (5 versions)

TaskMiner: Automatic identification of tasks

P Ramos, G Souza, G Leobas… - Proceedings of the XXII …, 2018 - dl.acm.org

This paper presents TaskMiner, a tool that automatically finds task parallelism in C code.
TaskMiner solves classic problems of irregular parallelism, such as finding the memory …

Cited by 1 (3 versions)