External memory algorithms and data structures: Dealing with massive data

JS Vitter - ACM Computing Surveys (CSUR), 2001 - dl.acm.org
Data sets in large applications are often too massive to fit completely inside the computer's
internal memory. The resulting input/output communication (or I/O) between fast internal …

CaImAn an open source tool for scalable calcium imaging data analysis

A Giovannucci, J Friedrich, P Gunn, J Kalfon, BL Brown… - eLife, 2019 - elifesciences.org
Advances in fluorescence microscopy enable monitoring larger brain areas in-vivo with finer
time resolution. The resulting data rates require reproducible analysis pipelines that are …

GraphChi: Large-Scale graph computation on just a PC

A Kyrola, G Blelloch, C Guestrin - 10th USENIX symposium on operating …, 2012 - usenix.org
Current systems for graph computation require a distributed computing cluster to handle
very large real-world problems, such as analysis on social networks or the web graph. While …

[BOOK][B] Algorithms and theory of computation handbook, volume 2: special topics and techniques

MJ Atallah, M Blanton - 2009 - books.google.com
This handbook provides an up-to-date compendium of fundamental computer science
topics, techniques, and applications. Along with updating and revising many of the existing …

Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication

G Kwasniewski, M Kabić, M Besta… - Proceedings of the …, 2019 - dl.acm.org
We propose COSMA: a parallel matrix-matrix multiplication algorithm that is near
communication-optimal for all combinations of matrix dimensions, processor counts, and …

Communication lower bounds for distributed-memory matrix multiplication

D Irony, S Toledo, A Tiskin - Journal of Parallel and Distributed Computing, 2004 - Elsevier
We present lower bounds on the amount of communication that matrix multiplication
algorithms must perform on a distributed-memory parallel computer. We denote the number …

Overview–Parallel Computing: Numerics, Applications, and Trends

M Vajteršic, P Zinterhof, R Trobec - Parallel Computing: Numerics …, 2009 - Springer
This book is intended for researchers and practitioners as a foundation for modern parallel
computing with several of its important parallel applications, and also for students as a basic …

Algorithms and data structures for external memory

JS Vitter - … and Trends® in Theoretical Computer Science, 2008 - nowpublishers.com
Data sets in large applications are often too massive to fit completely inside the computer's
internal memory. The resulting input/output communication (or I/O) between fast internal …

Efficient gradient-domain compositing using quadtrees

A Agarwala - ACM Transactions on Graphics (TOG), 2007 - dl.acm.org
We describe a hierarchical approach to improving the efficiency of gradient-domain
compositing, a technique that constructs seamless composites by combining the gradients of …

Programming matrix algorithms-by-blocks for thread-level parallelism

G Quintana-Ortí, ES Quintana-Ortí… - ACM Transactions on …, 2009 - dl.acm.org
With the emergence of thread-level parallelism as the primary means for continued
performance improvement, the programmability issue has reemerged as an obstacle to the …