Models of machines and computation for mapping in multicomputers

MG Norman, P Thanisch - ACM Computing Surveys (CSUR), 1993 - dl.acm.org
Nor is it always easy to assess the relevance of a new result to a particular problem.
Furthermore, changes in parallel computing technology have made some of the earlier work …

Communication optimizations for irregular scientific computations on distributed memory architectures

R Das, M Uysal, J Saltz, YS Hwang - Journal of parallel and distributed …, 1994 - Elsevier
This paper describes a number of optimizations that can be used to support the efficient
execution of irregular problems on distributed memory parallel machines. These primitives …

Execution time support for adaptive scientific algorithms on distributed memory machines

H Berryman, J Saltz, J Scroggs - Concurrency: Practice and …, 1991 - Wiley Online Library
We consider optimizations that are required for efficient execution of code segments that
consist of loops over distributed data structures. The PARTI execution time primitives are …

Performance of the Intel iPSC/860 and Ncube 6400 hypercubes

TH Dunigan - Parallel Computing, 1991 - Elsevier
The performance of the Intel iPSC/860 hypercube and the Ncube 6400 hypercube are
compared with earlier hypercubes from Intel and Ncube. Computation and communication …

Complete exchange on a circuit switched mesh

SH Bokhari, H Berryman - Proceedings Scalable High …, 1992 - ieeexplore.ieee.org
The complete exchange ('all-to-all personalized') communication pattern is at the heart of
numerous important multicomputer algorithms. Recent research has shown how this pattern …

Multiprocessors and run‐time compilation

J Saltz, H Berryman, J Wu - Concurrency: Practice and …, 1991 - Wiley Online Library
Run‐time preprocessing plays a major role in many efficient algorithms in computer science,
as well as playing an important role in exploiting multiprocessor architectures. We give …

Circuit-switched broadcasting in torus networks

JG Peters, M Syska - IEEE Transactions on Parallel and …, 1996 - ieeexplore.ieee.org
In this paper we present three broadcast algorithms and lower bounds on the three main
components of the broadcast time for 2-dimensional torus networks (wrap-around meshes) …

An integrated runtime and compile-time approach for parallelizing structured and block structured applications

G Agrawal, A Sussman, J Saltz - IEEE Transactions on Parallel …, 1995 - ieeexplore.ieee.org
In compiling applications for distributed memory machines, runtime analysis is required
when data to be communicated cannot be determined at compile-time. One such class of …

Optimal orthogonal tiling of 2-D iterations

R Andonov, S Rajopadhye - Journal of Parallel and Distributed Computing, 1997 - Elsevier
Iteration space tiling is a common strategy used by parallelizing compilers and in
performance tuning of parallel codes. We address the problem of determining the tile size …

Efficient technique for ellipse detection using restricted randomized Hough transform

Z Cheng, Y Liu - International Conference on Information …, 2004 - ieeexplore.ieee.org
We propose a new efficient method to detect ellipses in binary or gray-scale images, called
restricted randomized Hough transform (RRHT). The key of RRHT is to restrict the scope of …