Programming Research Group Technical Report TR-1-97

BSP Scheduling of Regular Patterns of Computation

Radu Calinescu

Abstract:

One of the major challenges of the current research in the field of parallel computing is the development of a realistic underlying framework for the design and programming of general purpose parallel computers. The bulk-synchronous parallel (BSP) model is largely viewed as the most suitable candidate for this role, as it offers support for both the design of scalable parallel architectures and the generation of portable parallel code.

However, when considering the development of portable parallel software within the framework of the BSP model, one cannot disregard the existence of a broad basis of efficient sequential and PRAM solutions for the most various classes of problems. In fact, the recent emergence of reliable techniques for the identification of the potential parallelism of a sequential program has rendered the automatic parallelisation of existing sequential code more compelling than ever.

At first sight, BSP simulation of PRAMs appears to be the ideal strategy for taking advantage of this wealth of potential parallelism. Unfortunately, PRAM simulation heavily relies on the future manufacturing of parallel architectures providing powerful multithreading support and fast context switching/address translation capabilities. At the same time, simulation has been proven to yield inefficient solutions when computations on dense data structures are involved.

This report describes the first stage of a project proposing BSP scheduling as an alternative to PRAM simulation in the generation of portable parallel code. After introducing the BSP programming and cost model, the report presents a brief overview of the current trends in the identification and exploitation of potential parallelism. Then, techniques for the BSP scheduling of many regular patterns of computation which occur frequently in imperative programs are devised and analysed in terms of the BSP cost model. Finally, a whole chapter is dedicated to the mapping of generic loop nests on BSP computers. A discussion of the further stages of the project concludes the report.


This paper is available as a 276,655 byte compressed PostScript file.