Programming Research Group Technical Report TR-33-97

Stability of communication performance in practice: from the Cray T3E to Networks of Workstations

Jonathan M.D. Hill, Stephen R. Donaldson, and David B. Skillicorn

Abstract:

The Bulk Synchronous Parallel model costs programs using three parameters: the processor speed (s), the network permeability (g), and the superstep overhead (l). This simple model is accurate over a wide variety of applications and parallel computers. However, all real parallel computers exhibit behaviour that is not captured by the cost model.
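
As an illustration of how the three parameters combine, the following minimal sketch uses the standard BSP superstep cost w + h*g + l. The function names and the unit conventions (g in flops per word, l in flops, s in flops per second) are assumptions made for this example and are not taken from the report.

```python
def superstep_cost(w, h, g, l):
    """Cost of one superstep in flop units, using the standard BSP formula.

    w: maximum local computation on any processor (flops)
    h: maximum number of words sent or received by any processor
    g: network permeability (assumed here to be in flops per word)
    l: superstep overhead (assumed here to be in flops)
    """
    return w + h * g + l


def superstep_seconds(w, h, g, l, s):
    """Convert the flop-unit cost to seconds, with s in flops per second."""
    return superstep_cost(w, h, g, l) / s


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements from the report.
    print(superstep_cost(w=1000, h=100, g=4, l=500))
    print(superstep_seconds(w=1000, h=100, g=4, l=500, s=1e6))
```

Because g multiplies h, any error in g scales directly with the volume of communication, which is why the stability of g matters for communication-bound programs.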

This paper is an extensive study of the accuracy and stability of g for a wide range of parallel computers. A series of parameterised kernel benchmarks is applied to each communication system, and the resulting values of g are tabulated. These values are normally distributed with small standard deviation across all of the architectures investigated.

There will always be some variation between a computer's behaviour in reality and that predicted by the cost model. We extend the cost model with a confidence parameter that describes the standard deviation in g as a function of applied traffic pattern. This can be regarded as providing parameterised kernel benchmarks for each architecture. Programmers can use this information to decide how much discrepancy they are likely to encounter when using the cost model in situations where predictive accuracy is critically important. In general, the results show that the BSP g parameter does accurately predict communication performance for real parallel computers.
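
One way a programmer might use such a confidence parameter is sketched below. This is an assumption-laden illustration, not the report's method: the function names, the sample values, and the choice of a band of k standard deviations around the mean are all hypothetical.

```python
import statistics


def g_summary(samples):
    """Return (mean, sample standard deviation) of repeated measurements of g."""
    return statistics.mean(samples), statistics.stdev(samples)


def comm_cost_band(h, g_mean, g_std, k=2.0):
    """Predicted communication cost h*g as a (low, high) band.

    The band spans k standard deviations either side of the mean; k = 2 is
    an assumed choice that covers roughly 95% of a normal distribution.
    """
    return h * (g_mean - k * g_std), h * (g_mean + k * g_std)


if __name__ == "__main__":
    # Hypothetical measurements of g for one traffic pattern (flops per word).
    samples = [4.0, 4.2, 3.8, 4.0]
    mean, std = g_summary(samples)
    low, high = comm_cost_band(h=100, g_mean=mean, g_std=std)
    print(f"g = {mean:.2f} +/- {std:.2f}; cost between {low:.1f} and {high:.1f}")
```

A narrow band indicates that the single value of g is a trustworthy predictor for that traffic pattern; a wide band signals that the cost model should be applied with more caution.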


This paper is available as a 344387-byte compressed PostScript file.