
Grid reliability

G Maier, P Saiz, J Andreeva, C Cirstoiu, B Gaidioz, J Herrala, E J Maguire and R Rocha

Abstract

Thanks to the Grid, users have access to computing resources distributed all over the world. The Grid hides the complexity and the differences of its heterogeneous components. In such a distributed system, it is essential that errors are detected as soon as possible and that the procedure for resolving them is well established. We focused on two of the Grid's main elements: the workload management system and the data management system. We developed an application to investigate the efficiency of the different centres. Furthermore, our system can be used to categorize the most common error messages and to monitor how they evolve over time.
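
As a purely illustrative sketch of the two measurements mentioned in the abstract (per-centre efficiency and a ranking of the most common error messages), the following Python fragment may help. It is not the authors' application: the record layout, the site names, the error strings, and the functions `site_efficiency` and `common_errors` are all hypothetical, and efficiency is assumed here to mean the fraction of jobs that succeeded.

```python
from collections import Counter

# Hypothetical job records: (site, status, error_message).
# None of these names or values come from the paper.
JOBS = [
    ("SITE_A", "ok", None),
    ("SITE_A", "failed", "Timeout contacting storage element"),
    ("SITE_A", "ok", None),
    ("SITE_B", "failed", "Proxy expired"),
    ("SITE_B", "failed", "Timeout contacting storage element"),
    ("SITE_B", "ok", None),
    ("SITE_B", "ok", None),
]


def site_efficiency(jobs):
    """Return {site: successful jobs / total jobs} (assumed definition)."""
    totals = Counter()
    successes = Counter()
    for site, status, _ in jobs:
        totals[site] += 1
        if status == "ok":
            successes[site] += 1
    return {site: successes[site] / totals[site] for site in totals}


def common_errors(jobs, top=3):
    """Return the `top` most frequent error messages among failed jobs."""
    errors = Counter(msg for _, status, msg in jobs if status != "ok" and msg)
    return errors.most_common(top)


if __name__ == "__main__":
    print(site_efficiency(JOBS))
    print(common_errors(JOBS))
```

Following the evolution of errors over time would, under the same assumptions, amount to adding a timestamp to each record and repeating the count per time bin.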

Journal
Journal of Physics
Year
2008