Prodan, R. (2004). Experiment management, performance optimisation, and tool integration in grid computing [Dissertation, Technische Universität Wien]. reposiTUm. https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-13116
Interest in computational Grids is growing as a means of enabling application developers to aggregate resources scattered around the globe for solving large-scale scientific problems.<br />Developing applications that can effectively utilise the Grid, however, remains very difficult due to the lack of high-level tools to support developers. For instance, existing performance analysis tools target the execution of a single application run, which is not sufficient for efficient performance tuning of parallel applications.<br />The thesis proposes a new directive-based language called ZEN for the compact specification of wide value ranges for arbitrary application parameters, including problem or machine sizes, array or loop distributions, software libraries, interconnection networks, or execution machines.<br />Additionally, ZEN directives can be used to specify a wide range of performance metrics to be collected from the application for arbitrary code regions.<br />Based on the ZEN language, the thesis proposes a novel tool called ZENTURIO for automatic experiment management in the context of large-scale performance and parameter studies on the Grid.<br />ZENTURIO has been designed as a distributed service-oriented architecture based on the latest Web and Grid services technologies. A variety of novel Web technology adaptations for Grid computing are presented.<br />ZENTURIO incorporates an optimisation framework that integrates general-purpose heuristics for solving NP-complete performance and parameter optimisation problems in a wide search space specified using the ZEN language.<br />The thesis proposes a new hybrid approach for scheduling workflow Grid applications, which combines static scheduling, formulated as an optimisation problem, with dynamic steering based on Grid resource availability and recursive loop handling.
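The idea of a parameter study defined by value ranges can be illustrated with a minimal Python sketch. This is not ZEN syntax and not ZENTURIO code. The parameter names, the value sets, and the helper `enumerate_experiments` are hypothetical, chosen only to show how a few value ranges multiply into a set of experiments.

```python
from itertools import product


def enumerate_experiments(parameter_space):
    """Yield one dict per experiment: the cross product of all value sets."""
    names = list(parameter_space)
    for values in product(*(parameter_space[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical parameters and value sets, for illustration only.
space = {
    "problem_size": [256, 512, 1024],
    "machine_size": [1, 2, 4, 8],
    "distribution": ["BLOCK", "CYCLIC"],
}

experiments = list(enumerate_experiments(space))
print(len(experiments))  # 3 * 4 * 2 = 24 experiments
```

Each additional parameter multiplies the number of experiments, which suggests why exhaustive enumeration becomes impractical in a wide search space and why heuristic optimisation is of interest.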