The Effect of an Application Performance Modelling Tool
Smith, J.A., Hammond, S.D. and Jarvis, S.A. (2011) The Effect of an Application Performance Modelling Tool. In: Proceedings of the UK Performance Engineering Workshop (UKPEW'11), 7-8 July, 2011, Bradford, United Kingdom.
One of the most important metrics of machine efficiency in HPC is job turnaround time, which is the time between a user submitting a job and receiving their results. This time consists of two primary components: run-time, which depends on the resources allocated to the job, and queue wait time, which depends on the resources requested and the current level of machine usage.
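As an illustration of the decomposition described above, the following minimal sketch computes turnaround time as the sum of queue wait time and run-time. The `Job` class, its field names, and the example timestamps are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; all times are in seconds since an arbitrary epoch."""
    submit_time: float  # when the user submitted the job
    start_time: float   # when the scheduler started the job
    end_time: float     # when the job finished and results became available

    @property
    def queue_wait(self) -> float:
        # Time spent waiting in the queue before execution began.
        return self.start_time - self.submit_time

    @property
    def run_time(self) -> float:
        # Time spent executing on the allocated resources.
        return self.end_time - self.start_time

    @property
    def turnaround(self) -> float:
        # Turnaround time is the sum of the two components.
        return self.queue_wait + self.run_time


# Illustrative values only: 30 minutes in the queue, then 60 minutes running.
job = Job(submit_time=0.0, start_time=1800.0, end_time=5400.0)
print(job.queue_wait, job.run_time, job.turnaround)
```

Because run-time is fixed once resources are allocated, any reduction in queue wait time translates directly into a reduction in turnaround time.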
This paper investigates the effect of applying application performance modelling techniques to producing run-time estimates for jobs to be scheduled on parallel High Performance Computing systems. Our aim through the development of such tools is to improve turnaround time for jobs across the system. This investigation is performed from the perspective of a community of HPC users who would make use of a tool to assist in job submission.
We implement and validate a higher-performance implementation of the scheduling algorithm used by the Maui scheduler and demonstrate that it matches the behaviour of an existing Maui configuration. We formulate a method for generating potential performance model results and use it to modify three workloads from production supercomputers to include generated performance model wall-time estimates. We then apply the scheduling simulator to the workloads in order to simulate the effect of using such a tool for the three real workloads.
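One way a tighter wall-time estimate can shorten queue wait time is through backfilling, where a waiting job is started early if it will not delay a reservation held for a higher-priority job. The sketch below is a deliberately simplified backfill test, not the Maui algorithm or the simulator described in the paper; the function name, parameters, and numbers are all assumptions chosen for illustration.

```python
def can_backfill(free_nodes: int,
                 nodes_needed: int,
                 now: float,
                 walltime_estimate: float,
                 reservation_start: float) -> bool:
    """Simplified backfill test (hypothetical, for illustration only).

    A waiting job may start immediately if it fits in the currently free
    nodes and its estimated completion does not run past the start of the
    reservation held for the highest-priority queued job. Real schedulers
    also allow backfilling onto nodes the reservation does not need; that
    case is ignored here.
    """
    fits_nodes = nodes_needed <= free_nodes
    finishes_in_time = now + walltime_estimate <= reservation_start
    return fits_nodes and finishes_in_time


# Illustrative scenario: 16 nodes are free for the next hour (3600 s).
# The job really needs about 50 minutes on 8 nodes.
user_estimate = 7200.0   # user requests 2 hours to be safe
model_estimate = 3000.0  # hypothetical performance-model estimate

print(can_backfill(16, 8, 0.0, user_estimate, 3600.0))   # overestimate blocks backfill
print(can_backfill(16, 8, 0.0, model_estimate, 3600.0))  # tighter estimate fits the gap
```

Under these assumed numbers, the same job is rejected with the padded user estimate and accepted with the tighter estimate, which is the kind of effect a wall-time estimation tool aims to exploit.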
Examining the results from the simulation, we show a randomly selected sample set of tool users obtaining an improvement of up to 23% in average queuing time, and conclude that a tool that uses performance models to generate improved wall-time estimates would be beneficial to users of HPC systems.
|Item Type:||Conference or Workshop Item (Paper)|
|Uncontrolled Keywords:||pcav hpsg performance modelling scheduling distributed|
|Subjects:||Q Science > QA Mathematics > QA75 Electronic computers. Computer science|
Q Science > QA Mathematics > QA76 Computer software
|Divisions:||Faculty of Science > Computer Science|
|Depositing User:||Simon Hammond|
|Date Deposited:||31 May 2011 19:42|
|Last Modified:||23 Feb 2012 09:07|