WARPP: A Toolkit for Simulating High Performance Parallel Scientific Codes
Hammond, S.D., Mudalige, G.R., Smith, J.A., Jarvis, S.A., Herdman, J.A. and Vadgama, A. (2009) WARPP: A Toolkit for Simulating High Performance Parallel Scientific Codes. In: 2nd International Conference on Simulation Tools and Techniques (SIMUTools09), Rome, Italy.
- Submitted Version
There are a number of challenges facing the High Performance Computing (HPC) community, including increasing levels of concurrency (threads, cores, nodes), deeper and more complex memory hierarchies (register, cache, disk, network), mixed hardware sets (CPUs and GPUs) and increasing scale (tens or hundreds of thousands of processing elements). Assessing the performance of complex scientific applications on specialised high-performance computing architectures is difficult. In many cases, traditional computer benchmarking is insufficient as it typically requires access to physical machines of equivalent (or similar) specification and rarely relates to the potential capability of an application. A technique known as application performance modelling addresses many of these additional requirements. Modelling allows future architectures and/or applications to be explored in a mathematical or simulated setting, thus enabling hypothetical questions relating to the configuration of a potential future architecture to be assessed in terms of its impact on key scientific codes.
This paper describes the Warwick Performance Prediction (WARPP) simulator, which is used to construct application performance models for complex industry-strength parallel scientific codes executing on thousands of processing cores. The capability and accuracy of the simulator are demonstrated through its application to a scientific benchmark developed by the United Kingdom Atomic Weapons Establishment (AWE). The results of the simulations are validated for two different HPC architectures, with each case demonstrating greater than 90% accuracy in run-time prediction. Simulation results, collected from runs on a standard PC, are provided for up to 65,000 processor cores. It is also shown how the addition of operating system jitter to the simulator can improve the quality of the application performance model results.
|Item Type:||Conference or Workshop Item (Paper)|
|Uncontrolled Keywords:||hpsg pcav warpp simulation performance modelling|
|Subjects:||Q Science > QA Mathematics > QA75 Electronic computers. Computer science|
|Divisions:||Faculty of Science > Computer Science|
|Depositing User:||Simon Hammond|
|Date Deposited:||06 Oct 2010 11:01|
|Last Modified:||23 Feb 2012 09:08|