D3: Invasion for High-Performance Computing
Principal Investigators:
Prof. H.-J. Bungartz, Prof. M. Gerndt, Prof. M. Bader
Scientific Researchers:
Isaías Comprés, A. Hollmann, A. Mo-Hellenbrand, M. Schreiber, Dr. J. Weidendorfer
Abstract
The research agenda of D3 is threefold: First, we consider numerical core routines widespread in both supercomputing and embedded applications with respect to an invasive enhancement. Second, we investigate advantages of invasive computing on HPC systems by integrating such invasive algorithms into real-life simulation scenarios. Third, concepts are developed to support invasion in standard HPC programming models on state-of-the-art HPC systems.
In the first funding phase, we demonstrated that invasion is a promising paradigm for a wide range of highly relevant numerical algorithms by defining requirements for a set of representative numerical building blocks. Furthermore, invasion allows for more flexible resource usage on HPC architectures: we extended the standard programming model for shared-memory systems with invasive run-time mechanisms that dynamically redistribute cores (iOMP). This benefits our target scenario, which is based on dynamic adaptive mesh refinement (DAMR) as used in our tsunami simulation. Indeed, it has been shown that executing multiple invasive DAMR simulations in parallel improves hardware utilisation and thus increases efficiency.
Synopsis
All three directions mentioned above (applications, algorithms, and programming models and resource management) will be extended towards large-scale distributed and heterogeneous systems.
First, concerning large-scale invasive applications, we will start from our successful tsunami case study and explore how invasive computing can provide innovative solutions to issues currently not solvable on large-scale systems in a satisfactory way. This includes efficient dynamic load distribution for optimising the application throughput, optimisations of energy efficiency, urgent computing and time-to-solution considerations, and flexible dynamic job scheduling on HPC systems.
Second, concerning invasive resource management, the programming model on HPC systems is typically a combination of MPI and OpenMP with, possibly, a programming interface for accelerators. One obvious new requirement is the explicit distribution of data across the nodes. For that, we will extend MPI for invasive computing and develop a scalable resource management infrastructure. This infrastructure will be based on the iOMP resource manager that will perform a model-based multi-objective optimisation.
Third, with respect to our algorithmic developments, we will extend our contributions to topics such as claim specification, reconfigurable hardware, TCPA-accelerated computations, and dark silicon. We will do that looking into matrix exponentials, direct solvers, and dynamic tree traversals as core routines relevant for embedded applications, as well as having a closer look at more general multi-level solvers as an algorithm class of utmost importance both for HPC and embedded systems such as MPSoCs (multiprocessor systems-on-a-chip).
Approach
The overall research goals of D3 are, first, to contribute to the development of the invasive core language and the invasive X10 framework via selected state-of-the-art numerical algorithms that are widespread in both supercomputing and embedded applications and that show a high potential for benefiting from invasive computing; second, to demonstrate how this paradigm can be transferred to and exploited on HPC systems by integrating invasive algorithms into realistic large-scale simulation scenarios; and, third, to develop concepts to support invasion in standard HPC programming models on state-of-the-art HPC systems. In the first funding phase, as outlined above, the three main research threads towards these goals were the development of invasive numerical core algorithms for invasive MPSoCs, the development of our framework for tsunami simulations as the demonstrator HPC application for invasion, and the integration of invasion into OpenMP as the standard programming model for shared-memory systems. While these threads will be continued, the focus of the second funding phase will be on extending our work from shared-memory systems to heterogeneous-hybrid and large-scale systems, which leads to new demands on invasive computing at the application, algorithmic, and programming model levels.
Tsunami simulation of the Tohoku tsunami (known in Germany as the 'Fukushima tsunami'). More information about the simulation framework is available at Sierpinski.
For tsunami parameter studies (see the explanatory video), dynamic resource management driven by the invasive computing paradigm leads to significant efficiency improvements for representative simulations.
Invasive applications with HPC: Starting from the successful tsunami case study, we will explore how invasive computing can provide innovative solutions to HPC issues that currently cannot be solved satisfactorily on large-scale systems. This includes efficient dynamic load distribution on large-scale distributed systems for optimising application throughput, optimisations of energy efficiency, urgent computing and time-to-solution studies, and, in general, flexible dynamic job scheduling on HPC systems, which demands resource-aware algorithms.
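The idea of dynamic load distribution across several concurrently running adaptive simulations can be illustrated with a small sketch. It is not the project's scheduler: the function name `redistribute_cores` and the use of grid-cell counts as the workload measure are assumptions made for illustration. The sketch splits a fixed core budget in proportion to each simulation's current workload, guarantees one core per simulation, and hands out the rounding remainder by largest fractional share.

```python
def redistribute_cores(workloads, total_cores):
    """Split total_cores across simulations in proportion to their workload.

    workloads   -- hypothetical per-simulation load measure, e.g. current
                   number of grid cells after mesh refinement
    total_cores -- number of cores available on the machine
    Every simulation keeps at least one core.
    """
    n = len(workloads)
    if total_cores < n:
        raise ValueError("need at least one core per simulation")

    total = sum(workloads)
    spare = total_cores - n  # cores left after the guaranteed minimum
    if total == 0:
        shares = [spare / n] * n
    else:
        shares = [spare * w / total for w in workloads]

    # Integer part of each share plus the guaranteed core.
    cores = [1 + int(s) for s in shares]

    # Give remaining cores to the largest fractional remainders.
    leftover = total_cores - sum(cores)
    order = sorted(range(n), key=lambda i: shares[i] - int(shares[i]),
                   reverse=True)
    for i in order[:leftover]:
        cores[i] += 1
    return cores
```

Calling the function again whenever the workloads change, for example after each refinement step, mimics the repeated redistribution: a simulation whose mesh has grown receives cores from one whose mesh has coarsened.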
Preview: Experimental simulation of the Tohoku tsunami on the globe
Invasive numerical algorithms on invasive MPSoCs: Our experience with, and our collection of, invasive numerical core routines will be used to extend our contributions to topics such as claim specifications, reconfigurable hardware at the instruction-set and memory-hierarchy level with iCores, TCPA-accelerated computations, dark silicon, and, in particular, the evaluation of invasive computing on the demonstrator hardware. On the one hand, the focus will be on matrix exponentials, direct solvers, and dynamic tree traversals as core routines that are most relevant for embedded applications and that allow for innovative invasive implementations. On the other hand, we will concentrate on multi-level solvers, extending the prototypical V-cycle scheme from the first funding period, as an algorithm class of utmost importance both for large-scale PDE-based simulation scenarios and for embedded systems (e.g. image processing).
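For readers unfamiliar with the V-cycle scheme, the following is a minimal textbook sketch for the 1D Poisson problem -u'' = f with zero boundary values. It is a plain sequential multigrid V-cycle, not the project's invasive variant; the choice of weighted Jacobi smoothing, full-weighting restriction, and linear interpolation is an assumption made for illustration. What it does show is the level hierarchy: each coarser level holds roughly half the unknowns, so the amount of useful parallel work changes from level to level.

```python
def residual(u, f, h):
    """r = f - A u for the 3-point Laplacian; boundary entries stay zero."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps, updating u in place."""
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, len(u) - 1):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u


def restrict(r):
    """Full weighting from a grid with N+1 points to one with N/2+1 points."""
    nc = (len(r) - 1) // 2 + 1
    rc = [0.0] * nc
    for j in range(1, nc - 1):
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    return rc


def prolong(ec, n_fine):
    """Linear interpolation of a coarse-grid correction to the fine grid."""
    e = [0.0] * n_fine
    for j in range(len(ec)):
        e[2 * j] = ec[j]
    for j in range(len(ec) - 1):
        e[2 * j + 1] = 0.5 * (ec[j] + ec[j + 1])
    return e


def v_cycle(u, f, h, pre=2, post=2):
    """One V-cycle; len(u) must be 2**k + 1 (boundary points included)."""
    if len(u) <= 3:
        # Coarsest level: a single interior unknown, solved exactly.
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    smooth(u, f, h, pre)                       # pre-smoothing
    rc = restrict(residual(u, f, h))           # residual to coarse grid
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h, pre, post)
    e = prolong(ec, len(u))                    # correction to fine grid
    for i in range(1, len(u) - 1):
        u[i] += e[i]
    smooth(u, f, h, post)                      # post-smoothing
    return u
```

To use the sketch, fill `f` with the right-hand side on a grid of 2**k + 1 points, start from `u = [0.0] * len(f)`, and call `v_cycle(u, f, h)` repeatedly until the residual is small enough.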
Invasive resource management for HPC: Applications on large-scale parallel systems are typically parallelised by combining MPI with OpenMP and possibly a programming interface for accelerators. In contrast to pure shared-memory systems, the application data are explicitly distributed across the nodes, and memory in a node is a scarce resource. It is our goal to provide invasive resource management for such applications. New challenges are to allow for invasive programming in MPI, to support applications in their data redistribution strategies, to increase the scalability of the resource manager with respect to the size of the respective applications, and to build up application models in the resource manager that support complex resource constraints, different classes of applications, detailed performance hints, and heterogeneous execution.
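How a resource manager might use application models and performance hints can be sketched with a toy example. Everything here is hypothetical: the names `amdahl_speedup` and `allocate_nodes`, the use of Amdahl's law as the application model, and the per-application tuple of serial fraction, minimum, and maximum node count are assumptions for illustration, and the sketch optimises a single objective (summed speedup) rather than the multi-objective setting described in the text. Nodes are handed out greedily, one at a time, to the application whose modelled speedup gains most from the next node.

```python
def amdahl_speedup(nodes, serial_fraction):
    """Modelled speedup on `nodes` nodes according to Amdahl's law."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


def allocate_nodes(apps, total_nodes):
    """Greedy node allocation driven by per-application performance hints.

    apps -- dict mapping a (hypothetical) application name to a tuple
            (serial_fraction, min_nodes, max_nodes)
    Returns a dict mapping each application name to its node count.
    """
    # Start every application at its minimum resource constraint.
    alloc = {name: lo for name, (_, lo, _) in apps.items()}
    free = total_nodes - sum(alloc.values())
    if free < 0:
        raise ValueError("not enough nodes to satisfy the minimum claims")

    while free > 0:
        best, best_gain = None, 0.0
        for name, (serial, _, hi) in apps.items():
            if alloc[name] >= hi:
                continue  # maximum constraint reached
            gain = (amdahl_speedup(alloc[name] + 1, serial)
                    - amdahl_speedup(alloc[name], serial))
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # no application can use another node
        alloc[best] += 1
        free -= 1
    return alloc
```

With two applications that differ only in their serial fraction, the well-scaling one receives the spare nodes until its maximum constraint is reached, after which the remaining nodes go to the other application.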
 
  iMPI Resource Management Example
A comprehensive summary of the major achievements of the first funding phase is available on the Project D3 first-phase website.
Publications
| [1] | Nidhi Anantharajaiah, Tamim Asfour, Michael Bader, Lars Bauer, Jürgen Becker, Simon Bischof, Marcel Brand, Hans-Joachim Bungartz, Christian Eichler, Khalil Esper, Joachim Falk, Nael Fasfous, Felix Freiling, Andreas Fried, Michael Gerndt, Michael Glaß, Jeferson Gonzalez, Frank Hannig, Christian Heidorn, Jörg Henkel, Andreas Herkersdorf, Benedict Herzog, Jophin John, Timo Hönig, Felix Hundhausen, Heba Khdr, Tobias Langer, Oliver Lenke, Fabian Lesniak, Alexander Lindermayr, Alexandra Listl, Sebastian Maier, Nicole Megow, Marcel Mettler, Daniel Müller-Gritschneder, Hassan Nassar, Fabian Paus, Alexander Pöppl, Behnaz Pourmohseni, Jonas Rabenstein, Phillip Raffeck, Martin Rapp, Santiago Narváez Rivas, Mark Sagi, Franziska Schirrmacher, Ulf Schlichtmann, Florian Schmaus, Wolfgang Schröder-Preikschat, Tobias Schwarzer, Mohammed Bakr Sikal, Bertrand Simon, Gregor Snelting, Jan Spieck, Akshay Srivatsa, Walter Stechele, Jürgen Teich, Furkan Turan, Isaías A. Comprés Ureña, Ingrid Verbauwhede, Dominik Walter, Thomas Wild, Stefan Wildermann, Mario Wille, Michael Witterauf, and Li Zhang. Invasive Computing. FAU University Press, August 16, 2022. [ DOI ] | 
| [2] | Xingfu Wu, Aniruddha Marathe, Siddhartha Jana, Ondrej Vysocky, Jophin John, Andrea Bartolini, Lubomir Riha, Michael Gerndt, Valerie Taylor, and Sridutt Bhalachandra. Toward an end-to-end auto-tuning framework in HPC PowerStack. In Energy Efficient HPC State of Practice 2020, 2020. accepted for publication. | 
| [3] | Mohak Chadha, Jophin John, and Michael Gerndt. Extending SLURM for dynamic resource-aware adaptive batch scheduling. In IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC) 2020, 2020. accepted for publication. | 
| [4] | Ao Mo-Hellenbrand. Resource-Aware and Elastic Parallel Software Development for Distributed-Memory HPC Systems. Dissertation, Technische Universität München, Munich, 2019. [ http ] | 
| [5] | Jophin John, Santiago Narvaez R, and Michael Gerndt. Invasive computing for power corridor management. In Proceedings of the ParCo 2019: International Conference on Parallel Computing, 2019. accepted for publication. | 
| [6] | Mohak Chadha and Michael Gerndt. Modelling DVFS and UFS for region-based energy aware tuning of HPC applications. In IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), 2019. | 
| [7] | Jeeta Ann Chacko, Isaías Alberto Comprés Ureña, and Michael Gerndt. Integration of Apache Spark with invasive resource manager. In 2019 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovation, 2019. Best Paper Award. | 
| [8] | Santiago Narvaez. Power model for resource-elastic applications. Master thesis, Technische Universität München, Munich, 2018. [ http ] | 
| [9] | Carsten Uphoff, Sebastian Rettenberger, Michael Bader, Stephanie Wollherr, Thomas Ulrich, Elizabeth H. Madden, and Alice-Agnes Gabriel. Extreme scale multi-physics simulations of the tsunamigenic 2004 Sumatra megathrust earthquake. In SC17: The International Conference for High Performance Computing, Networking, Storage and Analysis Proceedings. ACM, 2017. [ DOI ] | 
| [10] | Ao Mo-Hellenbrand, Isaías Comprés, Oliver Meister, Hans-Joachim Bungartz, Michael Gerndt, and Michael Bader. A large-scale malleable tsunami simulation realized on an elastic MPI infrastructure. In Proceedings of the Computing Frontiers Conference (CF), pages 271–274. ACM, 2017. [ DOI ] Keywords: Malleable Applications, Elastic Computing, Resource Aware Computing, Adaptive Mesh Refinement, Elastic MPI, Message Passing, SLURM, Autonomous Resource Management | 
| [11] | Isaías Alberto Comprés Ureña. Resource-Elasticity Support for Distributed Memory HPC Applications. Dissertation, Technical University of Munich, Munich, 2017. [ arXiv | http ] | 
| [12] | Jürgen Teich. Invasive computing – editorial. it – Information Technology, 58(6):263–265, November 24, 2016. [ DOI ] | 
| [13] | Stefan Wildermann, Michael Bader, Lars Bauer, Marvin Damschen, Dirk Gabriel, Michael Gerndt, Michael Glaß, Jörg Henkel, Johny Paul, Alexander Pöppl, Sascha Roloff, Tobias Schwarzer, Gregor Snelting, Walter Stechele, Jürgen Teich, Andreas Weichslgartner, and Andreas Zwinkau. Invasive computing for timing-predictable stream processing on MPSoCs. it – Information Technology, 58(6):267–280, September 30, 2016. [ DOI ] | 
| [14] | Weifeng Liu, Michael Gerndt, and Bin Gong. Model-based MPI-IO tuning with Periscope tuning framework. Concurrency and Computation: Practice and Experience, 28(1):3–20, 2016. [ DOI ] Keywords: parallel I/O, automatic tuning, MPI-IO, performance model, high-performance computing | 
| [15] | Hans Michael Gerndt, Michael Glaß, Sri Parameswaran, and Barry L. Rountree. Dark Silicon: From Embedded to HPC Systems (Dagstuhl Seminar 16052). Dagstuhl Reports, 6(1):224–244, 2016. [ DOI ] | 
| [16] | Martin Schreiber, Christoph Riesinger, Tobias Neckel, Hans-Joachim Bungartz, and Alexander Breuer. Invasive compute balancing for applications with shared and hybrid parallelization. International Journal of Parallel Programming, September 2014. [ DOI ] | 
| [17] | Carsten Tradowsky, Martin Schreiber, Malte Vesper, Ivan Domladovec, Maximilian Braun, Hans-Joachim Bungartz, and Jürgen Becker. Towards dynamic cache and bandwidth invasion. In Reconfigurable Computing: Architectures, Tools, and Applications, volume 8405 of Lecture Notes in Computer Science, pages 97–107. Springer International Publishing, April 2014. [ DOI ] | 
| [18] | Martin Schreiber. Cluster-Based Parallelization of Simulations on Dynamically Adaptive Grids and Dynamic Resource Management. Dissertation, Institut für Informatik, Technische Universität München, January 2014. [ .pdf ] | 
| [19] | Martin Schreiber, Tobias Weinzierl, and Hans-Joachim Bungartz. SFC-based communication metadata encoding for adaptive mesh refinement. In Michael Bader, editor, Proceedings of the International Conference on Parallel Computing (ParCo), October 2013. | 
| [20] | Martin Schreiber, Christoph Riesinger, Tobias Neckel, and Hans-Joachim Bungartz. Invasive compute balancing for applications with hybrid parallelization. In Proceedings of the International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). IEEE, October 2013. | 
| [21] | Martin Schreiber, Tobias Weinzierl, and Hans-Joachim Bungartz. Cluster optimization and parallelization of simulations with dynamically adaptive grids. In Euro-Par 2013, August 2013. | 
| [22] | Hans-Joachim Bungartz, Christoph Riesinger, Martin Schreiber, Gregor Snelting, and Andreas Zwinkau. Invasive computing in HPC with X10. In X10 Workshop (X10'13), X10 '13, pages 12–19, New York, NY, USA, 2013. ACM. [ DOI ] | 
| [23] | Michael Gerndt, Andreas Hollmann, Marcel Meyer, Martin Schreiber, and Josef Weidendorfer. Invasive computing with iOMP. In Proceedings of the Forum on Specification and Design Languages (FDL), pages 225–231, September 2012. | 
| [24] | Isaías A. Comprés Ureña, Michael Riepen, Michael Konow, and Michael Gerndt. Invasive MPI on Intel's single-chip cloud computer. In Andreas Herkersdorf, Kay Römer, and Uwe Brinkschulte, editors, Proceedings of the 25th International Conference on Architecture of Computing System (ARCS), volume 7179 of Lecture Notes in Computer Science, pages 74–85. Springer, February 2012. [ DOI ] | 
| [25] | Martin Schreiber, Hans-Joachim Bungartz, and Michael Bader. Shared memory parallelization of fully-adaptive simulations using a dynamic tree-split and -join approach. In Proceedings of HiPC 2012, pages 1–10. IEEE, 2012. | 
| [26] | Andreas Hollmann and Michael Gerndt. Invasive computing: An application assisted resource management approach. In Victor Pankratius and Michael Philippsen, editors, Multicore Software Engineering, Performance, and Tools, volume 7303 of Lecture Notes in Computer Science, pages 82–85. Springer Berlin Heidelberg, 2012. [ DOI ] Keywords: resource management; resource awareness; numa; parallel programming; OpenMP | 
| [27] | Michael Bader, Hans-Joachim Bungartz, and Martin Schreiber. Invasive computing on high performance shared memory systems. In Facing the Multicore-Challenge III, volume 7686 of Lecture Notes in Computer Science, pages 1–12, 2012. | 
| [28] | Michael Bader, Hans-Joachim Bungartz, Michael Gerndt, Andreas Hollmann, and Josef Weidendorfer. Invasive programming as a concept for HPC. In Proceedings of the 10th IASTED International Conference on Parallel and Distributed Computing and Networks 2011 (PDCN), February 2011. [ DOI ] | 
| [29] | Jürgen Teich, Jörg Henkel, Andreas Herkersdorf, Doris Schmitt-Landsiedel, Wolfgang Schröder-Preikschat, and Gregor Snelting. Invasive computing: An overview. In Michael Hübner and Jürgen Becker, editors, Multiprocessor System-on-Chip – Hardware Design and Tool Integration, pages 241–268. Springer, Berlin, Heidelberg, 2011. [ DOI ] | 
| [30] | Hans-Joachim Bungartz, Bernhard Gatzhammer, Michael Lieb, Miriam Mehl, and Tobias Neckel. Towards multi-phase flow simulations in the PDE framework Peano. Computational Mechanics, 48(3):365–376, 2011. [ .pdf ] | 
| [31] | Jürgen Teich. Invasive algorithms and architectures. it - Information Technology, 50(5):300–310, 2008. | 
| [32] | Andreas Hollmann and Michael Gerndt. Invasive computing: An application assisted resource management approach. In MSEPT, pages 82–85. [ DOI ] | 
| [33] | Andreas Hollmann and Michael Gerndt. iOMP language specification 1.0. Internal Report. | 
| [34] | Isaías A. Comprés Ureña and Michael Gerndt. Improved RCKMPI's SCCMPB channel: Scaling and dynamic processes support. 4th MARC Symposium. | 

