    Overview of Software Development in Scientific Applications (Überblick zur Softwareentwicklung in Wissenschaftlichen Anwendungen)

    Many fields of modern science and engineering must solve increasingly complex numerical problems, and the complexity of the scientific software they use rises continuously. This growth is caused by a number of changing requirements.
Coupled phenomena gain importance, and new technologies such as the computational Grid, graphics processors and heterogeneous multi-core architectures have to be used to achieve high performance and obtain results in reasonable time. This additional complexity can no longer be handled by small groups of specialised scientists working in isolation. Scientific software must instead be developed in interdisciplinary groups, which presents new challenges for software engineering. A paradigm shift towards a stronger separation of concerns becomes necessary in the development of future scientific software, but in many cases it is so far visible only in rudimentary form. The coupling of independently simulated physical phenomena is an important example of a software-engineering concern in the domain of computational science. In this context, different simulation programs each model only a part of a more complex coupled system. The present work gives an overview of paradigms that aim at making software development in computational science more reliable and less interdependent. A special focus is put on the development of coupled simulations.

    Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

    This thesis presents design and implementation approaches for parallel algorithms of computer algebra. We use algorithmic skeletons as well as further approaches, such as data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and for some special parallel loops that we call ‘repeated computation with a possibility of premature termination’. We introduce in this thesis a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms; for these algorithms, our arithmetic provides a generic parallelisation approach. The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language. Moreover, it allows us to refrain from using two different languages (one for the implementation and one for the interface) for our implementation of computer algebra algorithms. Further, this thesis presents methods for evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation; we call it the ‘parallel penalty’. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations while using drastically fewer measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis. We have also used existing estimation methods. We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen’s matrix multiplication algorithm and the fast Fourier transform. The latter was used to implement polynomial convolution, which leads to a further fast multiplication algorithm.
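To make the idea of a divide and conquer skeleton concrete, the following is a minimal sequential sketch in Python rather than Eden. It is not the thesis's implementation: the function names (`divide_and_conquer`, `karatsuba`), the 32-bit cut-off and the bit-wise splitting are assumptions chosen for illustration. The skeleton captures only the recursion pattern, and Karatsuba multiplication is expressed by plugging four small functions into it.

```python
from typing import Callable, List, Tuple, TypeVar

P = TypeVar("P")  # problem type
S = TypeVar("S")  # solution type


def divide_and_conquer(
    is_trivial: Callable[[P], bool],
    solve: Callable[[P], S],
    divide: Callable[[P], List[P]],
    combine: Callable[[P, List[S]], S],
    problem: P,
) -> S:
    """Sequential divide and conquer skeleton (hypothetical helper).

    A parallel version would evaluate the sub-problems concurrently;
    here they are solved one after another to keep the sketch simple.
    """
    if is_trivial(problem):
        return solve(problem)
    subsolutions = [
        divide_and_conquer(is_trivial, solve, divide, combine, sub)
        for sub in divide(problem)
    ]
    return combine(problem, subsolutions)


THRESHOLD_BITS = 32  # assumed cut-off below which we multiply directly

Pair = Tuple[int, int]


def _split_point(pair: Pair) -> int:
    # Split both operands at half the bit length of the larger one.
    x, y = pair
    return max(x.bit_length(), y.bit_length()) // 2


def _is_trivial(pair: Pair) -> bool:
    x, y = pair
    return min(x.bit_length(), y.bit_length()) <= THRESHOLD_BITS


def _solve(pair: Pair) -> int:
    return pair[0] * pair[1]


def _divide(pair: Pair) -> List[Pair]:
    # x = x1 * 2^m + x0 and y = y1 * 2^m + y0 give three sub-products.
    x, y = pair
    m = _split_point(pair)
    mask = (1 << m) - 1
    x1, x0 = x >> m, x & mask
    y1, y0 = y >> m, y & mask
    return [(x1, y1), (x0, y0), (x1 + x0, y1 + y0)]


def _combine(pair: Pair, subs: List[int]) -> int:
    # z1 - z2 - z0 equals the mixed term x1*y0 + x0*y1.
    m = _split_point(pair)
    z2, z0, z1 = subs
    return (z2 << (2 * m)) + ((z1 - z2 - z0) << m) + z0


def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers with three recursive products."""
    return divide_and_conquer(_is_trivial, _solve, _divide, _combine, (x, y))
```

Calling `karatsuba(3**200, 7**150)` gives the same value as the built-in product. The three sub-problems returned by `_divide` are independent of each other, which is what a parallel skeleton can exploit.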
Specifically for our implementation of Strassen’s algorithm, we have designed and implemented a divide and conquer skeleton based on actors. For the parallel fast Fourier transform, we not only used new divide and conquer skeletons but also developed a map-and-transpose skeleton, which enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction, an approach known from the literature. We also performed execution time estimations of our divide and conquer programs. This thesis presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, such as parMap, farm and workpool, with a premature termination property. We use this to implement the so-called ‘parallel repeated computation’, a special form of a speculative parallel loop. We have implemented two probabilistic primality tests: the Rabin–Miller test and the Jacobi sum test. We parallelised both with our approach. We analysed the task distribution and stated the fitting configurations of the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements. The parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation are our original contributions. The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner.
This is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we have parallelised the determinant computation using Gaussian elimination. As before, we have performed task distribution analysis and estimated the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables the parallelisation of entire classes of computer algebra algorithms. In summary, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms.
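The principle behind a multiple-residue determinant can be sketched for the integer case, which the abstract describes as already known. The Python sketch below is illustrative only: it does not reproduce the thesis's rational arithmetic or its handling of common factors with the modulus, and the names `det_mod`, `crt` and `det_multi_residue` are hypothetical. It assumes the moduli are distinct primes whose product exceeds twice the absolute value of the determinant, and it needs Python 3.8 or later for `pow(x, -1, p)`.

```python
from typing import List, Sequence, Tuple


def det_mod(matrix: Sequence[Sequence[int]], p: int) -> int:
    """Determinant of an integer matrix modulo a prime p (Gaussian elimination)."""
    a = [[x % p for x in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return 0  # singular modulo p
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)  # modular inverse of the pivot
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p


def crt(residues: Sequence[int], moduli: Sequence[int]) -> Tuple[int, int]:
    """Chinese remainder reconstruction; returns (value, product of moduli)."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        # Choose t so that x + m * t is congruent to r modulo p.
        t = (r - x) * pow(m, -1, p) % p
        x += m * t
        m *= p
    return x, m


def det_multi_residue(matrix: Sequence[Sequence[int]], primes: List[int]) -> int:
    # Each residue is computed independently, so these calls could run in parallel.
    residues = [det_mod(matrix, p) for p in primes]
    value, modulus = crt(residues, primes)
    # Map to the symmetric range so that negative determinants are recovered.
    return value - modulus if value > modulus // 2 else value
```

For example, `det_multi_residue([[4, 0, 5], [2, -3, 1], [-1, 6, 7]], [101, 103, 107])` reconstructs the determinant -63 from three small residues. No intermediate value in the elimination grows beyond the chosen prime, which is the main attraction of residue arithmetic for symbolic computation.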

    A Control-Theoretic Methodology for Adaptive Structured Parallel Computations

    Adaptivity for distributed parallel applications is an essential feature whose importance has been assessed in many research fields (e.g. scientific computations, large-scale real-time simulation systems and emergency management applications). Especially for high-performance computing, this feature is of special interest in order to properly and promptly respond to time-varying QoS requirements, to react to uncontrollable environmental effects influencing the underlying execution platform and to efficiently deal with highly irregular parallel problems. In this scenario the Structured Parallel Programming paradigm is a cornerstone for expressing adaptive parallel programs: the high degree of composability of parallelization schemes and their QoS predictability, formally expressed by performance models, are basic tools for introducing dynamic reconfiguration processes of adaptive applications. These reconfigurations are not limited to implementation aspects (e.g. parallelism degree modifications): parallel versions with different structures can also be expressed for the same computation, featuring different levels of performance, memory utilization, energy consumption, and exploitation of the memory hierarchies. Over the last decade several programming models and research frameworks have been developed, aimed at the definition of tools and strategies for expressing adaptive parallel applications. Notwithstanding this notable research effort, properties like the optimality of the application execution and the stability of control decisions are not sufficiently studied in the existing work. For this reason this thesis undertakes pioneering research into providing formal theoretical tools founded on Control Theory and Game Theory techniques.
Based on these approaches, we introduce a formal model for controlling distributed parallel applications represented by computational graphs of structured parallelism schemes (also called skeleton-based parallelism). Starting out from the performance predictability of structured parallelism schemes, in this thesis we provide a formalization of the concept of an adaptive parallel module performing structured parallel computations. The module behavior is described in terms of a Hybrid System abstraction, and reconfigurations are driven by a Predictive Control approach. Experimental results show the effectiveness of this work, in terms of execution cost reduction as well as the stability degree of a system reconfiguration, i.e. how long a reconfiguration choice is useful for targeting the required QoS levels. This thesis also addresses the issue of controlling large-scale distributed applications composed of several interacting adaptive components. After a panoramic view of the existing control-theoretic approaches (e.g. based on decentralized, distributed or hierarchical structures of controllers), we introduce a methodology for distributed predictive control. For controlling computational graphs, the overall control problem consists of a set of coupled control sub-problems, one for each application module. The decomposition issue has a twofold nature: first, we need to model the coupling relationships between control sub-problems; second, we need to introduce proper notions of negotiation and convergence in the control decisions collectively taken by the parallel modules of the application graph. This thesis provides a formalization through basic concepts of Non-cooperative Games and Cooperative Optimization.
In the notable context of the distributed control of performance and resource utilization, we exploit a formal description of the control problem, providing results for equilibrium point existence and the comparison of the control optimality with different adaptation strategies and interaction protocols. Discussions and a first validation of the proposed techniques are carried out through experiments performed in a simulation environment.
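As a rough illustration of predictive control applied to the parallelism degree of a single module, the Python sketch below plans over a short horizon of predicted arrival rates and applies only the first decision (a receding-horizon scheme). Everything specific in it is an assumption made for this example and is not taken from the thesis: the performance model (a farm with `n` workers processes `n / t_work` tasks per time unit), the cost weights, the exhaustive search, and the names `step_cost` and `next_degree`.

```python
from itertools import product
from typing import Sequence


def step_cost(
    arrival_rate: float,
    degree: int,
    prev_degree: int,
    t_work: float,
    w_qos: float,
    w_res: float,
    w_switch: float,
) -> float:
    """Cost of running one control step with the given parallelism degree."""
    throughput = degree / t_work                 # assumed performance model
    unmet = max(0.0, arrival_rate - throughput)  # QoS violation: unserved load
    return (
        w_qos * unmet                            # penalty for missing the QoS target
        + w_res * degree                         # resource usage
        + w_switch * abs(degree - prev_degree)   # reconfiguration overhead
    )


def next_degree(
    predicted_arrivals: Sequence[float],
    current_degree: int,
    max_degree: int,
    t_work: float = 1.0,
    w_qos: float = 10.0,
    w_res: float = 1.0,
    w_switch: float = 0.5,
) -> int:
    """Return the first decision of the cheapest plan over the horizon."""
    if not predicted_arrivals:
        return current_degree  # nothing to plan for
    best_plan, best_cost = None, float("inf")
    # Exhaustive search over all degree sequences; fine for short horizons only.
    for plan in product(range(1, max_degree + 1), repeat=len(predicted_arrivals)):
        cost, prev = 0.0, current_degree
        for rate, degree in zip(predicted_arrivals, plan):
            cost += step_cost(rate, degree, prev, t_work, w_qos, w_res, w_switch)
            prev = degree
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan[0]
```

With the default weights, `next_degree([4.0, 4.0, 4.0], current_degree=1, max_degree=8)` selects 4 workers. The switching term is what discourages reconfigurations that would have to be undone shortly afterwards, which relates to the stability of control decisions discussed in the abstract.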