227,647 research outputs found

    Application and support for high-performance simulation

    types: Editorial Comment
    High performance simulation that supports sophisticated simulation experimentation and optimization can require non-trivial amounts of computing power. Advanced distributed computing techniques and systems found in areas such as High Performance Computing (HPC), High Throughput Computing (HTC), grid computing, cloud computing and e-Infrastructures are needed to effectively provide the computing power required for the high performance simulation of large and complex models. In simulation there has been a long tradition of translating and adopting advances in distributed computing, as shown by contributions from the parallel and distributed simulation community. This special issue brings together a contemporary collection of work showcasing original research in the advancement of simulation theory and practice with distributed computing. This special issue is divided into two parts. This first part focuses on research pertaining to high performance simulation that supports a range of applications including the study of epidemics, social networks, urban mobility and real-time embedded and cyber-physical systems. Compared to other simulation techniques, agent-based modeling and simulation is relatively new; however, it is increasingly being used to study large-scale problems. Agent-based simulations present challenges for high performance simulation as they can be complex and computationally demanding, and it is therefore not surprising that this special issue includes several articles on the high performance simulation of such systems. Research Councils U

    Master/worker parallel discrete event simulation

    The execution of parallel discrete event simulation across metacomputing infrastructures is examined. A master/worker architecture for parallel discrete event simulation is proposed providing robust executions under a dynamic set of services with system-level support for fault tolerance, semi-automated client-directed load balancing, portability across heterogeneous machines, and the ability to run codes on idle or time-sharing clients without significant interaction by users. Research questions and challenges associated with issues and limitations with the work distribution paradigm, targeted computational domain, performance metrics, and the intended class of applications to be used in this context are analyzed and discussed. A portable web services approach to master/worker parallel discrete event simulation is proposed and evaluated with subsequent optimizations to increase the efficiency of large-scale simulation execution through distributed master service design and intrinsic overhead reduction. New techniques for addressing challenges associated with optimistic parallel discrete event simulation across metacomputing such as rollbacks and message unsending with an inherently different computation paradigm utilizing master services and time windows are proposed and examined. Results indicate that a master/worker approach utilizing loosely coupled resources is a viable means for high throughput parallel discrete event simulation by enhancing existing computational capacity or providing alternate execution capability for less time-critical codes.
    Ph.D. Committee Chair: Fujimoto, Richard; Committee Member: Bader, David; Committee Member: Perumalla, Kalyan; Committee Member: Riley, George; Committee Member: Vuduc, Richar

    Topologically Consistent Models for Efficient Big Geo-Spatio-Temporal Data Distribution

    Geo-spatio-temporal topology models are likely to become a key concept to check the consistency of 3D (spatial space) and 4D (spatial + temporal space) models for emerging GIS applications such as subsurface reservoir modelling or the simulation of energy and water supply of mega or smart cities. Furthermore, the data management for complex models consisting of big geo-spatial data is a challenge for GIS and geo-database research. General challenges, concepts, and techniques of big geo-spatial data management are presented. In this paper we introduce a sound mathematical approach for a topologically consistent geo-spatio-temporal model based on the concept of the incidence graph. We redesign DB4GeO, our service-based geo-spatio-temporal database architecture, on the way to the parallel management of massive geo-spatial data. Approaches for a new geo-spatio-temporal and object model of DB4GeO meeting the requirements of big geo-spatial data are discussed in detail. Finally, a conclusion and outlook on our future research are given on the way to support the processing of geo-analytics and -simulations in a parallel and distributed system environment.

    Examining the Reproducibility of Using Dynamic Loop Scheduling Techniques in Scientific Applications

    Reproducibility of the execution of scientific applications on parallel and distributed systems is a growing concern, underlying the trustworthiness of the experiments and the conclusions derived from experiments. Dynamic loop scheduling (DLS) techniques are an effective approach towards performance improvement of scientific applications via load balancing. These techniques address algorithmic and systemic sources of load imbalance by dynamically assigning tasks to processing elements. The DLS techniques have demonstrated their effectiveness when applied in real applications. Complementing native experiments, simulation is a powerful tool for studying the behavior of parallel and distributed applications. In earlier work, the scalability [1], robustness [2], and resilience [3] of the DLS techniques were investigated using the MSG interface of the SimGrid simulation framework [4]. The present work complements the earlier work and concentrates on the verification via reproducibility of the implementation of the DLS techniques in SimGrid-MSG. This work describes the challenges of verifying the performance of using DLS techniques in earlier implementations of scientific applications. The verification is performed via reproducibility of simulations based on SimGrid-MSG. To simulate experiments selected from earlier literature, the reproduction process begins by extracting the information needed from the earlier literature and converting it into the input required by SimGrid-MSG. The reproducibility study is carried out by comparing the performance of SimGrid-MSG-based experiments with those reported in two selected publications in which the DLS techniques were originally proposed. While the reproduction was not successful for experiments from one of the selected publications, it was successful for experiments from the other. This successful reproduction implies the verification of the DLS implementation in SimGrid-MSG for the considered applications and systems, and thus, it allows well-founded future research on the DLS techniques.
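
    As a rough illustration of what "dynamically assigning tasks to processing elements" can look like, the sketch below computes chunk sizes under guided self-scheduling, one well-known DLS technique. The abstract does not say which DLS techniques were studied, so the choice of technique here is an assumption, and the function name `guided_chunks` and its parameters are hypothetical. Nothing in it uses SimGrid.

```python
import math


def guided_chunks(total_iterations, num_workers):
    """Yield successive chunk sizes under guided self-scheduling.

    Assumption for illustration: each time a worker asks for work it
    receives ceil(remaining / num_workers) loop iterations, so chunks
    start large and shrink as the loop nears completion.
    """
    remaining = total_iterations
    while remaining > 0:
        chunk = math.ceil(remaining / num_workers)
        yield chunk
        remaining -= chunk


# Hypothetical workload: 100 loop iterations shared by 4 workers.
chunks = list(guided_chunks(100, 4))
print(chunks)
```

    Early requests receive large chunks, which keeps scheduling overhead low, while the small chunks near the end let faster workers absorb leftover work and even out finishing times.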

    Parallel and Distributed Simulation from Many Cores to the Public Cloud (Extended Version)

    In this tutorial paper, we first review some basic simulation concepts and then introduce parallel and distributed simulation techniques in view of some new challenges of today and tomorrow. In particular, in recent years there has been a wide diffusion of many-core architectures and we can expect this trend to continue. On the other hand, the success of cloud computing is strongly promoting the "everything as a service" paradigm. Is parallel and distributed simulation ready for these new challenges? The current approaches present many limitations in terms of usability and adaptivity: there is a strong need for new evaluation metrics and for revising the currently implemented mechanisms. In the last part of the paper, we propose a new approach based on multi-agent systems for the simulation of complex systems. It is possible to implement advanced techniques such as the migration of simulated entities in order to build mechanisms that are both adaptive and very easy to use. Adaptive mechanisms are able to significantly reduce the communication cost in parallel/distributed architectures, to implement load-balancing techniques and to cope with execution environments that are both variable and dynamic. Finally, such mechanisms will be used to build simulations on top of unreliable cloud services.
    Comment: Tutorial paper published in the Proceedings of the International Conference on High Performance Computing and Simulation (HPCS 2011). Istanbul (Turkey), IEEE, July 2011. ISBN 978-1-61284-382-

    Decentralized load balancing in heterogeneous computational grids

    With the rapid development of high-speed wide-area networks and powerful yet low-cost computational resources, grid computing has emerged as an attractive computing paradigm. The space limitations of conventional distributed systems can thus be overcome, allowing distributed jobs to fully exploit under-utilised computing resources in every region of the world. Workload and resource management are key grid services at the service level of grid software infrastructure, where issues of load balancing represent a common concern for most grid infrastructure developers. Although these are established research areas in parallel and distributed computing, grid computing environments present a number of new challenges, including large-scale computing resources, heterogeneous computing power, the autonomy of organisations hosting the resources, uneven job-arrival patterns among grid sites, considerable job transfer costs, and considerable communication overhead involved in capturing the load information of sites. This dissertation focuses on designing solutions for load balancing in computational grids that can cater for the unique characteristics of grid computing environments. To explore the solution space, we conducted a survey of load-balancing solutions, which enabled discussion and comparison of existing approaches and the delimiting and exploration of a portion of the solution space. A system model was developed to study the load-balancing problems in computational grid environments. In particular, we developed three decentralised algorithms for job dispatching and load balancing that use only partial information: the desirability-aware load-balancing algorithm (DA), the performance-driven desirability-aware load-balancing algorithm (P-DA), and the performance-driven region-based load-balancing algorithm (P-RB). All three are scalable, dynamic, decentralised and sender-initiated. We conducted extensive simulation studies to analyse the performance of our load-balancing algorithms. Simulation results showed that the algorithms significantly outperform preexisting decentralised algorithms that are relevant to this research.

    Research and Education in Computational Science and Engineering

    Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments---including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers---is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade.
    Comment: Major revision, to appear in SIAM Revie

    RELEASE: A High-level Paradigm for Reliable Large-scale Server Software

    Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large-scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state-of-the-art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene