
    Integrated platform to assess seismic resilience at the community level

    Due to the increasing frequency of disastrous events, building large-scale simulation models has become a challenge of major significance. Several simulation strategies and methodologies have recently been developed to explore the response of communities to natural disasters. Such models can support decision-makers during emergency operations by providing a global view of the emergency and identifying its consequences. This paper presents an integrated platform that implements a community hybrid model with real-time simulation capabilities. The platform's goal is to assess the seismic resilience and vulnerability of critical infrastructures (e.g., built environment, power grid, socio-technical network) at the urban level, taking their interdependencies into account. Finally, different seismic scenarios were applied to a large-scale virtual city model. The platform proved effective for analyzing the emergency and could be used to implement countermeasures that improve community response and overall resilience.
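
    The interdependency idea can be illustrated with a minimal sketch (not the platform's actual model): infrastructures as nodes in a directed dependency graph, with an earthquake-induced failure cascading to dependent systems. All node names and the graph itself are invented for illustration.

        from collections import deque

        # Hypothetical dependency graph: an edge u -> v means "v depends on u",
        # so a failure at u propagates downstream to v.
        dependencies = {
            "substation_A": ["hospital_1", "water_pump_2"],
            "water_pump_2": ["hospital_1"],
            "substation_B": ["water_pump_2"],
        }

        def cascade(initial_failures):
            """Breadth-first propagation of failures through dependent systems."""
            failed = set(initial_failures)
            queue = deque(initial_failures)
            while queue:
                node = queue.popleft()
                for dependent in dependencies.get(node, []):
                    if dependent not in failed:
                        failed.add(dependent)
                        queue.append(dependent)
            return failed

        print(cascade({"substation_A"}))  # {'substation_A', 'water_pump_2', 'hospital_1'}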

    Computer architecture for efficient algorithmic executions in real-time systems: New technology for avionics systems and advanced space vehicles

    Improvements and advances in computer architecture now provide innovative technology for recasting traditional sequential solutions into high-performance, low-cost parallel systems with increased performance. This paper describes research on a specialized computer architecture for the real-time algorithmic execution of an avionics guidance and control problem. A comprehensive treatment of both the hardware and software structures of a customized computer that computes guidance commands in real time, with updated estimates of target motion and time-to-go, is presented. An optimal real-time allocation algorithm was developed that maps the algorithmic tasks onto the processing elements; this allocation is based on critical-path analysis. The final stage is the design and development of hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks, and fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, task definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
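
    As a rough illustration of the allocation step (the paper's actual algorithm is not reproduced here; the task graph and timings are invented), a critical-path-based list scheduler assigns the task with the longest remaining path first to the earliest-free processing element:

        from functools import lru_cache

        tasks = {"A": 2, "B": 3, "C": 1, "D": 2}                 # task -> execution time
        succ = {"A": ["C"], "B": ["C", "D"], "C": [], "D": []}   # task-graph edges

        @lru_cache(maxsize=None)
        def critical_path(t):
            """Longest execution-time path from task t to a sink."""
            return tasks[t] + max((critical_path(s) for s in succ[t]), default=0)

        # Greedy list scheduling onto 2 processing elements; for brevity this
        # ignores precedence-ready times and only balances processor load.
        free_at = [0, 0]
        schedule = []
        for t in sorted(tasks, key=critical_path, reverse=True):
            p = free_at.index(min(free_at))          # earliest-available processor
            schedule.append((t, p, free_at[p]))      # (task, processor, start time)
            free_at[p] += tasks[t]
        print(schedule)  # [('B', 0, 0), ('A', 1, 0), ('D', 1, 2), ('C', 0, 3)]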

    Efficient Parallel Reinforcement Learning Framework using the Reactor Model

    Parallel Reinforcement Learning (RL) frameworks are essential for mapping RL workloads to multiple computational resources, allowing faster generation of samples, estimation of values, and policy improvement. These computational paradigms require a seamless integration of training, serving, and simulation workloads. Existing frameworks, such as Ray, do not manage this orchestration efficiently, especially in RL tasks that demand intensive input/output and synchronization between actors on a single node. In this study, we propose a solution implementing the reactor model, which constrains a set of actors to a fixed communication pattern. This allows the scheduler to eliminate the work needed for synchronization, such as acquiring and releasing locks for each actor or sending and processing coordination-related messages. Our framework, Lingua Franca (LF), a coordination language based on the reactor model, also supports true parallelism in Python and provides a unified interface that lets users automatically generate dataflow graphs for RL tasks. Compared to Ray on a single-node multi-core compute platform, LF achieves 1.21x and 11.62x higher simulation throughput in OpenAI Gym and Atari environments, reduces the average training time of synchronized parallel Q-learning by 31.2%, and accelerates multi-agent RL inference by 5.12x.
    Comment: 10 pages, 11 figures
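
    A minimal sketch of the reactor idea (this is not the Lingua Franca API; class and method names are invented): components declare their connections once, so the dispatch order is statically known and no per-message locking is needed.

        # Reactors with a fixed communication topology, wired once at composition
        # time; reactions then run in a statically known order without locks.
        class Reactor:
            def __init__(self, name):
                self.name = name
                self.subscribers = []       # fixed after composition

            def connect(self, other):
                self.subscribers.append(other)

        class Doubler(Reactor):
            def react(self, value):
                for s in self.subscribers:  # topology is fixed; no lock needed
                    s.react(value * 2)

        class Printer(Reactor):
            def react(self, value):
                print(f"{self.name} received {value}")

        # Wire a fixed dataflow graph once, then drive it.
        d, p = Doubler("doubler"), Printer("sink")
        d.connect(p)
        d.react(21)                         # prints: sink received 42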

    Massive parallelization of branching algorithms

    Optimization and search problems are often NP-complete, and brute-force techniques must typically be employed to find exact solutions. Problems such as clustering genes in bioinformatics or finding optimal routes in delivery networks can be solved in exponential time using recursive branching strategies. Nevertheless, these algorithms become impractical above certain instance sizes due to the large number of scenarios to explore, and parallelization techniques are necessary to improve performance. In previous works, centralized and decentralized techniques have been implemented to scale up parallelism in branching algorithms while attempting to reduce communication overhead, which plays a significant role in massively parallel implementations due to the message passing across processes. Our work consists of the development of a fully generic C++ library, named GemPBA, to speed up almost any branching algorithm with massive parallelization, along with a novel and simple dynamic load balancing tool that reduces the number of passed messages by sending high-priority tasks first. Our approach uses a hybrid centralized-decentralized strategy, in which a central process assigns worker roles through messages of a few bits, so that tasks do not need to pass through a central processor. Also, a working processor spawns new tasks if and only if there are processors available to receive them, guaranteeing their transfer and thereby notably decreasing communication overhead. We performed our experiments on the Minimum Vertex Cover problem, with remarkable results: even the toughest DIMACS graphs could be solved with a simple MVC algorithm.
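
    The spawn rule can be sketched as follows (a sequential stand-in, not GemPBA's C++ API; the idle-worker counter simulates the central process's bookkeeping): a branch is offloaded only when an idle worker is known to exist, otherwise it runs locally.

        # Branch-and-reduce Minimum Vertex Cover with GemPBA-style spawning:
        # offload a branch only if an idle worker can receive it.
        idle_workers = 2

        def mvc(graph, cover):
            """Branch on an uncovered edge: take either endpoint into the cover."""
            global idle_workers
            edge = next(((u, v) for u in graph for v in graph[u]), None)
            if edge is None:
                return cover                        # all edges covered
            results = []
            for pick in edge:
                sub = {a: {b for b in nbrs if b != pick}
                       for a, nbrs in graph.items() if a != pick}
                if idle_workers > 0:
                    idle_workers -= 1               # branch would be sent away
                    results.append(mvc(sub, cover | {pick}))
                    idle_workers += 1               # worker reports back as idle
                else:
                    results.append(mvc(sub, cover | {pick}))  # run locally
            return min(results, key=len)

        g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
        print(sorted(mvc(g, set())))                # [2, 3]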

    Representation of distribution networks of ships using graph-theory

    CETENA S.p.A., SISSA (International School for Advanced Studies) and Lloyd's Register (a classification society) have recently been involved in a challenge aimed at developing smart algorithms capable of evaluating the effect of different failure modes (caused by a fire or a flooding) on the systems of passenger ships, in order to improve the design of new passenger ships [1]. Considering that a failure may cause serious harm both to the vessel and to human lives, the goal of this project is to evaluate the best reconfiguration of current ship plants after each casualty scenario so as to guarantee the minimal functioning requirements. This implies a continuous cross-check activity (design against installation) that follows the whole ship construction process. The urgency of this work is motivated by the necessity to meet the International Maritime Organization's (IMO) Safety of Life at Sea (SOLAS) design prescriptions defined in the Safe Return to Port (SRtP) regulations [2]. According to these criteria, a vessel should be able to safely return to port under its own propulsion after an adverse event that does not exceed any of the casualty thresholds and criteria imposed by the regulations. Thus, the identification of all possible failure modes and their propagation through the on-board systems has become a task of paramount importance for the proper design of the ship's systems against failure events. Currently, in accordance with IMO MSC.1/Circ.1369 [3], CETENA produces the Operating Manuals that allow the crew to reconfigure the essential systems after an SRtP casualty so as to bring the ship to a port with adequate comfort and safety standards. However, the ship can be operated differently from what was planned at the design stage, and in such scenarios the present static Operational Manuals can be a limitation. To be effective during emergency operation, Operational Manuals must be dynamic, providing interactive information and guidance to crew members about the reconfiguration of the ship and the recovery of her functions based on the system configuration at the moment of the casualty. The focus of this work is the study of domino effects triggered by fire or flooding casualties in passenger ships, in order to provide the crew with a tool that speeds up and facilitates decision-making when choices have to be made to optimize the ship's residual capability after a casualty. The framework of this study may be extended to other types of domino escalation.
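
    The graph-theoretic core can be sketched as follows (an invented toy network, not the actual ship model): systems are nodes, distribution lines are edges, and a casualty removes components; the reconfiguration question is whether every essential consumer still reaches a source.

        # Toy ship power-distribution network as an undirected graph.
        edges = [("genset_1", "switchboard"), ("genset_2", "switchboard"),
                 ("switchboard", "propulsion"), ("switchboard", "steering")]
        sources = {"genset_1", "genset_2"}
        essential = {"propulsion", "steering"}

        def reachable(alive_edges, start):
            """Depth-first search over the surviving network."""
            adj = {}
            for a, b in alive_edges:
                adj.setdefault(a, set()).add(b)
                adj.setdefault(b, set()).add(a)
            seen, stack = {start}, [start]
            while stack:
                for n in adj.get(stack.pop(), ()):
                    if n not in seen:
                        seen.add(n)
                        stack.append(n)
            return seen

        def survives(failed):
            """True if every essential consumer still reaches some source."""
            alive = [(a, b) for a, b in edges
                     if a not in failed and b not in failed]
            return all(any(c in reachable(alive, s) for s in sources - failed)
                       for c in essential)

        print(survives({"genset_1"}))     # True: genset_2 still feeds everything
        print(survives({"switchboard"}))  # False: a single point of failure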

    Evaluation of Classical Inter-Process Communication Problems in Parallel Programming Languages

    For the past several years it has been generally believed that parallel programming is the future of computing, owing to its speed and vastly superior performance compared with classical sequential programming. However, how sure are we that this is the case? Despite this presumed superiority, parallel programs are often run on single-processor machines, making the parallelism almost virtual. In this case, does parallel programming still remain superior? The purpose of this document is to research and analyze the performance, in both storage and speed, of three parallel-programming libraries: OpenMP, Open MPI, and Pthreads, along with a few hybrids obtained by combining two of these three libraries. These analyses are applied to three classical multi-process synchronization problems: Dining Philosophers, Producers-Consumers, and Sleeping Barbers.
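
    For reference, the Producers-Consumers problem in its simplest form (a Python threading sketch, not one of the benchmarked OpenMP/Open MPI/Pthreads implementations): a bounded buffer forces the two sides to synchronize.

        import queue
        import threading

        buf = queue.Queue(maxsize=4)   # bounded buffer forces synchronization
        SENTINEL = None

        def producer(n):
            for i in range(n):
                buf.put(i)             # blocks while the buffer is full
            buf.put(SENTINEL)          # signal the consumer to stop

        def consumer():
            while (item := buf.get()) is not SENTINEL:
                print(f"consumed {item}")

        t1 = threading.Thread(target=producer, args=(8,))
        t2 = threading.Thread(target=consumer)
        t1.start(); t2.start()
        t1.join(); t2.join()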

    Exascale Deep Learning for Climate Analytics

    We extract pixel-level masks of extreme weather patterns using variants of the Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5,300 P100 GPUs with a sustained throughput of 21.0 PF/s and a parallel efficiency of 79.0%. DeepLabv3+ scales up to 27,360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor Cores, a half-precision version of the DeepLabv3+ network achieves a peak throughput of 1.13 EF/s and a sustained throughput of 999.0 PF/s.
    Comment: 12 pages, 5 tables, 4 figures, Supercomputing Conference, November 11-16, 2018, Dallas, TX, US
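
    A back-of-envelope check of the reported scaling numbers (the per-GPU rates below are implied by the paper's figures, not separately measured): parallel efficiency relates sustained throughput to GPU count times single-GPU throughput.

        # efficiency = sustained / (gpus * per_gpu)  =>  per_gpu = sustained / (gpus * efficiency)
        for name, pf, gpus, eff in [("Tiramisu/P100", 21.0, 5300, 0.790),
                                    ("DeepLabv3+/V100", 325.8, 27360, 0.907)]:
            per_gpu_tf = pf * 1000 / (gpus * eff)   # PF/s -> TF/s per GPU
            print(f"{name}: ~{per_gpu_tf:.1f} TF/s per GPU")
        # Tiramisu/P100: ~5.0 TF/s per GPU
        # DeepLabv3+/V100: ~13.1 TF/s per GPU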