2,233 research outputs found

    Improving the performance of water distribution systems’ simulation on multicore systems

    The final publication is available at Springer via http://dx.doi.org/10.1007/s11227-015-1607-5. Hydraulic solvers for the simulation of flows and pressures in water distribution systems (WDS) are used extensively, and their computational performance is key when considering optimization problems. This paper presents an approach to speed up the hydraulic solver using OpenMP with two efficient methods for WDS simulation. The paper identifies the different tasks carried out in the simulation, shows their contribution to the execution time, and selects the target tasks for parallelization. After describing the algorithms for the selected tasks, parallel OpenMP versions are derived, with emphasis on the task of linear system update. Results are presented for four different large WDS models, showing considerable reduction in computing time. This work has been partially supported by Ministerio de Economia y Competitividad from Spain, under the project TEC2012-38142-C04-01, and by project PROMETEO FASE II 2014/003 of Generalitat Valenciana. Alvarruiz Bermejo, F.; Martínez Alzamora, F.; Vidal Maciá, AM. (2016). Improving the performance of water distribution systems’ simulation on multicore systems. Journal of Supercomputing. 1-13. https://doi.org/10.1007/s11227-015-1607-5

    Performance Enhancement of Multicore Architecture

    Multicore processors integrate several cores on a single chip. The fixed architecture of multicore platforms often fails to accommodate the inherently diverse requirements of different applications. The permanent need to enhance the performance of multicore architectures motivates the development of a dynamic architecture. To address this issue, this paper presents new algorithms for thread selection in the fetch stage. Moreover, this paper presents three new fetch stage policies, EACH_LOOP_FETCH, INC-FETCH, and WZ-FETCH, based on the Ordinary Least Squares (OLS) regression method. These new fetch policies differ in thread selection time, which is represented by instruction count and window size. Furthermore, the multicore simulation tool, multi2sim, is adapted to cope with dynamic multicore processor design by adding a dynamic feature to the thread selection policy in the fetch stage. SPLASH2, a suite of parallel scientific workloads, has been used to validate the proposed adaptation of multi2sim. Intensive simulated experiments have been conducted, and the obtained results show that remarkable performance enhancements have been achieved in terms of execution time and number of instructions per second, with fewer broadcast operations compared to the typical algorithm.

    Graphite: A Distributed Parallel Simulator for Multicores

    This paper introduces the open-source Graphite distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development for future processors. Several techniques are used to achieve this performance, including direct execution, multi-machine distribution, analytical modeling, and lax synchronization. Graphite is capable of accelerating simulations by leveraging several machines. It can distribute simulation of an off-the-shelf threaded application across a cluster of commodity Linux machines with no modification to the source code. It does this by providing a single, shared address space and consistent single-process image across machines. Graphite is designed to be a simulation framework, allowing different component models to be easily replaced to either model different architectures or trade off accuracy for performance. We evaluate Graphite from a number of perspectives and demonstrate that it can simulate target architectures containing over 1000 cores on ten 8-core servers. Performance scales well as more machines are added, with near linear speedup in many cases. Simulation slowdown is as low as 41x versus native execution for some applications. The Graphite infrastructure and existing models will be released as open-source software to allow the community to simulate their own architectures and extend and improve the framework.

    Wireless Interconnects for Intra-chip & Inter-chip Transmission

    With the emergence of the Internet of Things and the information revolution, the demand for high performance computing systems is increasing. The copper interconnects inside computing chips have evolved into a sophisticated network of interconnects known as a Network on Chip (NoC), comprising routers, switches, and repeaters, just like computer networks. When a network on chip is implemented on a large scale, as in Multicore Multichip (MCMC) systems for High Performance Computing (HPC), the length of the interconnects increases, and so do problems such as power dissipation, interconnect delays, clock synchronization, and electrical noise. In this thesis, wireless interconnects are chosen as the substitute for wired copper interconnects. Wireless interconnects offer easy integration with CMOS fabrication and chip packaging. Using wireless interconnects working in the unlicensed mm-wave band (57-64 GHz), data rates in the Gbps range can be achieved. This thesis presents a study of transmission between zigzag antennas as wireless interconnects for MCMC systems and 3D ICs. For MCMC systems, a four-chip, 16-core model is analyzed with only four wireless interconnects in three configurations with different antenna orientations and locations. Return loss and transmission coefficients are simulated in ANSYS HFSS. Moreover, wireless interconnects are designed, fabricated, and tested on a 6-inch silicon wafer with a resistivity of 55 Ω-cm using a basic standard CMOS process. The wireless interconnects are designed to work at 30 GHz using ANSYS HFSS. The fabricated antennas resonate around 20 GHz with a return loss of less than -10 dB. The transmission coefficient between an antenna pair within a 20 mm x 20 mm silicon die is found to vary between -45 dB and -55 dB. Furthermore, the wireless interconnect approach is extended to 3D ICs, where the wireless interconnects are implemented as zigzag antennas. This thesis extends the work of analyzing wireless interconnects in 3D ICs with different configurations of antenna orientations and coolants. The return loss and transmission coefficients are simulated using ANSYS HFSS.

    Modelling fungal colonies and communities: challenges and opportunities

    This contribution, based on a Special Interest Group session held during IMC9, focuses on physiologically based models of filamentous fungal colony growth and interactions. Fungi are known to be an important component of ecosystems, in terms of colony dynamics and interactions within and between trophic levels. We outline some of the essential components necessary to develop a fungal ecology: a mechanistic model of fungal colony growth and interactions, where observed behaviour can be linked to underlying function; a model of how fungi can cooperate at larger scales; and novel techniques both for exploring quantitatively the scales at which fungi operate and for addressing the computational challenges arising from this highly detailed quantification. We also propose a novel application area for fungi which may provide alternate routes for supporting scientific study of colony behaviour. This synthesis offers new potential to explore fungal community dynamics and the impact on ecosystem functioning.

    Particle methods parallel implementations by GP-GPU strategies

    This paper outlines the problems found in the parallelization of SPH (Smoothed Particle Hydrodynamics) algorithms using Graphics Processing Units. Results of several parallel GPU implementations are shown in terms of speed-up and scalability compared to the sequential CPU codes. The most problematic stage in the GPU-SPH algorithms is the one responsible for locating neighboring particles and building the vectors where this information is stored, since these specific algorithms raise many difficulties for data-level parallelization. Because neighbor location using linked lists does not show enough data-level parallelism, two new approaches have been proposed to minimize bank conflicts in the writing and subsequent reading of the neighbor lists. The first strategy proposes an efficient coordination between CPU and GPU, using GPU algorithms for those stages that allow a straightforward parallelization, and sequential CPU algorithms for those instructions that involve some kind of vector reduction. This coordination provides a relatively orderly reading of the neighbor lists in the interactions stage, achieving a speed-up factor of x47 in this stage. However, since the construction of the neighbor lists is quite expensive, the overall speed-up achieved is x41. The second strategy seeks to maximize the use of the GPU in the neighbor location process by executing a specific vector sorting algorithm that allows some data-level parallelism. Although this strategy has succeeded in improving the speed-up of the neighbor location stage, the global speed-up of the interactions stage falls, due to inefficient reading of the neighbor vectors. Some changes to these strategies are proposed, aimed at maximizing the computational load of the GPU and using the GPU texture units, in order to reach the maximum speed-up for such codes. Different practical applications have been added to the mentioned GPU codes. First, the classical dam-break problem is studied. Second, the wave impact of the sloshing fluid contained in LNG vessel tanks is also simulated as a practical example of particle methods.
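
The sorting-based neighbor location mentioned in the abstract above can be illustrated with a small sequential sketch. This is not the authors' GPU code: it is a generic cell-sorting scheme in plain Python, assuming 2D particles and a uniform grid whose spacing equals the interaction radius `h`. The function names `build_sorted_cells` and `neighbors` are hypothetical and chosen only for this illustration.

```python
from math import floor


def build_sorted_cells(points, h):
    """Sort particle indices by grid cell and record each cell's slice.

    Assumption: 2D points and a uniform grid of spacing h.
    """
    # Grid cell key of every particle.
    keys = [(floor(x / h), floor(y / h)) for x, y in points]
    # After sorting by key, particles of the same cell are contiguous.
    order = sorted(range(len(points)), key=lambda i: keys[i])
    cell_range = {}
    for pos, i in enumerate(order):
        start, _ = cell_range.get(keys[i], (pos, pos))
        cell_range[keys[i]] = (start, pos + 1)
    return order, cell_range


def neighbors(i, points, h, order, cell_range):
    """Return the sorted indices j != i whose distance to i is below h."""
    xi, yi = points[i]
    cx, cy = floor(xi / h), floor(yi / h)
    found = []
    # Only the particle's own cell and the 8 adjacent cells can hold neighbors.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            start, end = cell_range.get((cx + dx, cy + dy), (0, 0))
            for j in order[start:end]:
                xj, yj = points[j]
                if j != i and (xi - xj) ** 2 + (yi - yj) ** 2 < h * h:
                    found.append(j)
    return sorted(found)


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.15), (0.95, 0.1), (1.05, 0.1), (3.0, 3.0)]
    order, cell_range = build_sorted_cells(pts, 1.0)
    print(neighbors(0, pts, 1.0, order, cell_range))
```

Because each cell's particles occupy one contiguous slice of the sorted index array, the reads in the inner loop are sequential, which is the property a data-parallel implementation would try to exploit.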