869 research outputs found

    Pipelining the Fast Multipole Method over a Runtime System

    Fast Multipole Methods (FMM) are a fundamental operation for the simulation of many physical problems. The high-performance design of such methods usually requires carefully tuning the algorithm for both the targeted physics and the hardware. In this paper, we propose a new approach that achieves high performance across architectures. Our method consists of expressing the FMM algorithm as a task flow and employing a state-of-the-art runtime system, StarPU, to process the tasks on the different processing units. We carefully design the task flow, the mathematical operators, their Central Processing Unit (CPU) and Graphics Processing Unit (GPU) implementations, as well as the scheduling schemes. We compute potentials and forces of 200 million particles in 48.7 seconds on a homogeneous 160-core SGI Altix UV 100, and of 38 million particles in 13.34 seconds on a heterogeneous 12-core Intel Nehalem processor enhanced with 3 Nvidia M2090 Fermi GPUs. Comment: No. RR-7981 (2012)
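
    The idea of expressing the FMM as a task flow can be illustrated with a toy dependency graph. The sketch below is not StarPU code (StarPU is a C library) and is not taken from the paper: it assumes a coarse, operator-level view of the standard FMM passes (P2M, M2M, M2L, L2L, L2P, P2P), and the names `FMM_DEPENDENCIES` and `ready_waves` are hypothetical. It only shows how a runtime could discover which operators are ready to run once their inputs are complete.

```python
from graphlib import TopologicalSorter

# Assumed operator-level dependencies: each key lists the operators
# whose results it needs. A real task flow has one task per tree cell,
# not one per operator.
FMM_DEPENDENCIES = {
    "P2M": set(),            # particles -> leaf multipole expansions
    "P2P": set(),            # direct near-field interactions, independent
    "M2M": {"P2M"},          # upward pass: child -> parent multipoles
    "M2L": {"P2M", "M2M"},   # multipole -> local transfer
    "L2L": {"M2L"},          # downward pass: parent -> child locals
    "L2P": {"M2L", "L2L"},   # leaf local expansions -> particles
}


def ready_waves(dependencies):
    """Return successive groups of operators whose inputs are all complete."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()   # everything runnable right now
        waves.append(sorted(ready))
        sorter.done(*ready)          # mark the whole wave as finished
    return waves


if __name__ == "__main__":
    for step, wave in enumerate(ready_waves(FMM_DEPENDENCIES)):
        print(step, wave)
```

    In this simplified graph the near-field P2P work is available from the start, alongside P2M; independent work of this kind is what a scheduler can place on different processing units.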

    A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems

    Among the algorithms that are likely to play a major role in future exascale computing, the fast multipole method (FMM) appears as a rising star. Our recent work showed scaling of an FMM on GPU clusters, with problem sizes on the order of billions of unknowns. That work led to an extremely parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This paper reports on a campaign of performance tuning and scalability studies using multi-core CPUs on the Kraken supercomputer. All kernels in the FMM were parallelized using OpenMP, and a test using 10^7 particles randomly distributed in a cube showed 78% efficiency on 8 threads. Tuning of the particle-to-particle kernel using SIMD instructions resulted in a 4x speed-up of the overall algorithm on single-core tests with 10^3 - 10^7 particles. Parallel scalability was studied in both strong and weak scaling. The strong scaling test used 10^8 particles and resulted in 93% parallel efficiency on 2048 processes for the non-SIMD code and 54% for the SIMD-optimized code (which was still 2x faster). The weak scaling test used 10^6 particles per process and resulted in 72% efficiency on 32,768 processes, with the largest calculation taking about 40 seconds to evaluate more than 32 billion unknowns. This work builds up evidence for our view that FMM is poised to play a leading role in exascale computing, and we end the paper with a discussion of the features that make it a particularly favorable algorithm for the emerging heterogeneous and massively parallel architectural landscape.
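
    The strong and weak scaling efficiencies quoted above are usually computed from wall-clock times as in the sketch below. These are the conventional definitions, which the abstract does not spell out, so treating them as the paper's exact method is an assumption; the function names and the timings in the usage lines are hypothetical and are not results from the paper.

```python
def strong_scaling_efficiency(t_base, p_base, t_p, p):
    """Fixed total problem size: ideal time falls as 1/p.

    Efficiency is measured against a baseline run on p_base processes.
    """
    return (t_base * p_base) / (t_p * p)


def weak_scaling_efficiency(t_base, t_p):
    """Fixed problem size per process: ideal time stays constant."""
    return t_base / t_p


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    print(strong_scaling_efficiency(t_base=100.0, p_base=1, t_p=25.0, p=8))
    print(weak_scaling_efficiency(t_base=30.0, t_p=40.0))
```

    Under these definitions a faster kernel can lower the reported efficiency while still reducing total run time, which is consistent with the SIMD-optimized code showing 54% efficiency yet running 2x faster.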

    SynergyGrids: blockchain-supported distributed microgrid energy trading

    Growing intelligent cities are witnessing an increasing amount of local energy generation through renewable energy resources. Energy trade among the local energy generators (aka prosumers) and consumers can reduce the cost of energy consumption and the dependency on conventional energy resources, not to mention the environmental, economic, and societal benefits. However, these local energy sources might not be enough to fulfill energy consumption demands. A hybrid approach, where consumers can buy energy both from local prosumers (who generate energy) and from prosumers in other locations, is essential. A centralized system can be used to manage this energy trading, but it faces several security issues and increases centralized development cost. In this paper, a hybrid energy trading system coupled with a smart contract, named SynergyGrids, is proposed as a solution that reduces the average cost of energy and the load on the utility grids. To the best of our knowledge, this work is the first attempt to create a hybrid energy trading platform over a smart contract for energy demand prediction. An hourly energy data set has been used for testing and validation. The trading system shows a 17.8% decrease in energy cost for consumers and a 76.4% decrease in load on the utility grids when compared with its counterparts.

    Universal Metering Device for LTE Mobile Networks

    This diploma thesis investigates the possibilities for realizing a universal metering device in an LTE network, and describes the basics of M2M communication and trends in the IoT. The aim is to find an optimal hardware platform on which an LTE modem can be configured and information about the state and quality of the network connection can be gathered. A series of tests is then run to measure the parameters of the network, and the results are processed for display on a web server as a set of graphs.

    Reliable machine-to-machine multicast services with multi-radio cooperative retransmissions

    The final publication is available at Springer via http://dx.doi.org/10.1007/s11036-015-0575-6. The 3GPP is working towards the definition of service requirements and technical solutions to provide support for energy-efficient Machine Type Communications (MTC) in the forthcoming generations of cellular networks. One of the envisioned solutions consists of applying group management policies to clusters of devices in order to reduce control signaling and improve energy efficiency, e.g., multicast Over-The-Air (OTA) firmware updates. In this paper, a Multi-Radio Cooperative Retransmission Scheme is proposed to efficiently carry out multicast transmissions in MTC networks, both reducing control signaling and improving energy efficiency. The proposal can be executed in networks composed of devices equipped with multiple radio interfaces, which enable them to connect to both a cellular access network, e.g., LTE, and a short-range MTC area network, e.g., Low-Power Wi-Fi or ZigBee, as foreseen by the MTC architecture defined by ETSI. The main idea is to carry out retransmissions over the M2M area network upon error in the main cellular link. This yields a reduction in both the traffic load over the cellular link and the energy consumption of the devices. Computer-based simulations with ns-3 have been conducted to analyze the performance of the proposed scheme in terms of energy consumption and to assess its superior performance compared to non-cooperative retransmission schemes, thus validating its suitability for energy-constrained MTC applications. Peer Reviewed. Postprint (author's final draft)

    Trusted, Decentralized and Blockchain-Based M2M Application Service Provision

    Decentralized M2M service platforms enable the integration of end-user-based M2M applications and end-user-located M2M resources without the use of central entities or components in the system architecture. Sharing end-user-based M2M applications with other users who are part of an M2M community allows the creation of new and complex M2M applications. However, a fully decentralized system often leads to several trust issues regarding the behavior of end-users and M2M applications. Trust relationships among the nodes are a powerful measure to overcome possible limitations of decentralized M2M service platforms and to replace the missing control authority. Therefore, this publication proposes a novel concept for trusted M2M application service provision. Moreover, it introduces the integration of blockchain elements and trust evaluation techniques to optimize the M2M application service provision. A trust consensus protocol is integrated in order to secure the decision-making process among the stakeholders, which optimizes several aspects, such as peer joining, service registration, and application configuration.