Pipelining the Fast Multipole Method over a Runtime System
Fast Multipole Methods (FMM) are a fundamental operation in the simulation
of many physical problems. The high-performance design of such methods usually
requires carefully tuning the algorithm for both the targeted physics and the
hardware. In this paper, we propose a new approach that achieves high
performance across architectures. Our method consists of expressing the FMM
algorithm as a task flow and employing a state-of-the-art runtime system,
StarPU, in order to process the tasks on the different processing units. We
carefully design the task flow, the mathematical operators, their Central
Processing Unit (CPU) and Graphics Processing Unit (GPU) implementations, as
well as scheduling schemes. We compute potentials and forces of 200 million
particles in 48.7 seconds on a homogeneous 160-core SGI Altix UV 100 and of 38
million particles in 13.34 seconds on a heterogeneous 12-core Intel Nehalem
processor enhanced with 3 Nvidia M2090 Fermi GPUs.
Comment: No. RR-7981 (2012)
A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems
Among the algorithms that are likely to play a major role in future exascale
computing, the fast multipole method (FMM) appears as a rising star. Our
recent work showed scaling of an FMM on GPU clusters, with problem
sizes on the order of billions of unknowns. That work led to an extremely
parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This
paper reports on a campaign of performance tuning and scalability studies
using multi-core CPUs, on the Kraken supercomputer. All kernels in the FMM were
parallelized using OpenMP, and a test using 10^7 particles randomly distributed
in a cube showed 78% efficiency on 8 threads. Tuning of the
particle-to-particle kernel using SIMD instructions resulted in 4x speed-up of
the overall algorithm on single-core tests with 10^3 - 10^7 particles. Parallel
scalability was studied in both strong and weak scaling. The strong scaling
test used 10^8 particles and resulted in 93% parallel efficiency on 2048
processes for the non-SIMD code and 54% for the SIMD-optimized code (which was
still 2x faster). The weak scaling test used 10^6 particles per process, and
resulted in 72% efficiency on 32,768 processes, with the largest calculation
taking about 40 seconds to evaluate more than 32 billion unknowns. This work
builds up evidence for our view that FMM is poised to play a leading role in
exascale computing, and we end the paper with a discussion of the features that
make it a particularly favorable algorithm for the emerging heterogeneous and
massively parallel architectural landscape.
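The scaling figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and not taken from the paper. It assumes the usual definition of parallel efficiency, E = T(1) / (p * T(p)), with each code measured against its own baseline. It also assumes the 4x single-core SIMD speed-up carries over to the 10^8-particle strong-scaling run. The function name `relative_speed` is hypothetical.

```python
# Assumption: efficiency E = T(1) / (p * T(p)), each code against its own baseline.
# Then T_plain(p) / T_simd(p) = (T_plain(1) / T_simd(1)) * E_simd / E_plain.

def relative_speed(serial_speedup, eff_tuned, eff_base):
    """Speed of the tuned code relative to the baseline at the same process count."""
    return serial_speedup * eff_tuned / eff_base

# Figures from the abstract: 4x single-core SIMD speed-up,
# 54% (SIMD) and 93% (non-SIMD) efficiency on 2048 processes.
simd_vs_plain = relative_speed(4.0, 0.54, 0.93)

# Weak scaling: 10^6 particles per process on 32,768 processes.
weak_scaling_unknowns = 10**6 * 32_768

print(f"SIMD vs non-SIMD at 2048 processes: {simd_vs_plain:.2f}x")
print(f"Weak-scaling problem size: {weak_scaling_unknowns:,} particles")
```

Under these assumptions the SIMD code comes out roughly 2.3 times faster at 2048 processes despite its lower efficiency, in line with the "still 2x faster" remark. The weak-scaling run works out to about 32.8 billion particles, matching "more than 32 billion unknowns".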
SynergyGrids: blockchain-supported distributed microgrid energy trading
Growing intelligent cities are witnessing an increasing amount of local energy generation from renewable energy resources. Energy trade among local energy generators (also known as prosumers) and consumers can reduce the cost of energy consumption and the dependency on conventional energy resources, in addition to bringing environmental, economic, and societal benefits. However, these local energy sources might not be enough to meet energy consumption demands. A hybrid approach, in which consumers can buy energy both from local prosumers and from prosumers in other locations, is essential. A centralized system can be used to manage this energy trading, but it faces several security issues and increases centralized development cost. In this paper, a hybrid energy trading system coupled with a smart contract, named SynergyGrids, is proposed as a solution that reduces the average cost of energy and the load on the utility grids. To the best of our knowledge, this work is the first attempt to create a hybrid energy trading platform over a smart contract for energy demand prediction. An hourly energy data set has been used for testing and validation. The trading system shows a 17.8% decrease in energy cost for consumers and a 76.4% decrease in load on the utility grids when compared with its counterparts.
Universal Metering Device for LTE Mobile Networks
This diploma thesis examines the possibilities of realizing a universal metering device in the LTE network. It describes the basics of M2M communication and the trends in the IoT. The aim of the work is to find an optimal hardware platform on which the LTE modem can be configured and from which information about the state and quality of the network connection can be gathered. A series of tests is then carried out to measure the parameters of the network, and the results are processed for presentation on a web server as a set of graphs.
Reliable machine-to-machine multicast services with multi-radio cooperative retransmissions
The final publication is available at Springer via http://dx.doi.org/10.1007/s11036-015-0575-6
The 3GPP is working towards the definition of service requirements and technical solutions to provide support for energy-efficient Machine Type Communications (MTC) in the forthcoming generations of cellular networks. One of the envisioned solutions consists of applying group management policies to clusters of devices in order to reduce control signaling and improve energy efficiency, e.g., for multicast Over-The-Air (OTA) firmware updates. In this paper, a Multi-Radio Cooperative Retransmission Scheme is proposed to efficiently carry out multicast transmissions in MTC networks, both reducing control signaling and improving energy efficiency. The proposal can be executed in networks composed of devices equipped with multiple radio interfaces, which enable them to connect to both a cellular access network, e.g., LTE, and a short-range MTC area network, e.g., Low-Power Wi-Fi or ZigBee, as foreseen by the MTC architecture defined by ETSI. The main idea is to carry out retransmissions over the M2M area network upon error in the main cellular link. This yields a reduction in both the traffic load over the cellular link and the energy consumption of the devices. Computer-based simulations with ns-3 have been conducted to analyze the performance of the proposed scheme in terms of energy consumption and to assess its superior performance compared to non-cooperative retransmission schemes, thus validating its suitability for energy-constrained MTC applications.
Peer Reviewed
Postprint (author's final draft)
Trusted, Decentralized and Blockchain-Based M2M Application Service Provision
Decentralized M2M service platforms enable the integration of end-user-based M2M applications and end-user-located M2M resources without the use of central entities or components in the system architecture. Sharing end-user-based M2M applications with other users who are part of an M2M community allows the creation of new and complex M2M applications. However, a fully decentralized system often leads to several trust issues regarding the behavior of end-users and M2M applications. A powerful measure to overcome possible limitations of decentralized M2M service platforms and to replace the missing control authority is to establish trust relationships among the nodes. Therefore, this publication proposes a novel concept for trusted M2M application service provision. Moreover, it introduces the integration of blockchain elements and trust evaluation techniques to optimize the M2M application service provision. A trust consensus protocol is integrated in order to secure the decision-making process among the stakeholders, which optimizes several aspects, such as peer joining, service registration, and application configuration.