483,808 research outputs found

RDMA vs. RPC for implementing distributed data structures

Authors: Brock BA, Buluç A, Chen Y, Owens J, Yan J, Yelick K
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Distributed data structures are key to implementing scalable applications for scientific simulations and data analysis. In this paper we look at two implementation styles for distributed data structures: remote direct memory access (RDMA) and remote procedure call (RPC). We focus on operations that require individual accesses to remote portions of a distributed data structure, e.g., accessing a hash table bucket or distributed queue, rather than global operations in which all processors collectively exchange information. We look at the trade-offs between the two styles through microbenchmarks and a performance model that approximates the cost of each. The RDMA operations have direct hardware support in the network and therefore lower latency and overhead, while the RPC operations are more expressive but costlier and can suffer from a lack of attentiveness on the remote side. We also run experiments to compare the real-world performance of RDMA- and RPC-based data structure operations with the predicted performance to evaluate the accuracy of our model, and show that while the model does not always precisely predict running time, it allows us to choose the best implementation in the examples shown. We believe this analysis will assist developers in designing data structures that will perform well on current network architectures, as well as network architects in providing better support for this class of distributed data structures.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Distributed energy-efficient power optimization in cellular relay networks with minimum rate constraints

Authors: E. Veronica Belmega, Giacomo Bacci, Luca Sanguinetti
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

In this work, we derive a distributed power control algorithm for energy-efficient uplink transmissions in interference-limited cellular networks, equipped with either multiple or shared relays. The proposed solution is derived by modeling the mobile terminals as utility-driven rational agents that engage in a noncooperative game, under minimum-rate constraints. The theoretical analysis of the game equilibrium is used to compare the performance of the two different cellular architectures. Extensive simulations show that the shared relay concept outperforms the distributed one in terms of energy efficiency in most network configurations.

HAL-CentraleSupelec

Crossref

Archivio della Ricerca - Università di Pisa

HAL-Rennes 1

A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing

Authors: Ayachit U, Burstedde C, Carslaw HS, Cole KD, Ern A, Kaufman L, Kergaßner A, Lindgren LE, Mozaffar M, Schroeder WJ, Wohlers Associates Inc
Publication venue: Wiley
Publication date: 01/01/2019

This work introduces an innovative parallel, fully-distributed finite element framework for growing geometries and its application to metal additive manufacturing. It is well-known that virtual part design and qualification in additive manufacturing requires highly-accurate multiscale and multiphysics analyses. Only high performance computing tools are able to handle such complexity in time frames compatible with time-to-market. However, efficiency, without loss of accuracy, has rarely held the centre stage in the numerical community. Here, in contrast, the framework is designed to adequately exploit the resources of high-end distributed-memory machines. It is grounded on three building blocks: (1) hierarchical adaptive mesh refinement with octree-based meshes; (2) a parallel strategy to model the growth of the geometry; (3) state-of-the-art parallel iterative linear solvers. Computational experiments consider the heat transfer analysis at the part scale of the printing process by powder-bed technologies. After verification against a 3D benchmark, a strong-scaling analysis assesses performance and identifies major sources of parallel overhead. A third numerical example examines the efficiency and robustness of (2) in a curved 3D shape. Unprecedented parallelism and scalability were achieved in this work. Hence, this framework contributes to taking on higher complexity and/or accuracy, not only in part-scale simulations of metal or polymer additive manufacturing, but also in welding, sedimentation, atherosclerosis, or any other physical problem where the physical domain of interest grows in time.

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

Scipedia

Distributed probabilistic-data-association-based soft reception employing base station cooperation in MIMO-aided multiuser multicell systems

Authors: Hanzo Lajos, Lv Tiejun, Maunder Robert G, Yang Shaoshi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/09/2011

Intercell cochannel interference (CCI) mitigation is investigated in the context of cellular systems relying on dense frequency reuse (FR). A distributed base-station (BS)-cooperation-aided soft reception scheme using the probabilistic data association (PDA) algorithm and soft combining (SC) is proposed for the uplink of multiuser multicell MIMO systems. The realistic 19-cell hexagonal cellular model relying on unity FR is considered, where both the BSs and the mobile stations (MSs) are equipped with multiple antennas. Local-cooperation-based message passing is used instead of a global message passing chain, for the sake of reducing the backhaul traffic. The PDA algorithm is employed as a low-complexity solution for producing soft information, which facilitates the employment of SC at the individual BSs to generate the final soft decision metric. Our simulations and analysis demonstrate that, despite its low additional complexity and backhaul traffic, the proposed distributed PDA-aided SC (DPDA-SC) reception scheme significantly outperforms the conventional noncooperative benchmarkers. Furthermore, since only the index of the possible discrete value of the quantized converged soft information has to be exchanged for SC in practice, the proposed DPDA-SC scheme is relatively robust to the quantization errors of the soft information exchanged. As a beneficial result, the backhaul traffic is dramatically reduced at negligible performance degradation.

Southampton (e-Prints Soton)

Crossref