16,498 research outputs found

    Benchmarking SciDB Data Import on HPC Systems

    Full text link
    SciDB is a scalable, computational database management system that uses an array model for data storage. The array data model of SciDB makes it ideally suited for storing and managing large amounts of imaging data. SciDB is designed to support advanced analytics in database, thus reducing the need for extracting data for analysis. It is designed to be massively parallel and can run on commodity hardware in a high performance computing (HPC) environment. In this paper, we present the performance of SciDB using simulated image data. The Dynamic Distributed Dimensional Data Model (D4M) software is used to implement the benchmark on a cluster running the MIT SuperCloud software stack. A peak performance of 2.2M database inserts per second was achieved on a single node of this system. We also show that SciDB and the D4M toolbox provide more efficient ways to access random sub-volumes of massive datasets compared to the traditional approaches of reading volumetric data from individual files. This work describes the D4M and SciDB tools we developed and presents the initial performance results. This performance was achieved by using parallel inserts, a in-database merging of arrays as well as supercomputing techniques, such as distributed arrays and single-program-multiple-data programming.Comment: 5 pages, 4 figures, IEEE High Performance Extreme Computing (HPEC) 2016, best paper finalis

    On the design and implementation of broadcast and global combine operations using the postal model

    Get PDF
    There are a number of models that were proposed in recent years for message passing parallel systems. Examples are the postal model and its generalization the LogP model. In the postal model a parameter λ is used to model the communication latency of the message-passing system. Each node during each round can send a fixed-size message and, simultaneously, receive a message of the same size. Furthermore, a message sent out during round r will incur a latency of hand will arrive at the receiving node at round r + λ - 1. Our goal in this paper is to bridge the gap between the theoretical modeling and the practical implementation. In particular, we investigate a number of practical issues related to the design and implementation of two collective communication operations, namely, the broadcast operation and the global combine operation. Those practical issues include, for example, 1) techniques for measurement of the value of λ on a given machine, 2) creating efficient broadcast algorithms that get the latency hand the number of nodes n as parameters and 3) creating efficient global combine algorithms for parallel machines with λ which is not an integer. We propose solutions that address those practical issues and present results of an experimental study of the new algorithms on the Intel Delta machine. Our main conclusion is that the postal model can help in performance prediction and tuning, for example, a properly tuned broadcast improves the known implementation by more than 20%

    Massively Parallel Computing at the Large Hadron Collider up to the HL-LHC

    Full text link
    As the Large Hadron Collider (LHC) continues its upward progression in energy and luminosity towards the planned High-Luminosity LHC (HL-LHC) in 2025, the challenges of the experiments in processing increasingly complex events will also continue to increase. Improvements in computing technologies and algorithms will be a key part of the advances necessary to meet this challenge. Parallel computing techniques, especially those using massively parallel computing (MPC), promise to be a significant part of this effort. In these proceedings, we discuss these algorithms in the specific context of a particularly important problem: the reconstruction of charged particle tracks in the trigger algorithms in an experiment, in which high computing performance is critical for executing the track reconstruction in the available time. We discuss some areas where parallel computing has already shown benefits to the LHC experiments, and also demonstrate how a MPC-based trigger at the CMS experiment could not only improve performance, but also extend the reach of the CMS trigger system to capture events which are currently not practical to reconstruct at the trigger level.Comment: 14 pages, 6 figures. Proceedings of 2nd International Summer School on Intelligent Signal Processing for Frontier Research and Industry (INFIERI2014), to appear in JINST. Revised version in response to referee comment

    MPC for MPC: Secure Computation on a Massively Parallel Computing Architecture

    Get PDF
    Massively Parallel Computation (MPC) is a model of computation widely believed to best capture realistic parallel computing architectures such as large-scale MapReduce and Hadoop clusters. Motivated by the fact that many data analytics tasks performed on these platforms involve sensitive user data, we initiate the theoretical exploration of how to leverage MPC architectures to enable efficient, privacy-preserving computation over massive data. Clearly if a computation task does not lend itself to an efficient implementation on MPC even without security, then we cannot hope to compute it efficiently on MPC with security. We show, on the other hand, that any task that can be efficiently computed on MPC can also be securely computed with comparable efficiency. Specifically, we show the following results: - any MPC algorithm can be compiled to a communication-oblivious counterpart while asymptotically preserving its round and space complexity, where communication-obliviousness ensures that any network intermediary observing the communication patterns learn no information about the secret inputs; - assuming the existence of Fully Homomorphic Encryption with a suitable notion of compactness and other standard cryptographic assumptions, any MPC algorithm can be compiled to a secure counterpart that defends against an adversary who controls not only intermediate network routers but additionally up to 1/3 - ? fraction of machines (for an arbitrarily small constant ?) - moreover, this compilation preserves the round complexity tightly, and preserves the space complexity upto a multiplicative security parameter related blowup. As an initial exploration of this important direction, our work suggests new definitions and proposes novel protocols that blend algorithmic and cryptographic techniques
    • …
    corecore