
    Implementation of the barotropic vorticity equation on the MPP

    A finite difference version of the equations governing two-dimensional, non-divergent flow on a sphere is implemented and integrated on the Massively Parallel Processor (MPP). The MPP's performance is then compared with the Cyber's. The feasibility of using a massively parallel architecture to solve the hydrodynamic equations as they are used in numerical weather prediction (NWP) is described.
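    To illustrate the kind of computation involved, the sketch below integrates the barotropic vorticity equation with centered finite differences on a doubly periodic Cartesian grid rather than the paper's spherical grid; the grid size, time step, initial vortices, and the FFT-based streamfunction solve are assumptions made for this example, not details taken from the paper.

        # Minimal sketch: d(zeta)/dt = -J(psi, zeta) with laplacian(psi) = zeta,
        # centered differences in space, forward Euler in time, periodic domain.
        import numpy as np

        N = 128                       # grid points per direction (assumed)
        L = 2 * np.pi                 # domain size (assumed)
        dx = L / N
        dt = 1e-3                     # time step (assumed)

        k = np.fft.fftfreq(N, d=dx) * 2 * np.pi
        KX, KY = np.meshgrid(k, k, indexing="ij")
        K2 = KX**2 + KY**2
        K2[0, 0] = 1.0                # avoid division by zero for the mean mode

        def streamfunction(zeta):
            # invert laplacian(psi) = zeta spectrally
            psi_hat = -np.fft.fft2(zeta) / K2
            psi_hat[0, 0] = 0.0
            return np.real(np.fft.ifft2(psi_hat))

        def ddx(f):
            return (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2 * dx)

        def ddy(f):
            return (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1)) / (2 * dx)

        def step(zeta):
            psi = streamfunction(zeta)
            u, v = -ddy(psi), ddx(psi)                 # non-divergent velocity
            return zeta - dt * (u * ddx(zeta) + v * ddy(zeta))

        # initial condition: a pair of opposite-signed vortices (arbitrary)
        x = np.arange(N) * dx
        X, Y = np.meshgrid(x, x, indexing="ij")
        zeta = np.exp(-((X - 2)**2 + (Y - 3)**2)) - np.exp(-((X - 4)**2 + (Y - 3)**2))
        for _ in range(100):
            zeta = step(zeta)

    Because each grid point only needs its immediate neighbours, the difference operators map naturally onto a processor-per-point array machine such as the MPP.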

    Benchmark Test of CP-PACS for Lattice QCD

    The CP-PACS is a massively parallel computer dedicated to calculations in computational physics; it will be in operation in the spring of 1996 at the Center for Computational Physics, University of Tsukuba. In this article, we describe the architecture of the CP-PACS and report estimates of its performance for typical lattice QCD calculations. Comment: 12 pages (5 figures), PostScript file; talk presented at "QCD on Massively Parallel Computers" (Yamagata, Japan, March 16-18, 1995).
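    As a rough illustration of the kind of kernel such benchmarks time, the toy below measures the throughput of applying 3x3 complex link matrices to color vectors, the operation that dominates lattice QCD codes; the lattice volume, data layout, and flop counting are invented for the example and are not taken from the actual CP-PACS benchmark suite.

        # Toy single-node timing of an SU(3)-like matrix-vector kernel.
        import numpy as np
        import time

        V = 16**3 * 32                                      # hypothetical lattice volume
        links = np.random.rand(V, 3, 3) + 1j * np.random.rand(V, 3, 3)
        vecs  = np.random.rand(V, 3)    + 1j * np.random.rand(V, 3)

        t0 = time.perf_counter()
        out = np.einsum("sij,sj->si", links, vecs)          # one 3x3 multiply per site
        elapsed = time.perf_counter() - t0

        flops = V * 9 * 8                                   # ~8 real flops per complex multiply-add
        print(f"{flops / elapsed / 1e9:.2f} GFLOP/s (toy estimate)")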

    Massively-Parallel Feature Selection for Big Data

    We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix both in terms of rows (samples, training examples) and columns (features). By employing the concepts of p-values of conditional independence tests and meta-analysis techniques, PFBP manages to rely only on computations local to a partition while minimizing communication costs. It then employs powerful and safe (asymptotically sound) heuristics to make early, approximate decisions, such as Early Dropping of features from consideration in subsequent iterations, Early Stopping of consideration of features within the same iteration, or Early Return of the winner in each iteration. PFBP provides asymptotic guarantees of optimality for data distributions faithfully representable by a causal network (Bayesian network or maximal ancestral graph). Our empirical analysis confirms a super-linear speedup of the algorithm with increasing sample size and linear scalability with respect to the number of features and processing cores, while dominating other competitive algorithms in its class.
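    A single-machine sketch of the forward phase with Early Dropping is given below; the conditional-independence test is a Fisher-z test on partial correlations (an assumption suitable for continuous data), and the data partitioning, p-value meta-analysis, Early Stopping, Early Return, and backward phase of the real PFBP are all omitted.

        # Simplified forward selection with the Early Dropping heuristic.
        import numpy as np
        from scipy import stats

        def ci_pvalue(X, y, i, selected):
            # p-value of X[:, i] independent of y given X[:, selected]
            n = len(y)
            Z = X[:, selected] if selected else np.empty((n, 0))
            def residual(v):
                if Z.shape[1] == 0:
                    return v - v.mean()
                A = np.column_stack([np.ones(n), Z])
                beta, *_ = np.linalg.lstsq(A, v, rcond=None)
                return v - A @ beta
            r = np.corrcoef(residual(X[:, i]), residual(y))[0, 1]
            z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(selected) - 3)
            return 2 * stats.norm.sf(abs(z))

        def forward_early_drop(X, y, alpha=0.05, max_features=10):
            selected, alive = [], list(range(X.shape[1]))
            while alive and len(selected) < max_features:
                pvals = {i: ci_pvalue(X, y, i, selected) for i in alive}
                alive = [i for i in alive if pvals[i] <= alpha]   # Early Dropping
                if not alive:
                    break
                best = min(alive, key=pvals.get)                  # winner of this iteration
                selected.append(best)
                alive.remove(best)
            return selected

    Features dropped in one iteration are never reconsidered, which is what keeps the candidate set, and hence the per-iteration work, shrinking as selection proceeds.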

    Massively parallel approximate Gaussian process regression

    We explore how the big-three computing paradigms -- symmetric multi-processor (SMP), graphical processing units (GPUs), and cluster computing -- can together be brought to bear on large-data Gaussian process (GP) regression problems via a careful implementation of a newly developed local approximation scheme. Our methodological contribution focuses primarily on GPU computation, as this requires the most care and also provides the largest performance boost. However, in our empirical work we study the relative merits of all three paradigms to determine how best to combine them. The paper concludes with two case studies. One is a real-data fluid-dynamics computer experiment which benefits from the local nature of our approximation; the second is a synthetic-data example designed to find the largest design for which (accurate) GP emulation can be performed on a commensurate predictive set in under an hour. Comment: 24 pages, 6 figures, 1 table.
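    The local approximation idea can be sketched as follows: each prediction uses a small GP fit only to the training points nearest the test location. The Gaussian kernel, neighbourhood size, and plain nearest-neighbour selection below are simplifying assumptions; the paper's scheme builds its local designs greedily and offloads the dense linear algebra to GPUs, SMP threads, and cluster nodes.

        # Local approximate GP prediction: a small GP per test point.
        import numpy as np

        def gauss_kernel(A, B, lengthscale=1.0, nugget=1e-6):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            K = np.exp(-0.5 * d2 / lengthscale**2)
            if A is B:
                K += nugget * np.eye(A.shape[0])    # jitter on the training block
            return K

        def local_gp_predict(Xtrain, ytrain, xstar, n_neighbors=50, lengthscale=1.0):
            # use only the n nearest training points to the prediction location
            idx = np.argsort(((Xtrain - xstar) ** 2).sum(1))[:n_neighbors]
            Xn, yn = Xtrain[idx], ytrain[idx]
            Knn = gauss_kernel(Xn, Xn, lengthscale)
            kn = gauss_kernel(xstar[None, :], Xn, lengthscale)[0]
            mean = kn @ np.linalg.solve(Knn, yn)
            var = 1.0 - kn @ np.linalg.solve(Knn, kn)
            return mean, var

    Since every test location is handled independently, the loop over predictions parallelizes trivially across cores, GPUs, or cluster nodes, which is what makes the approach amenable to all three paradigms.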

    BLITZEN: A highly integrated massively parallel machine

    The architecture and VLSI design of a new massively parallel processing array chip are described. The BLITZEN processing element array chip, which contains 1.1 million transistors, serves as the basis for a highly integrated, miniaturized, high-performance, massively parallel machine that is currently under development. Each processing element has 1K bits of static RAM and performs bit-serial processing with functional elements for arithmetic, logic, and shifting
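    The sketch below mimics, in software, the bit-serial arithmetic such a processing element performs: operand bits stream through a one-bit full adder, with the carry as the only state held between cycles. The 16-bit word width and the helper functions are placeholders for illustration, not BLITZEN specifics.

        # Bit-serial addition: one bit of each operand per "cycle".
        def bit_serial_add(a_bits, b_bits):
            carry, out = 0, []
            for a, b in zip(a_bits, b_bits):
                out.append(a ^ b ^ carry)                     # sum bit of a 1-bit full adder
                carry = (a & b) | (carry & (a ^ b))
            out.append(carry)                                 # final carry becomes the top bit
            return out

        def to_bits(x, width=16):
            return [(x >> i) & 1 for i in range(width)]       # little-endian

        def from_bits(bits):
            return sum(b << i for i, b in enumerate(bits))

        assert from_bits(bit_serial_add(to_bits(1234), to_bits(4321))) == 5555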

    Live Demonstration: Multiplexing AER Asynchronous Channels over LVDS Links with Flow-Control and Clock-Correction for Scalable Neuromorphic Systems

    In this live demonstration we exploit a serial link for fast asynchronous communication in massively parallel processing platforms connected to a DVS (Dynamic Vision Sensor) for real-time implementation of bio-inspired vision processing on spiking neural networks.
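    A hedged sketch of the multiplexing idea follows: events from several AER channels are packed into fixed-width words carrying a channel id plus the neuron address, so that one serial lane can time-share many parallel AER buses. The field widths and round-robin scheduler are invented for illustration, and the flow-control and clock-correction symbols of the actual LVDS link are omitted.

        # Pack AER events from several channels onto one serialized stream.
        CHANNEL_BITS = 4        # assumed field widths
        ADDRESS_BITS = 16

        def pack_event(channel, address):
            assert channel < (1 << CHANNEL_BITS) and address < (1 << ADDRESS_BITS)
            return (channel << ADDRESS_BITS) | address

        def unpack_event(word):
            return word >> ADDRESS_BITS, word & ((1 << ADDRESS_BITS) - 1)

        def multiplex(queues):
            # round-robin over per-channel event queues
            while any(queues):
                for ch, q in enumerate(queues):
                    if q:
                        yield pack_event(ch, q.pop(0))

        stream = list(multiplex([[3, 7], [42], [], [1000]]))
        assert [unpack_event(w) for w in stream] == [(0, 3), (1, 42), (3, 1000), (0, 7)]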