
    Implementation of the barotropic vorticity equation on the MPP

    A finite difference version of the equations governing two-dimensional, non-divergent flow on a sphere is implemented and integrated on the Massively Parallel Processor (MPP). The MPP's performance is then compared with the Cyber's. The feasibility of using a massively parallel architecture to solve the hydrodynamic equations as they are used in numerical weather prediction (NWP) is described.
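    To illustrate the kind of computation involved, the sketch below integrates the barotropic vorticity equation with centered finite differences on a doubly periodic Cartesian grid rather than the paper's spherical grid; the grid size, time step, initial vortices, and the FFT-based streamfunction solve are assumptions made for this example, not details taken from the paper.

        # Minimal sketch: d(zeta)/dt = -J(psi, zeta) with laplacian(psi) = zeta,
        # centered differences in space, forward Euler in time, periodic domain.
        import numpy as np

        N = 128                       # grid points per direction (assumed)
        L = 2 * np.pi                 # domain size (assumed)
        dx = L / N
        dt = 1e-3                     # time step (assumed)

        k = np.fft.fftfreq(N, d=dx) * 2 * np.pi
        KX, KY = np.meshgrid(k, k, indexing="ij")
        K2 = KX**2 + KY**2
        K2[0, 0] = 1.0                # avoid division by zero for the mean mode

        def streamfunction(zeta):
            # invert laplacian(psi) = zeta spectrally
            psi_hat = -np.fft.fft2(zeta) / K2
            psi_hat[0, 0] = 0.0
            return np.real(np.fft.ifft2(psi_hat))

        def ddx(f):
            return (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2 * dx)

        def ddy(f):
            return (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1)) / (2 * dx)

        def step(zeta):
            psi = streamfunction(zeta)
            u, v = -ddy(psi), ddx(psi)                 # non-divergent velocity
            return zeta - dt * (u * ddx(zeta) + v * ddy(zeta))

        # initial condition: a pair of opposite-signed vortices (arbitrary)
        x = np.arange(N) * dx
        X, Y = np.meshgrid(x, x, indexing="ij")
        zeta = np.exp(-((X - 2)**2 + (Y - 3)**2)) - np.exp(-((X - 4)**2 + (Y - 3)**2))
        for _ in range(100):
            zeta = step(zeta)

    Because each grid point only needs its immediate neighbours, the difference operators map naturally onto a processor-per-point array machine such as the MPP.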

    Benchmark Test of CP-PACS for Lattice QCD

    The CP-PACS is a massively parallel computer dedicated to calculations in computational physics; it will be in operation in the spring of 1996 at the Center for Computational Physics, University of Tsukuba. In this article, we describe the architecture of the CP-PACS and report estimates of its performance for typical lattice QCD calculations. Comment: 12 pages (5 figures), PostScript file; talk presented at "QCD on Massively Parallel Computers" (Yamagata, Japan, March 16-18, 1995).
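    As a rough illustration of the kind of kernel such benchmarks time, the toy below measures the throughput of applying 3x3 complex link matrices to color vectors, the operation that dominates lattice QCD codes; the lattice volume, data layout, and flop counting are invented for the example and are not taken from the actual CP-PACS benchmark suite.

        # Toy single-node timing of an SU(3)-like matrix-vector kernel.
        import numpy as np
        import time

        V = 16**3 * 32                                      # hypothetical lattice volume
        links = np.random.rand(V, 3, 3) + 1j * np.random.rand(V, 3, 3)
        vecs  = np.random.rand(V, 3)    + 1j * np.random.rand(V, 3)

        t0 = time.perf_counter()
        out = np.einsum("sij,sj->si", links, vecs)          # one 3x3 multiply per site
        elapsed = time.perf_counter() - t0

        flops = V * 9 * 8                                   # ~8 real flops per complex multiply-add
        print(f"{flops / elapsed / 1e9:.2f} GFLOP/s (toy estimate)")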

    Massively-Parallel Feature Selection for Big Data

    We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix both in terms of rows (samples, training examples) and columns (features). By employing the concepts of p-values of conditional independence tests and meta-analysis techniques, PFBP manages to rely only on computations local to a partition while minimizing communication costs. It then employs powerful and safe (asymptotically sound) heuristics to make early, approximate decisions, such as Early Dropping of features from consideration in subsequent iterations, Early Stopping of consideration of features within the same iteration, or Early Return of the winner in each iteration. PFBP provides asymptotic guarantees of optimality for data distributions faithfully representable by a causal network (Bayesian network or maximal ancestral graph). Our empirical analysis confirms a super-linear speedup of the algorithm with increasing sample size and linear scalability with respect to the number of features and processing cores, while dominating other competitive algorithms in its class.
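    A single-machine sketch of the forward phase with Early Dropping is given below; the conditional-independence test is a Fisher-z test on partial correlations (an assumption suitable for continuous data), and the data partitioning, p-value meta-analysis, Early Stopping, Early Return, and backward phase of the real PFBP are all omitted.

        # Simplified forward selection with the Early Dropping heuristic.
        import numpy as np
        from scipy import stats

        def ci_pvalue(X, y, i, selected):
            # p-value of X[:, i] independent of y given X[:, selected]
            n = len(y)
            Z = X[:, selected] if selected else np.empty((n, 0))
            def residual(v):
                if Z.shape[1] == 0:
                    return v - v.mean()
                A = np.column_stack([np.ones(n), Z])
                beta, *_ = np.linalg.lstsq(A, v, rcond=None)
                return v - A @ beta
            r = np.corrcoef(residual(X[:, i]), residual(y))[0, 1]
            z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(selected) - 3)
            return 2 * stats.norm.sf(abs(z))

        def forward_early_drop(X, y, alpha=0.05, max_features=10):
            selected, alive = [], list(range(X.shape[1]))
            while alive and len(selected) < max_features:
                pvals = {i: ci_pvalue(X, y, i, selected) for i in alive}
                alive = [i for i in alive if pvals[i] <= alpha]   # Early Dropping
                if not alive:
                    break
                best = min(alive, key=pvals.get)                  # winner of this iteration
                selected.append(best)
                alive.remove(best)
            return selected

    Features dropped in one iteration are never reconsidered, which is what keeps the candidate set, and hence the per-iteration work, shrinking as selection proceeds.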

    Massively parallel approximate Gaussian process regression

    We explore how the big-three computing paradigms -- symmetric multi-processor (SMP), graphical processing units (GPUs), and cluster computing -- can together be brought to bear on large-data Gaussian process (GP) regression problems via a careful implementation of a newly developed local approximation scheme. Our methodological contribution focuses primarily on GPU computation, as this requires the most care and also provides the largest performance boost. However, in our empirical work we study the relative merits of all three paradigms to determine how best to combine them. The paper concludes with two case studies. One is a real-data fluid-dynamics computer experiment which benefits from the local nature of our approximation; the second is a synthetic-data example designed to find the largest design for which (accurate) GP emulation can be performed on a commensurate predictive set in under an hour. Comment: 24 pages, 6 figures, 1 table.
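    The local approximation idea can be sketched as follows: each prediction uses a small GP fit only to the training points nearest the test location. The Gaussian kernel, neighbourhood size, and plain nearest-neighbour selection below are simplifying assumptions; the paper's scheme builds its local designs greedily and offloads the dense linear algebra to GPUs, SMP threads, and cluster nodes.

        # Local approximate GP prediction: a small GP per test point.
        import numpy as np

        def gauss_kernel(A, B, lengthscale=1.0, nugget=1e-6):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            K = np.exp(-0.5 * d2 / lengthscale**2)
            if A is B:
                K += nugget * np.eye(A.shape[0])    # jitter on the training block
            return K

        def local_gp_predict(Xtrain, ytrain, xstar, n_neighbors=50, lengthscale=1.0):
            # use only the n nearest training points to the prediction location
            idx = np.argsort(((Xtrain - xstar) ** 2).sum(1))[:n_neighbors]
            Xn, yn = Xtrain[idx], ytrain[idx]
            Knn = gauss_kernel(Xn, Xn, lengthscale)
            kn = gauss_kernel(xstar[None, :], Xn, lengthscale)[0]
            mean = kn @ np.linalg.solve(Knn, yn)
            var = 1.0 - kn @ np.linalg.solve(Knn, kn)
            return mean, var

    Since every test location is handled independently, the loop over predictions parallelizes trivially across cores, GPUs, or cluster nodes, which is what makes the approach amenable to all three paradigms.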

    BLITZEN: A highly integrated massively parallel machine

    The architecture and VLSI design of a new massively parallel processing array chip are described. The BLITZEN processing element array chip, which contains 1.1 million transistors, serves as the basis for a highly integrated, miniaturized, high-performance, massively parallel machine that is currently under development. Each processing element has 1K bits of static RAM and performs bit-serial processing with functional elements for arithmetic, logic, and shifting
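    The sketch below mimics, in software, the bit-serial arithmetic such a processing element performs: operand bits stream through a one-bit full adder, with the carry as the only state held between cycles. The 16-bit word width and the helper functions are placeholders for illustration, not BLITZEN specifics.

        # Bit-serial addition: one bit of each operand per "cycle".
        def bit_serial_add(a_bits, b_bits):
            carry, out = 0, []
            for a, b in zip(a_bits, b_bits):
                out.append(a ^ b ^ carry)                     # sum bit of a 1-bit full adder
                carry = (a & b) | (carry & (a ^ b))
            out.append(carry)                                 # final carry becomes the top bit
            return out

        def to_bits(x, width=16):
            return [(x >> i) & 1 for i in range(width)]       # little-endian

        def from_bits(bits):
            return sum(b << i for i, b in enumerate(bits))

        assert from_bits(bit_serial_add(to_bits(1234), to_bits(4321))) == 5555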

    Live Demonstration: Multiplexing AER Asynchronous Channels over LVDS Links with Flow-Control and Clock-Correction for Scalable Neuromorphic Systems

    In this live demonstration we exploit a serial link for fast asynchronous communication in massively parallel processing platforms connected to a DVS (Dynamic Vision Sensor) for real-time implementation of bio-inspired vision processing on spiking neural networks.
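    A hedged sketch of the multiplexing idea follows: events from several AER channels are packed into fixed-width words carrying a channel id plus the neuron address, so that one serial lane can time-share many parallel AER buses. The field widths and round-robin scheduler are invented for illustration, and the flow-control and clock-correction symbols of the actual LVDS link are omitted.

        # Pack AER events from several channels onto one serialized stream.
        CHANNEL_BITS = 4        # assumed field widths
        ADDRESS_BITS = 16

        def pack_event(channel, address):
            assert channel < (1 << CHANNEL_BITS) and address < (1 << ADDRESS_BITS)
            return (channel << ADDRESS_BITS) | address

        def unpack_event(word):
            return word >> ADDRESS_BITS, word & ((1 << ADDRESS_BITS) - 1)

        def multiplex(queues):
            # round-robin over per-channel event queues
            while any(queues):
                for ch, q in enumerate(queues):
                    if q:
                        yield pack_event(ch, q.pop(0))

        stream = list(multiplex([[3, 7], [42], [], [1000]]))
        assert [unpack_event(w) for w in stream] == [(0, 3), (1, 42), (3, 1000), (0, 7)]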