BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures
We introduce BriskStream, an in-memory data stream processing system (DSPS)
specifically designed for modern shared-memory multicore architectures.
BriskStream's key contribution is an execution plan optimization paradigm,
namely RLAS, which takes relative-location (i.e., NUMA distance) of each pair
of producer-consumer operators into consideration. We propose a branch and
bound based approach with three heuristics to resolve the resulting nontrivial
optimization problem. The experimental evaluations demonstrate that BriskStream
yields much higher throughput and better scalability than existing DSPSs on
multi-core architectures when processing different types of workloads. Comment: To appear in SIGMOD'1
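The abstract describes a branch-and-bound search over operator placements that accounts for NUMA distance between producer-consumer pairs. A minimal sketch of that idea, assuming a toy pipeline, a hypothetical two-node NUMA distance matrix, and illustrative edge weights (none of these are BriskStream's actual API or cost model):

```python
# Toy RLAS-style placement sketch (assumed, simplified): assign each operator
# to a NUMA node so that total producer-consumer communication cost
# (edge weight * NUMA distance) is minimised, pruning partial plans whose
# cost already exceeds the best complete plan found so far.

NUMA_DIST = [[10, 21], [21, 10]]            # illustrative intra-/inter-node latencies
EDGES = [("src", "map", 4), ("map", "agg", 2), ("agg", "sink", 1)]
OPS = ["src", "map", "agg", "sink"]

def branch_and_bound():
    best = {"cost": float("inf"), "plan": None}

    def recurse(i, place, cost):
        if cost >= best["cost"]:            # prune: partial plan already too costly
            return
        if i == len(OPS):
            best["cost"], best["plan"] = cost, dict(place)
            return
        op = OPS[i]
        for node in range(len(NUMA_DIST)):
            place[op] = node
            # cost added by edges whose second endpoint was just placed
            added = sum(w * NUMA_DIST[place[u]][place[v]]
                        for u, v, w in EDGES
                        if u in place and v in place and op in (u, v))
            recurse(i + 1, place, cost + added)
            del place[op]

    recurse(0, {}, 0)
    return best

best = branch_and_bound()
print(best["cost"], best["plan"])
```

With this toy cost model the optimum co-locates all four operators on one node; the paper's heuristics would additionally prune the search and weigh computation against communication.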
PDE-Foam - a probability-density estimation method using self-adapting phase-space binning
Probability Density Estimation (PDE) is a multivariate discrimination
technique based on sampling signal and background densities defined by event
samples from data or Monte-Carlo (MC) simulations in a multi-dimensional phase
space. In this paper, we present a modification of the PDE method that uses a
self-adapting binning method to divide the multi-dimensional phase space into a
finite number of hyper-rectangles (cells). The binning algorithm adjusts the
size and position of a predefined number of cells inside the multi-dimensional
phase space, minimising the variance of the signal and background densities
inside the cells. The implementation of the binning algorithm PDE-Foam is based
on the MC event-generation package Foam. We present performance results for
representative examples (toy models) and discuss the dependence of the obtained
results on the choice of parameters. The new PDE-Foam shows improved
classification capability for small training samples and reduced classification
time compared to the original PDE method based on range searching. Comment: 19 pages, 11 figures; replaced with revised version accepted for
publication in NIM A and corrected typos in description of Fig. 7 and
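The core idea above, splitting cells so as to minimise the variance of the densities inside them, can be sketched in one dimension. This is an assumed, greatly simplified analogue of PDE-Foam's binning, not its actual algorithm: we repeatedly apply the split that most reduces total within-cell variance until a predefined cell count is reached.

```python
import statistics

def variance(xs):
    # total squared deviation (population variance * n); 0 for tiny cells
    return statistics.pvariance(xs) * len(xs) if len(xs) > 1 else 0.0

def adapt_bins(sample, n_cells):
    """Greedy 1D self-adapting binning sketch (illustrative, not PDE-Foam)."""
    cells = [sorted(sample)]
    while len(cells) < n_cells:
        best_gain, best_at = 0.0, None
        for c_idx, xs in enumerate(cells):
            for i in range(1, len(xs)):
                gain = variance(xs) - variance(xs[:i]) - variance(xs[i:])
                if gain > best_gain:
                    best_gain, best_at = gain, (c_idx, i)
        if best_at is None:          # no split reduces variance further
            break
        c_idx, i = best_at
        xs = cells.pop(c_idx)
        cells += [xs[:i], xs[i:]]
    return cells

cells = adapt_bins([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], 2)
print(cells)
```

On a bimodal sample the first split lands between the two clusters, which is the qualitative behaviour the abstract describes for signal and background densities.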
Efficient Processing of Spatial Joins Using R-Trees
Abstract: In this paper, we show that spatial joins are well suited to processing on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory, which is well-suited to the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execution. In order to reduce CPU and I/O cost, the three phases are processed in a fashion that preserves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance comparison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large. Topics: spatial database systems, parallel database systems
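The three phases named in the abstract can be illustrated with a minimal MBR-intersection join. This sketch uses a thread pool and chunks the outer relation into tasks; it is an assumed simplification, not the paper's R-tree-based, locality-preserving algorithm:

```python
from concurrent.futures import ThreadPoolExecutor

def intersects(a, b):
    # MBRs as (x1, y1, x2, y2); overlap test on both axes
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2

def spatial_join_parallel(r, s, n_tasks=4):
    # task creation: split the outer relation into n_tasks chunks
    chunks = [r[i::n_tasks] for i in range(n_tasks)]

    # parallel task execution: join each chunk against s
    def run(chunk):
        return [(a, b) for a in chunk for b in s if intersects(a, b)]

    # task assignment: the pool hands chunks to worker threads
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = pool.map(run, chunks)
    return [pair for part in results for pair in part]

r = [(0, 0, 2, 2), (5, 5, 6, 6)]
s = [(1, 1, 3, 3), (10, 10, 11, 11)]
pairs = spatial_join_parallel(r, s, n_tasks=2)
print(pairs)
```

The paper's dynamic load balancing would further split slow tasks and reassign them to idle processors, which this fixed chunking does not attempt.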
A Parallel Tree-SPH code for Galaxy Formation
We describe a new implementation of a parallel Tree-SPH code aimed at simulating
Galaxy Formation and Evolution. The code has been parallelized using
SHMEM, a Cray proprietary library to handle communications between the 256
processors of the Silicon Graphics T3E massively parallel supercomputer hosted
by the Cineca Supercomputing Center (Bologna, Italy). The code combines the
Smoothed Particle Hydrodynamics (SPH) method, used to solve the hydrodynamical
equations, with the popular Barnes and Hut (1986) tree code, which performs the
gravity calculation with N log N scaling, and it is based on the scalar Tree-SPH
code developed by Carraro et al. (1998) [MNRAS 297, 1021]. Parallelization is
achieved by distributing particles among processors according to a work-load
criterion. Benchmarks of the code, in terms of load balance and scalability, are
analyzed and critically discussed for the adiabatic collapse of an isothermal
gas sphere test using 20,000 particles on 8 processors. The code remains
load-balanced at better than the 95% level. As the number of processors
increases, the load balance worsens slightly.
The deviation from perfect scalability is almost negligible up to 32 processors.
Finally, we present a simulation of the
formation of an X-ray galaxy cluster in a flat cold dark matter cosmology,
using 200,000 particles and 32 processors, and compare our results with Evrard
(1988) P3M-SPH simulations. Additionally, we have incorporated radiative cooling,
star formation, feedback from supernovae of Type II and Ia, stellar winds and UV
flux from massive stars, and an algorithm to follow the chemical enrichment of
the interstellar medium. Simulations with some of these ingredients are also
presented. Comment: 19 pages, 14 figures, accepted for publication in MNRA
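The work-load criterion for distributing particles among processors can be sketched as follows. This is an assumed simplification (contiguous segments of equal cumulative cost), not the paper's actual decomposition scheme; per-particle costs are illustrative:

```python
def distribute(costs, n_proc):
    """Cut a sequence of per-particle costs into n_proc contiguous segments
    of nearly equal total work (illustrative work-load criterion)."""
    total = sum(costs)
    target = total / n_proc
    parts, current, acc = [], [], 0.0
    for i, c in enumerate(costs):
        current.append(i)
        acc += c
        # close a segment once the running total reaches the next cut point
        if acc >= target * (len(parts) + 1) and len(parts) < n_proc - 1:
            parts.append(current)
            current = []
    parts.append(current)
    return parts

parts = distribute([1.0] * 8, 4)
print(parts)
```

With uniform costs this reduces to an even split; with nonuniform costs (e.g. particles in dense regions needing more SPH neighbour work) the segments shrink where particles are expensive, which is the behaviour behind the near-95% load balance quoted above.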