Parallelization of Modular Algorithms
In this paper we investigate the parallelization of two modular algorithms.
Specifically, we consider the modular computation of Gr\"obner bases (resp. standard
bases) and the modular computation of the associated primes of a
zero-dimensional ideal and describe their parallel implementation in SINGULAR.
Our modular algorithms to solve problems over Q consist mainly of three parts:
solving the problem modulo p for several primes p, lifting the result to Q by
applying Chinese remaindering resp. rational reconstruction, and a verification
step. Arnold proved using the Hilbert function that the verification
part in the modular algorithm to compute Gr\"obner bases can be simplified for
homogeneous ideals (cf. \cite{A03}). The idea of the proof could easily be
adapted to the local case, i.e. for local orderings and not necessarily
homogeneous ideals, using the Hilbert-Samuel function (cf. \cite{Pf07}). In
this paper we prove the corresponding theorem for non-homogeneous ideals in
case of a global ordering. Comment: 16 pages
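The lifting step described in this abstract (Chinese remaindering followed by rational reconstruction) can be illustrated on a single rational coefficient. The following is a minimal Python sketch, not the SINGULAR implementation: the function names `crt_pair` and `rational_reconstruction` are hypothetical, the reconstruction uses the standard extended-Euclid approach with the bound sqrt(m/2), and the verification step is not shown.

```python
from fractions import Fraction
from math import gcd, isqrt


def crt_pair(r1, m1, r2, m2):
    """Combine x = r1 (mod m1) and x = r2 (mod m2) for coprime m1, m2.

    Returns (r, m1 * m2) with 0 <= r < m1 * m2.
    """
    inv = pow(m1, -1, m2)          # modular inverse (Python 3.8+)
    t = ((r2 - r1) * inv) % m2
    return (r1 + m1 * t) % (m1 * m2), m1 * m2


def rational_reconstruction(a, m):
    """Return n/d with n = a*d (mod m) and |n|, |d| <= sqrt(m/2), or None."""
    bound = isqrt(m // 2)
    r0, r1 = m, a % m
    s0, s1 = 0, 1
    # Invariant of the extended Euclidean steps: r_i = s_i * a (mod m).
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    if s1 == 0 or abs(s1) > bound or gcd(r1, s1) != 1:
        return None                # no admissible fraction for this modulus
    return Fraction(r1, s1)        # Fraction normalizes the sign


# Illustrative use: recover a coefficient from its images modulo three primes.
secret = Fraction(-2, 3)           # stands in for an unknown coefficient over Q
residue, modulus = 0, 1
for p in (101, 103, 107):
    image = secret.numerator * pow(secret.denominator, -1, p) % p
    residue, modulus = crt_pair(residue, modulus, image, p)
print(rational_reconstruction(residue, modulus))
```

In a modular Gr\"obner basis computation the same lifting would be applied coefficient by coefficient, and more primes would be added until the reconstructed result stabilizes and passes verification.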
Scalable software architecture for on-line multi-camera video processing
In this paper we present a scalable software architecture for on-line multi-camera video processing that guarantees a good trade-off between computational power, scalability and flexibility. The software system is modular, and its main blocks are the Processing Units (PUs) and the Central Unit. The Central Unit supervises the running PUs, and each PU manages the acquisition phase and the processing phase. Furthermore, an approach to easily parallelize the desired processing application is presented. As a case study, we apply the proposed software architecture to a multi-camera system in order to efficiently manage multiple 2D object detection modules in a real-time scenario. System performance has been evaluated under different load conditions, such as the number of cameras and image sizes. The results show that the software architecture scales well with the number of cameras and can easily work with different image formats while respecting the real-time constraints. Moreover, the parallelization approach can be used to speed up the processing tasks with a low level of overhead.
Exact Sparse Matrix-Vector Multiplication on GPU's and Multicore Architectures
We propose different implementations of the sparse matrix--dense vector
multiplication (SpMV) for finite fields and rings $\mathbb{Z}/m\mathbb{Z}$. We take
advantage of graphic card processors (GPU) and multi-core architectures. Our
aim is to improve the speed of SpMV in the LinBox library, and hence
the speed of its black box algorithms. Besides, we use this and a new
parallelization of the sigma-basis algorithm in a parallel block Wiedemann rank
implementation over finite fields.
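For readers unfamiliar with the operation, the sketch below shows an exact sparse matrix--dense vector product over the integers modulo m in plain Python. It is only an illustration under stated assumptions: the compressed sparse row (CSR) layout and the name `spmv_mod` are choices made here, not the storage formats or API of the library discussed in the abstract, and no GPU or multi-core parallelism is shown.

```python
def spmv_mod(indptr, indices, data, x, m):
    """Compute y = A * x over Z/mZ for a matrix A stored in CSR form.

    indptr[i]:indptr[i + 1] delimits the nonzeros of row i,
    indices[k] is the column of the k-th nonzero, data[k] its value.
    """
    n_rows = len(indptr) - 1
    y = [0] * n_rows
    for i in range(n_rows):
        acc = 0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        # Python integers do not overflow, so one reduction per row suffices.
        # With fixed-width machine words the accumulator would have to be
        # reduced before it can exceed the word size.
        y[i] = acc % m
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]  over Z/7Z
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [1, 2, 3, 4, 5]
print(spmv_mod(indptr, indices, data, [1, 2, 3], 7))
```

The rows are independent of each other, which is what makes the operation a natural candidate for the parallel hardware mentioned in the abstract.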
Space--Time Tradeoffs for Subset Sum: An Improved Worst Case Algorithm
The technique of Schroeppel and Shamir (SICOMP, 1981) has long been the most
efficient way to trade space against time for the SUBSET SUM problem. In the
random-instance setting, however, improved tradeoffs exist. In particular, the
recently discovered dissection method of Dinur et al. (CRYPTO 2012) yields a
significantly improved space--time tradeoff curve for instances with strong
randomness properties. Our main result is that these strong randomness
assumptions can be removed, obtaining the same space--time tradeoffs in the
worst case. We also show that for small space usage the dissection algorithm
can be almost fully parallelized. Our strategy for dealing with arbitrary
instances is to instead inject the randomness into the dissection process
itself by working over a carefully selected but random composite modulus, and
to introduce explicit space--time controls into the algorithm by means of a
"bailout mechanism".
REBOUND: An open-source multi-purpose N-body code for collisional dynamics
REBOUND is a new multi-purpose N-body code which is freely available under an
open-source license. It was designed for collisional dynamics such as planetary
rings but can also solve the classical N-body problem. It is highly modular and
can be customized easily to work on a wide variety of different problems in
astrophysics and beyond.
REBOUND comes with three symplectic integrators: leap-frog, the symplectic
epicycle integrator (SEI) and a Wisdom-Holman mapping (WH). It supports open,
periodic and shearing-sheet boundary conditions. REBOUND can use a Barnes-Hut
tree to calculate both self-gravity and collisions. These modules are fully
parallelized with MPI as well as OpenMP. The former makes use of a static
domain decomposition and a distributed essential tree. Two new collision
detection modules based on a plane-sweep algorithm are also implemented. The
performance of the plane-sweep algorithm is superior to a tree code for
simulations in which one dimension is much longer than the other two and in
simulations which are quasi-two-dimensional with fewer than one million
particles.
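As a reminder of what a symplectic leap-frog step looks like, here is a minimal one-dimensional Python sketch. It is not taken from the REBOUND source: the drift-kick-drift ordering, the function name `leapfrog_step`, and the harmonic-oscillator test problem are assumptions chosen for illustration.

```python
def leapfrog_step(x, v, accel, dt):
    """Advance position x and velocity v by one drift-kick-drift step."""
    x_half = x + 0.5 * dt * v          # drift for half a time step
    v_new = v + dt * accel(x_half)     # kick with the midpoint acceleration
    x_new = x_half + 0.5 * dt * v_new  # drift for the remaining half step
    return x_new, v_new


def energy(x, v):
    """Energy of a unit-mass, unit-frequency harmonic oscillator."""
    return 0.5 * (v * v + x * x)


# Integrate the oscillator x'' = -x and watch the energy error stay bounded.
x, v = 1.0, 0.0
e0 = energy(x, v)
max_err = 0.0
for _ in range(10_000):
    x, v = leapfrog_step(x, v, lambda pos: -pos, dt=0.01)
    max_err = max(max_err, abs(energy(x, v) - e0))
print(max_err)
```

The energy error of such a scheme oscillates instead of drifting, which is the property that makes symplectic integrators attractive for long N-body integrations.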
In this work, we discuss the different algorithms implemented in REBOUND, the
philosophy behind the code's structure, as well as implementation-specific
details of the different modules. We present results of accuracy and scaling
tests which show that the code can run efficiently on both desktop machines and
large computing clusters. Comment: 10 pages, 9 figures, accepted by A&A, source code available at
https://github.com/hannorein/reboun