The Simulation Model Partitioning Problem: an Adaptive Solution Based on Self-Clustering (Extended Version)
This paper is about partitioning in parallel and distributed simulation, that
is, decomposing the simulation model into a number of components and
allocating them properly on the execution units. An adaptive solution based on
self-clustering, which considers both communication reduction and computational
load-balancing, is proposed. The implementation of the proposed mechanism is
tested using a simulation model that is challenging in terms of both structure
and dynamicity. Various configurations of the simulation model and the
execution environment have been considered. The obtained performance results
are analyzed using a reference cost model. The results demonstrate that the
proposed approach is promising and that it can reduce the simulation execution
time on both parallel and distributed architectures.
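A toy greedy heuristic illustrates the kind of trade-off the abstract describes: place each simulated entity on the execution unit that best balances communication with already-placed neighbours against computational load. All names and the cost weighting are illustrative assumptions; this is not the paper's self-clustering mechanism.

```python
# Toy greedy partitioner: co-locate communicating entities while
# penalizing imbalanced load. Illustrative only; not the paper's method.

def greedy_partition(entities, edges, n_units, alpha=1.0):
    """entities: list of ids; edges: {(a, b): message_rate};
    returns {entity: unit}. alpha weighs load against communication."""
    neighbours = {e: {} for e in entities}
    for (a, b), w in edges.items():
        neighbours[a][b] = w
        neighbours[b][a] = w
    placement, load = {}, [0] * n_units
    for e in entities:
        def cost(u):
            # Remote traffic generated if e goes to unit u, plus load penalty.
            remote = sum(w for n, w in neighbours[e].items()
                         if n in placement and placement[n] != u)
            return remote + alpha * load[u]
        best = min(range(n_units), key=cost)
        placement[e] = best
        load[best] += 1
    return placement
```

With two heavily communicating pairs and two units, the heuristic keeps each pair together while spreading the pairs across units.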
Fast Deterministic Selection
The Median of Medians (also known as BFPRT) algorithm, although a landmark
theoretical achievement, is seldom used in practice because it and its variants
are slower than simple approaches based on sampling. The main contribution of
this paper is a fast linear-time deterministic selection algorithm
QuickselectAdaptive based on a refined definition of MedianOfMedians. The
algorithm's performance brings deterministic selection---along with its
desirable properties of reproducible runs, predictable run times, and immunity
to pathological inputs---in the range of practicality. We demonstrate results
on independent and identically distributed random inputs and on
normally-distributed inputs. Measurements show that QuickselectAdaptive is
faster than state-of-the-art baselines.
Comment: Pre-publication draft
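For reference, the classic Median of Medians (BFPRT) scheme that the paper refines can be sketched as follows. This is the textbook algorithm, not the paper's QuickselectAdaptive; function names are illustrative.

```python
# Classic median-of-medians (BFPRT) selection: worst-case O(n),
# immune to pathological inputs. A minimal sketch, not QuickselectAdaptive.

def select(a, k):
    """Return the k-th smallest element (0-indexed) of sequence a."""
    a = list(a)
    while True:
        if len(a) <= 5:
            return sorted(a)[k]
        # Median of each group of 5; recurse to get a guaranteed-good pivot.
        medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
                   for i in range(0, len(a), 5)]
        pivot = select(medians, len(medians) // 2)
        lo = [x for x in a if x < pivot]
        eq = [x for x in a if x == pivot]
        if k < len(lo):
            a = lo
        elif k < len(lo) + len(eq):
            return pivot
        else:
            k -= len(lo) + len(eq)
            a = [x for x in a if x > pivot]
```

The pivot recursion guarantees that each pass discards a constant fraction of the input, which is what yields the deterministic linear bound; the practical slowness the paper addresses comes from the overhead of computing those group medians.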
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201
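One standard bridge between the two perspectives for the column-selection problem is sampling columns by their statistical leverage scores. The sketch below is a generic version of that idea under the assumption of a plain SVD-based computation; it is not the chapter's exact procedure, and the function names are illustrative.

```python
# Leverage-score column sampling: columns are ranked by how much they
# "stick out" of the top-k singular subspace. Illustrative sketch only.

import numpy as np

def leverage_scores(A, k):
    """Leverage scores of A's columns w.r.t. its top-k right singular
    subspace; the scores are nonnegative and sum to k."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return np.sum(Vt[:k] ** 2, axis=0)

def select_columns(A, k, c, rng=None):
    """Sample c distinct column indices with probability proportional
    to their leverage scores."""
    rng = np.random.default_rng(rng)
    p = leverage_scores(A, k)
    p = p / p.sum()
    return rng.choice(A.shape[1], size=c, replace=False, p=p)
```

The statistical view interprets the scores as a regression diagnostic; the algorithmic view uses them as an importance-sampling distribution with provable reconstruction guarantees.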
A pruned dynamic programming algorithm to recover the best segmentations with 1 to K_max change-points
A common computational problem in multiple change-point models is to recover
the segmentations with 1 to K_max change-points of minimal cost with
respect to some loss function. Here we present an algorithm to prune the set of
candidate change-points which is based on a functional representation of the
cost of segmentations. We study the worst-case complexity of the algorithm when
there is a unidimensional parameter per segment and demonstrate that it is at
worst equivalent to the complexity of the segment neighbourhood algorithm:
O(K_max n^2). For a particular loss function we demonstrate that
pruning is on average efficient even if there are no change-points in the
signal. Finally, we empirically study the performance of the algorithm in the
case of the quadratic loss and show that it is faster than the segment
neighbourhood algorithm.
Comment: 31 pages. An extended version of the pre-print
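As context for the comparison, the segment neighbourhood baseline for the quadratic loss can be written as a direct dynamic programme. This is the unpruned reference algorithm the paper is measured against, in an illustrative form, not the paper's pruned implementation.

```python
# Segment neighbourhood DP for the quadratic loss: O(K_max n^2) time.
# Sketch of the baseline, not the paper's pruned algorithm.

import math

def segment_neighbourhood(y, k_max):
    """Return C where C[k][n] is the optimal quadratic cost of fitting
    the first n points of y with exactly k constant-mean segments."""
    n = len(y)
    # Prefix sums make any segment's cost an O(1) computation.
    s = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        s[i + 1] = s[i] + v
        s2[i + 1] = s2[i] + v * v

    def seg_cost(i, j):
        # Quadratic loss of fitting y[i:j] by its mean.
        m = (s[j] - s[i]) / (j - i)
        return (s2[j] - s2[i]) - (j - i) * m * m

    C = [[math.inf] * (n + 1) for _ in range(k_max + 1)]
    C[0][0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, n + 1):
            C[k][j] = min(C[k - 1][i] + seg_cost(i, j)
                          for i in range(k - 1, j))
    return C
```

The pruning idea in the paper replaces the inner minimisation over all candidate change-points i with a functional representation of the cost, so that candidates that can never again be optimal are discarded.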