Self-improving Algorithms for Coordinate-wise Maxima

Author: Clarkson Kenneth L.
Mulzer Wolfgang
Seshadhri C.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2012

Computing the coordinate-wise maxima of a planar point set is a classic and well-studied problem in computational geometry. We give an algorithm for this problem in the \emph{self-improving setting}. We have $n$ (unknown) independent distributions $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n$ of planar points. An input point set $(p_1, p_2, \ldots, p_n)$ is generated by taking an independent sample $p_i$ from each $\mathcal{D}_i$, so the input distribution $\mathcal{D}$ is the product $\prod_i \mathcal{D}_i$. A self-improving algorithm repeatedly gets input sets from the distribution $\mathcal{D}$ (which is \emph{a priori} unknown) and tries to optimize its running time for $\mathcal{D}$. Our algorithm uses the first few inputs to learn salient features of the distribution, and then becomes an optimal algorithm for distribution $\mathcal{D}$. Let $\text{OPT}_\mathcal{D}$ denote the expected depth of an \emph{optimal} linear comparison tree computing the maxima for distribution $\mathcal{D}$. Our algorithm eventually has an expected running time of $O(\text{OPT}_\mathcal{D} + n)$, even though it did not know $\mathcal{D}$ to begin with. Our result requires new tools to understand linear comparison trees for computing maxima. We show how to convert general linear comparison trees to very restricted versions, which can then be related to the running time of our algorithm. An interesting feature of our algorithm is an interleaved search, where the algorithm tries to determine the likeliest point to be maximal with minimal computation. This allows the running time to be truly optimal for the distribution $\mathcal{D}$.

Comment: To appear in Symposium of Computational Geometry 2012 (17 pages, 2 figures)
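For orientation, the underlying problem (not the self-improving algorithm of the paper) can be illustrated with the textbook sort-and-sweep method, which runs in O(n log n) time regardless of the input distribution. The function name `maxima` and the convention that a point is dominated when another point is at least as large in both coordinates are assumptions made for this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def maxima(points: List[Point]) -> List[Point]:
    """Return the coordinate-wise maxima of a planar point set.

    A point p is dominated if some other point q has q.x >= p.x and
    q.y >= p.y. Exact duplicates are collapsed into one point first.
    """
    # Sweep from the largest x to the smallest; for equal x, visit the
    # larger y first so that it dominates the smaller one.
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    result: List[Point] = []
    best_y = float("-inf")  # largest y seen among points with x >= current x
    for x, y in ordered:
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


if __name__ == "__main__":
    print(maxima([(1, 4), (2, 2), (3, 3), (4, 1), (2, 5)]))
```

A self-improving algorithm aims to beat this distribution-oblivious baseline once it has learned enough about the input distribution.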

arXiv.org e-Print Archive

CiteSeerX

Crossref

RGFGA: An efficient representation and crossover for grouping genetic algorithms

Author: Crampton J
Swift S
Tucker A
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2005

There is substantial research into genetic algorithms that are used to group large numbers of objects into mutually exclusive subsets based upon some fitness function. However, nearly all methods involve degeneracy to some degree. We introduce a new representation for grouping genetic algorithms, the restricted growth function genetic algorithm, that effectively removes all degeneracy, resulting in a more efficient search. A new crossover operator is also described that exploits a measure of similarity between chromosomes in a population. Using several synthetic datasets, we compare the performance of our representation and crossover with another well-known state-of-the-art GA method, a strawman optimisation method and a well-established statistical clustering algorithm, with encouraging results.
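The degeneracy mentioned above arises because many label vectors describe the same grouping (for example, `[2, 2, 0]` and `[0, 0, 1]`). A minimal sketch, assuming the standard definition of a restricted growth function (the first entry is 0 and each entry is at most one more than the largest entry before it), shows how relabelling yields one canonical vector per grouping. The helper names `to_rgf` and `is_rgf` are hypothetical, and the paper's own encoding and crossover operator are not reproduced here.

```python
from typing import Dict, Hashable, List, Sequence


def to_rgf(labels: Sequence[Hashable]) -> List[int]:
    """Relabel group labels in order of first appearance (0, 1, 2, ...)."""
    mapping: Dict[Hashable, int] = {}
    out: List[int] = []
    for lab in labels:
        if lab not in mapping:
            mapping[lab] = len(mapping)  # next unused group number
        out.append(mapping[lab])
    return out


def is_rgf(a: Sequence[int]) -> bool:
    """Check a[0] == 0 and a[i] <= max(a[0..i-1]) + 1 for all i."""
    highest = -1
    for v in a:
        if v < 0 or v > highest + 1:
            return False
        highest = max(highest, v)
    return True


if __name__ == "__main__":
    # Two different labelings of the same grouping map to one vector.
    print(to_rgf([2, 2, 0, 1, 0]))
    print(to_rgf(["b", "b", "a", "c", "a"]))
```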

Brunel University Research Archive

RRR: Rank-Regret Representative

Author: Asudeh Abolfazl
Das Gautam
Jagadish H. V.
Nazi Azade
Zhang Nan
Publication date: 01/01/2018

Selecting the best items in a dataset is a common task in data exploration. However, the concept of "best" lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove "dominated" items and create a "representative" subset of the data set, comprising the "best items" in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be almost as big as the full data. A smaller representative can be found if we relax the requirement to include the best item for every possible user, and instead just limit the users' "regret". Existing work defines regret as the loss in score by limiting consideration to the representative instead of the full data set, for any chosen ranking function. However, the score is often not a meaningful number and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the data set. In contrast, users do understand the notion of rank ordering. Therefore, alternatively, we consider the position of the items in the ranked list for defining the regret and propose the \emph{rank-regret representative} as the minimal subset of the data containing at least one of the top-$k$ of any possible ranking function. This problem is NP-complete. We use the geometric interpretation of items to bound their ranks on ranges of functions and to utilize combinatorial geometry notions for developing effective and efficient approximation algorithms for the problem. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.
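To make the rank-based notion of regret concrete, the sketch below evaluates a candidate subset against a finite list of linear ranking functions. This is a simplification: the abstract quantifies over every possible ranking function, whereas here the weight vectors are supplied by the caller. The function name `rank_regret` and the tie rule (an item's rank is one plus the number of strictly better items) are assumptions for illustration, and this is not the approximation algorithm from the paper.

```python
from typing import List, Sequence


def rank_regret(
    items: Sequence[Sequence[float]],
    subset_idx: Sequence[int],
    weight_vectors: Sequence[Sequence[float]],
) -> int:
    """Worst (largest) best-rank of the subset over the given linear functions.

    For each weight vector w, items are scored by the dot product w . x.
    The subset's rank under w is the 1-based rank of its best-scoring
    member; the rank-regret is the maximum of these ranks.
    """
    worst = 0
    for w in weight_vectors:
        scores: List[float] = [
            sum(wi * xi for wi, xi in zip(w, item)) for item in items
        ]
        best_in_subset = max(scores[i] for i in subset_idx)
        # Rank = 1 + number of items that score strictly higher.
        rank = 1 + sum(1 for s in scores if s > best_in_subset)
        worst = max(worst, rank)
    return worst


if __name__ == "__main__":
    data = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
    functions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    print(rank_regret(data, [2], functions))
```

Under this simplification, a subset contains at least one of the top-$k$ items for each supplied function exactly when `rank_regret` returns a value of at most $k$.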

arXiv.org e-Print Archive

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Phase-space structures II: Hierarchical Structure Finder

Author: M. Maciejewski
S. Colombi
V. Springel
C. Alard
F. R. Bouchet
Publication venue: 'Wiley'
Publication date: 01/12/2008

A new multi-dimensional Hierarchical Structure Finder (HSF) to study the phase-space structure of dark matter in N-body cosmological simulations is presented. The algorithm depends mainly on two parameters, which control the level of connectivity of the detected structures and their significance compared to Poisson noise. By working in 6D phase-space, where contrasts are much more pronounced than in 3D position space, our HSF algorithm is capable of detecting subhaloes including their tidal tails, and can recognise other phase-space structures such as pure streams and candidate caustics. If an additional unbinding criterion is added, the algorithm can be used as a self-consistent halo and subhalo finder. As a test, we apply it to a large halo of the Millennium Simulation, where 19 % of the halo mass is found to belong to bound substructures, which is more than what is detected with conventional 3D substructure finders, and an additional 23-36 % of the total mass belongs to unbound HSF structures. The distribution of identified phase-space density peaks is clearly bimodal: high peaks are dominated by the bound structures and low peaks belong mostly to tidal streams. In order to better understand what HSF provides, we examine the time evolution of structures, based on the merger tree history. Bound structures typically make only up to 6 orbits inside the main halo. Still, HSF can identify at the present time at least 80 % of the original content of structures with a redshift of infall as high as z <= 0.3, which illustrates the significant power of this tool to perform dynamical analyses in phase-space.

Comment: Submitted to MNRAS, 24 pages, 18 figures

arXiv.org e-Print Archive

Crossref

HAL-INSU

MPG.PuRe

On smoothed analysis of quicksort and Hoare's find

Author: Fouz Mahmoud
Kufleitner Manfred
Manthey Bodo
Zeini Jahromi Nima
Publication venue: Springer Verlag
Publication date: 01/01/2011

We provide a smoothed analysis of Hoare's find algorithm, and we revisit the smoothed analysis of quicksort. Hoare's find algorithm, often called quickselect or one-sided quicksort, is an easy-to-implement algorithm for finding the $k$-th smallest element of a sequence. While the worst-case number of comparisons that Hoare's find needs is $\Theta(n^2)$, the average-case number is $\Theta(n)$. We analyze what happens between these two extremes by providing a smoothed analysis. In the first perturbation model, an adversary specifies a sequence of $n$ numbers of $[0,1]$, and then, to each number of the sequence, we add a random number drawn independently from the interval $[0,d]$. We prove that Hoare's find needs $\Theta\left(\frac{n}{d+1}\sqrt{n/d} + n\right)$ comparisons in expectation if the adversary may also specify the target element (even after seeing the perturbed sequence) and slightly fewer comparisons for finding the median. In the second perturbation model, each element is marked with a probability of $p$, and then a random permutation is applied to the marked elements. We prove that the expected number of comparisons to find the median is $\Omega\left((1-p)\,\frac{n}{p}\log n\right)$. Finally, we provide lower bounds for the smoothed number of comparisons of quicksort and Hoare's find for the median-of-three pivot rule, which usually yields faster algorithms than always selecting the first element: the pivot is the median of the first, middle, and last element of the sequence. We show that median-of-three does not yield a significant improvement over the classic rule.
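The first perturbation model can be tried out with a small sketch of Hoare's find under the classic rule (the first element is the pivot) together with a comparison counter. The names `hoare_find` and `perturb`, the 1-based `k`, the list-based partitioning, and the choice to count one comparison per element examined against the pivot are assumptions made for this illustration; it is not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def hoare_find(seq: Sequence[float], k: int) -> Tuple[float, int]:
    """Return (k-th smallest element, comparison count); k is 1-based.

    The pivot is always the first element of the current subsequence.
    """
    if not 1 <= k <= len(seq):
        raise ValueError("k out of range")
    items: List[float] = list(seq)
    comparisons = 0
    while True:
        pivot = items[0]
        smaller: List[float] = []
        larger: List[float] = []
        equal = 1  # the pivot itself
        for x in items[1:]:
            comparisons += 1  # one pivot comparison per remaining element
            if x < pivot:
                smaller.append(x)
            elif x > pivot:
                larger.append(x)
            else:
                equal += 1
        if k <= len(smaller):
            items = smaller
        elif k <= len(smaller) + equal:
            return pivot, comparisons
        else:
            k -= len(smaller) + equal
            items = larger


def perturb(seq: Sequence[float], d: float, seed: int = 0) -> List[float]:
    """Add independent uniform noise from [0, d] to every element."""
    rng = random.Random(seed)
    return [x + rng.uniform(0.0, d) for x in seq]


if __name__ == "__main__":
    n = 200
    adversarial = [i / n for i in range(n)]  # sorted input in [0, 1]
    for d in (0.01, 0.1, 1.0):
        _, count = hoare_find(perturb(adversarial, d), n)
        print(d, count)
```

On an already sorted sequence of length n with k = n, this sketch performs n(n-1)/2 comparisons, which matches the quadratic worst case stated in the abstract; increasing `d` lets one observe how noise changes the count.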

CiteSeerX

Springer - Publisher Connector

University of Twente Research Information

$\mathcal{G}$-SELC: Optimization by sequential elimination of level combinations using genetic algorithms and Gaussian processes

Author: Mandal Abhyuday
Ranjan Pritam
Wu C. F. Jeff
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 08/06/2009

Identifying promising compounds from a vast collection of feasible compounds is an important and yet challenging problem in the pharmaceutical industry. An efficient solution to this problem will help reduce the expenditure at the early stages of drug discovery. In an attempt to solve this problem, Mandal, Wu and Johnson [Technometrics 48 (2006) 273--283] proposed the SELC algorithm. Although powerful, it fails to extract substantial information from the data to guide the search efficiently, as this methodology is not based on any statistical modeling. The proposed approach uses Gaussian Process (GP) modeling to improve upon SELC, and is hence named $\mathcal{G}$-SELC. The performance of the proposed methodology is illustrated using four and five dimensional test functions. Finally, we implement the new algorithm on a real pharmaceutical data set for finding a group of chemical compounds with optimal properties.

Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/08-AOAS199 by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref