Localized Cumulative Distributions and a Multivariate Generalization of the Cramér-von Mises Distance
This paper is concerned with distances for comparing multivariate random vectors, with a special focus on the case that at least one of the random vectors is of discrete type, i.e., assumes values from a discrete set only. The first contribution is a new type of characterization of multivariate random quantities, the so-called Localized Cumulative Distribution (LCD), which, in contrast to the conventional definition of a cumulative distribution, is unique and symmetric. Based on the LCDs of the random vectors under consideration, the second contribution is the definition of generalized distance measures suitable for the multivariate case. These distances are used for both analysis and synthesis purposes. Analysis is concerned with assessing whether a given sample stems from a given continuous distribution. Synthesis is concerned with both density estimation, i.e., calculating a suitable continuous approximation of a given sample, and density discretization, i.e., approximating a given continuous random vector by a discrete one.
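The LCD idea can be sketched empirically: instead of the one-sided cumulative distribution, one evaluates the probability mass of a symmetric box around each location, at several box widths. The following is an illustrative sketch only, not the paper's construction; the function names `empirical_lcd` and `lcd_distance` are hypothetical, and the paper's distance is integral-based rather than the grid sum used here.

```python
import random

def empirical_lcd(sample, m, b):
    """Empirical Localized Cumulative Distribution: the fraction of sample
    points falling in the symmetric box [m - b, m + b] in every coordinate.
    Unlike the one-sided CDF, this characterization is symmetric in m."""
    d = len(m)
    count = sum(
        all(abs(x[k] - m[k]) <= b for k in range(d)) for x in sample
    )
    return count / len(sample)

def lcd_distance(sample_a, sample_b, grid, widths):
    """Squared-difference distance between two empirical LCDs, summed over
    a finite grid of locations and widths (a crude discretized stand-in
    for the paper's Cramér-von Mises-type integral distance)."""
    return sum(
        (empirical_lcd(sample_a, m, b) - empirical_lcd(sample_b, m, b)) ** 2
        for m in grid for b in widths
    )

random.seed(0)
a = [(random.random(), random.random()) for _ in range(200)]
b = [(random.random(), random.random()) for _ in range(200)]
grid = [(i / 4, j / 4) for i in range(5) for j in range(5)]
d = lcd_distance(a, b, grid, widths=[0.1, 0.25, 0.5])
print(round(d, 4))
```

Note that this works unchanged whether the samples are draws from continuous distributions or discrete point masses, which is the motivation for the LCD in the first place.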
Sample dispersion is better than sample discrepancy for classification
We want to generate learning data within the context of active learning. First, we recall theoretical results proposing discrepancy as a criterion for generating samples in regression. Surprisingly, we show that these theoretical results about low-discrepancy sequences in regression problems are not adequate for classification problems. Second, we propose dispersion as a criterion for generating data. Finally, we present numerical experiments that agree well with the theory.
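Dispersion, in contrast to discrepancy, measures the radius of the largest empty ball in the sample: how far any location in the domain can be from its nearest sample point. A minimal sketch (assuming the Euclidean notion in 2-D; the paper's exact definition may differ) estimates it on a fine grid:

```python
import math

def dispersion(points, resolution=50):
    """Approximate the dispersion (covering radius) of a point set in
    [0,1]^2: the largest distance from any location in the unit square
    to its nearest sample point, estimated over a finite grid of
    candidate locations. Brute force; illustrative only."""
    worst = 0.0
    for i in range(resolution + 1):
        for j in range(resolution + 1):
            x = (i / resolution, j / resolution)
            nearest = min(math.dist(x, p) for p in points)
            worst = max(worst, nearest)
    return worst

# A regular 4x4 grid of cell centers: every location is within
# 0.125 * sqrt(2) of some sample point.
grid4 = [((i + 0.5) / 4, (j + 0.5) / 4) for i in range(4) for j in range(4)]
print(round(dispersion(grid4), 3))
```

Low dispersion guarantees that every region of the input space, hence every potential decision-boundary region, is probed by some labeled point, which is the intuition behind preferring it to discrepancy for classification.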
Constructing Low Star Discrepancy Point Sets with Genetic Algorithms
Geometric discrepancies are standard measures to quantify the irregularity of
distributions. They are an important notion in numerical integration. One of
the most important discrepancy notions is the so-called \emph{star
discrepancy}. Roughly speaking, a point set of low star discrepancy value
allows for a small approximation error in quasi-Monte Carlo integration. It is
thus the most studied discrepancy notion.
In this work we present a new algorithm to compute point sets of low star
discrepancy. The two components of the algorithm (for the optimization and the
evaluation, respectively) are based on evolutionary principles. Our algorithm
clearly outperforms existing approaches. To the best of our knowledge, it is
also the first algorithm which can be adapted easily to optimize inverse star
discrepancies. (Extended abstract appeared at GECCO 2013; v2 corrects three numbers in a table.)
Heuristic Approaches to Obtain Low-Discrepancy Point Sets via Subset Selection
Building upon the exact methods presented in our earlier work [J. Complexity,
2022], we introduce a heuristic approach for the star discrepancy subset
selection problem. The heuristic gradually improves the current-best subset by
replacing one of its elements at a time. While we prove that the heuristic does
not necessarily return an optimal solution, we obtain very promising results
for all tested dimensions. For example, for moderate point set sizes in dimension 6, we obtain point sets with star
discrepancy up to 35% better than that of the first points of the Sobol'
sequence. Our heuristic works in all dimensions, the main limitation being the
precision of the discrepancy calculation algorithms.
We also provide a comparison with a recent energy functional introduced by
Steinerberger [J. Complexity, 2019], showing that our heuristic performs better
on all tested instances.
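The one-element-exchange idea behind the heuristic can be sketched as follows. This is an illustration of the general scheme under stated assumptions, not the paper's exact procedure; `swap_heuristic` and its parameters are hypothetical names, and the 2-D discrepancy evaluation is the standard critical-box brute force:

```python
import random
from itertools import product

def star_disc(points):
    """Star discrepancy of a small 2-D set, via the finitely many
    critical anchored boxes built from the point coordinates."""
    n = len(points)
    xs = sorted({p[0] for p in points} | {1.0})
    ys = sorted({p[1] for p in points} | {1.0})
    return max(
        max(x * y - sum(p[0] < x and p[1] < y for p in points) / n,
            sum(p[0] <= x and p[1] <= y for p in points) / n - x * y)
        for x, y in product(xs, ys)
    )

def swap_heuristic(superset, k, iterations=300, seed=1):
    """Subset-selection heuristic (sketch): keep a size-k subset of the
    superset and repeatedly try replacing one of its elements with a
    random outside point, accepting any swap that lowers the star
    discrepancy. May stop at a local optimum rather than the true one."""
    rng = random.Random(seed)
    subset = rng.sample(superset, k)
    best = star_disc(subset)
    for _ in range(iterations):
        out = rng.randrange(k)
        cand = rng.choice(superset)
        if cand in subset:
            continue
        trial = subset[:out] + [cand] + subset[out + 1:]
        d = star_disc(trial)
        if d < best:
            subset, best = trial, d
    return subset, best

rng = random.Random(0)
cloud = [(rng.random(), rng.random()) for _ in range(60)]
subset, d = swap_heuristic(cloud, k=12)
print(round(d, 4))
```

As the abstract notes, the quality of any such heuristic is ultimately capped by the precision of the discrepancy evaluation it calls in the inner loop.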
Entropy, Randomization, Derandomization, and Discrepancy
The star discrepancy is a measure of how uniformly distributed a finite point set is in the d-dimensional unit cube. It is related to high-dimensional numerical integration of certain function classes, as expressed by the Koksma-Hlawka inequality. A sharp version of this inequality states that the worst-case error of approximating the integral of functions from the unit ball of some Sobolev space by an equal-weight cubature is exactly the star discrepancy of the set of sample points. In many applications, e.g., in physics, quantum chemistry, or finance, it is essential to approximate high-dimensional integrals. Thus, with regard to the Koksma-Hlawka inequality, the following three questions are very important: (i) What are good bounds, with explicitly given dependence on the dimension d, for the smallest possible discrepancy of any n-point set for moderate n? (ii) How can we efficiently construct point sets that satisfy such bounds? (iii) How can we efficiently calculate the discrepancy of given point sets? We discuss these questions and survey and explain some approaches to tackle them relying on metric entropy, randomization, and derandomization.
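The equal-weight cubature appearing in the Koksma-Hlawka inequality is simply the average of function values over the sample points, so a low-discrepancy point set translates directly into a small integration error. A minimal one-dimensional illustration (assuming the base-2 van der Corput sequence as the low-discrepancy set):

```python
import random

def van_der_corput(k, base=2):
    """k-th element of the base-2 van der Corput low-discrepancy
    sequence, obtained by reversing the digits of k about the point."""
    v, denom = 0.0, 1
    while k:
        k, r = divmod(k, base)
        denom *= base
        v += r / denom
    return v

def cubature(points, f):
    """Equal-weight cubature: the quadrature rule whose worst-case
    error the Koksma-Hlawka inequality bounds via the discrepancy."""
    return sum(f(x) for x in points) / len(points)

f = lambda x: x * x          # integral over [0, 1] is exactly 1/3
n = 1024
qmc_pts = [van_der_corput(i) for i in range(n)]
random.seed(0)
mc_pts = [random.random() for _ in range(n)]

qmc_err = abs(cubature(qmc_pts, f) - 1 / 3)
mc_err = abs(cubature(mc_pts, f) - 1 / 3)
print(round(qmc_err, 6), round(mc_err, 6))
```

For n a power of two, the first n van der Corput points are exactly the equispaced fractions i/n in permuted order, so the quasi-Monte Carlo error here is of order 1/n, versus the typical 1/sqrt(n) behavior of plain Monte Carlo.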