Approximating Tverberg Points in Linear Time for Any Fixed Dimension
Let P be a d-dimensional n-point set. A Tverberg-partition of P is a
partition of P into r sets P_1, ..., P_r such that the convex hulls conv(P_1),
..., conv(P_r) have non-empty intersection. A point in the intersection of the
conv(P_i)'s is called a Tverberg point of depth r for P. A classic result by
Tverberg implies that there always exists a Tverberg partition of size n/(d+1),
but it is not known how to find such a partition in polynomial time. Therefore,
approximate solutions are of interest.
We describe a deterministic algorithm that finds a Tverberg partition of size
n/(4(d+1)^3) in time d^{O(log d)} n. This means that for every fixed dimension we
can compute an approximate Tverberg point (and hence also an approximate
centerpoint) in linear time. Our algorithm is obtained by combining a novel
lifting approach with a recent result by Miller and Sheehy (2010).
Comment: 14 pages, 2 figures. A preliminary version appeared in SoCG 2012.
This version removes an incorrect example at the end of Section 3.
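The definition of a Tverberg partition is easiest to see in one dimension, where the convex hulls are intervals. The following is a minimal sketch for d = 1 only, not the algorithm of the paper; the function names are hypothetical. It pairs the i-th smallest point with the i-th largest, which gives ceil(n/2) nested intervals, matching the n/(d+1) bound for d = 1.

```python
import math


def tverberg_partition_1d(points):
    """Partition points on the real line into ceil(n/2) groups whose
    convex hulls (intervals) all share a common point."""
    pts = sorted(points)
    n = len(pts)
    if n == 0:
        raise ValueError("need at least one point")
    r = math.ceil(n / 2)
    groups = []
    for i in range(r):
        j = n - 1 - i
        # Pair the i-th smallest with the i-th largest point; for odd n
        # the middle point forms a singleton group.
        groups.append([pts[i]] if i == j else [pts[i], pts[j]])
    # The innermost interval is nested inside all others, so its
    # midpoint lies in every convex hull.
    inner = groups[-1]
    tverberg_point = (inner[0] + inner[-1]) / 2
    return groups, tverberg_point


def is_tverberg_point_1d(groups, x):
    """Check that x lies in the interval spanned by every group."""
    return all(min(g) <= x <= max(g) for g in groups)


groups, point = tverberg_partition_1d([7, 1, 4, 9, 3])
print(groups, point, is_tverberg_point_1d(groups, point))
```

In higher dimensions no such sorting argument is available, which is why approximate partitions are of interest.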
Regression Depth and Center Points
We show that, for any set of n points in d dimensions, there exists a
hyperplane with regression depth at least ceiling(n/(d+1)), as had been
conjectured by Rousseeuw and Hubert. Dually, for any arrangement of n
hyperplanes in d dimensions there exists a point that cannot escape to infinity
without crossing at least ceiling(n/(d+1)) hyperplanes. We also apply our
approach to related questions on the existence of partitions of the data into
subsets such that a common plane has nonzero regression depth in each subset,
and to the computational complexity of regression depth problems.
Comment: 14 pages, 3 figures
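For points in the plane (d = 2 in the abstract's convention), the regression depth of a line can be computed by brute force from the signs of the residuals. The sketch below assumes the counting formula for simple regression given by Rousseeuw and Hubert, assumes distinct x-values, and runs in quadratic time; the function name is hypothetical and this is not the method of the paper.

```python
def regression_depth_line(xs, ys, intercept, slope):
    """Regression depth of the line y = intercept + slope * x with respect
    to the points (xs[i], ys[i]); assumes all xs are distinct."""
    n = len(xs)
    if len(set(xs)) != n:
        raise ValueError("this sketch assumes distinct x-values")
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    depth = n
    for v in xs:
        # Count residual signs to the left (x <= v) and right (x > v) of v.
        left_nonneg = sum(1 for x, r in zip(xs, residuals) if x <= v and r >= 0)
        left_nonpos = sum(1 for x, r in zip(xs, residuals) if x <= v and r <= 0)
        right_nonneg = sum(1 for x, r in zip(xs, residuals) if x > v and r >= 0)
        right_nonpos = sum(1 for x, r in zip(xs, residuals) if x > v and r <= 0)
        # Tilting the line about v must pass over one of these two groups.
        depth = min(depth, left_nonneg + right_nonpos, right_nonneg + left_nonpos)
    return depth


xs, ys = [0, 1, 2], [0, 1, 0]
print(regression_depth_line(xs, ys, 0.5, 0.0))   # horizontal line through the data
print(regression_depth_line(xs, ys, 10.0, 0.0))  # line far above all points
```

With n = 3 planar points the bound ceiling(n/(d+1)) equals 1, and the horizontal line y = 0.5 attains it, while a line lying entirely above the data has depth 0.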
Approximating the Distribution of the Median and other Robust Estimators on Uncertain Data
Robust estimators, like the median of a point set, are important for data
analysis in the presence of outliers. We study robust estimators for
locationally uncertain points with discrete distributions. That is, each point
in a data set has a discrete probability distribution describing its location.
The probabilistic nature of uncertain data makes it challenging to compute such
estimators, since the true value of the estimator is now described by a
distribution rather than a single point. We show how to construct and estimate
the distribution of the median of a point set. Building the approximate support
of the distribution takes near-linear time, and assigning probability to that
support takes quadratic time. We also develop a general approximation technique
for distributions of robust estimators with respect to ranges with bounded VC
dimension. This includes the geometric median for high dimensions and the
Siegel estimator for linear regression.
Comment: Full version of a paper to appear at SoCG 201
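To illustrate why the median of uncertain data is a distribution rather than a single value, the sketch below estimates that distribution for one-dimensional points by naive Monte Carlo sampling. This is a baseline under stated assumptions, not the near-linear-time construction of the paper: the input format (a list of `(locations, probabilities)` pairs) and the function name are hypothetical, and the lower median is used so that the estimator always lands on an input location.

```python
import random
from collections import Counter
from statistics import median_low


def sample_median_distribution(uncertain_points, trials=10000, seed=0):
    """Estimate the distribution of the median by sampling realizations.

    uncertain_points: list of (locations, probabilities) pairs, one pair
    per uncertain point, with probabilities summing to 1 for each point.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        # Draw one location for every uncertain point independently.
        realization = [
            rng.choices(locations, weights=probabilities)[0]
            for locations, probabilities in uncertain_points
        ]
        counts[median_low(realization)] += 1
    return {value: count / trials for value, count in sorted(counts.items())}


points = [
    ([0, 10], [0.5, 0.5]),  # either far left or far right
    ([1], [1.0]),           # a certain point
    ([2, 3], [0.9, 0.1]),
]
print(sample_median_distribution(points))
```

For this toy input the exact distribution can be enumerated by hand: the median is 1 with probability 0.5, 2 with probability 0.45, and 3 with probability 0.05, so the sampled frequencies should be close to those values.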
Fast DD-classification of functional data
A fast nonparametric procedure for classifying functional data is introduced.
It consists of a two-step transformation of the original data plus a classifier
operating on a low-dimensional hypercube. The functional data are first mapped
into a finite-dimensional location-slope space and then transformed by a
multivariate depth function into the DD-plot, which is a subset of the unit
hypercube. This transformation yields a new notion of depth for functional
data. Three alternative depth functions are employed for this, as well as two
rules for the final classification on this hypercube. The resulting classifier has
to be cross-validated over a small range of parameters only, which is
restricted by a Vapnik-Cervonenkis bound. The entire methodology does not
involve smoothing techniques, is completely nonparametric, and can achieve
Bayes optimality under standard distributional settings. It is robust,
efficiently computable, and has been implemented in an R environment.
Applicability of the new approach is demonstrated by simulations as well as a
benchmark study.
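The two-step transformation can be sketched as follows. This is a simplified illustration, not the implementation described in the abstract: each discretized curve is reduced to one average level and one average slope, the depth is the Mahalanobis depth (one common choice of multivariate depth), and the final rule is a plain maximum-depth rule swapped in for the classifiers of the paper. All function names are hypothetical.

```python
from statistics import mean


def location_slope(curve, times):
    """Map a discretized curve to (average level, average slope)."""
    level = mean(curve)
    slope = (curve[-1] - curve[0]) / (times[-1] - times[0])
    return (level, slope)


def mahalanobis_depth(z, sample):
    """Depth 1 / (1 + squared Mahalanobis distance) of z w.r.t. a 2-D sample."""
    n = len(sample)
    mx = mean(p[0] for p in sample)
    my = mean(p[1] for p in sample)
    sxx = sum((p[0] - mx) ** 2 for p in sample) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in sample) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in sample) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("degenerate sample covariance")
    dx, dy = z[0] - mx, z[1] - my
    # Quadratic form with the explicit inverse of the 2x2 covariance matrix.
    dist2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + dist2)


def dd_point(curve, times, classes):
    """Depth of the curve with respect to each class: one DD-plot coordinate per class."""
    z = location_slope(curve, times)
    return {
        label: mahalanobis_depth(z, [location_slope(c, times) for c in curves])
        for label, curves in classes.items()
    }


def classify_max_depth(curve, times, classes):
    """Assign the class in which the curve is deepest (maximum-depth rule)."""
    depths = dd_point(curve, times, classes)
    return max(depths, key=depths.get)


times = [0.0, 0.5, 1.0]
classes = {
    "A": [[0, 0.5, 1.0], [0.2, 0.6, 1.4], [-0.1, 0.5, 0.7], [0.1, 0.3, 1.0]],
    "B": [[5, 4.5, 4], [5.2, 4.4, 4.1], [4.8, 4.6, 4.0], [5.1, 4.2, 3.9]],
}
print(classify_max_depth([0.0, 0.4, 0.9], times, classes))
```

Because every depth lies in (0, 1], the vector of class depths is a point of the unit hypercube, which is where the final classification rule operates.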
Multidimensional trimming based on projection depth
As estimators of location parameters, univariate trimmed means are well known
for their robustness and efficiency. They can serve as robust alternatives to
the sample mean while possessing high efficiencies at normal as well as
heavy-tailed models. This paper introduces multidimensional trimmed means based
on projection depth induced regions. Robustness of these depth trimmed means is
investigated in terms of the influence function and finite sample breakdown
point. The influence function captures the local robustness whereas the
breakdown point measures the global robustness of estimators. It is found that
the projection depth trimmed means are highly robust locally as well as
globally. Asymptotics of the depth trimmed means are investigated via those of
the directional radius of the depth induced regions. The strong consistency,
asymptotic representation and limiting distribution of the depth trimmed means
are obtained. Relative to the mean and other leading competitors, the depth
trimmed means are highly efficient at normal or symmetric models and
overwhelmingly more efficient when these models are contaminated. Simulation
studies confirm the validity of the asymptotic efficiency results at finite
samples.
Comment: Published at http://dx.doi.org/10.1214/009053606000000713 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
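In one dimension the idea of a depth trimmed mean is easy to sketch, because the only projection directions are +1 and -1. The code below assumes the usual outlyingness |x - median| / MAD with depth 1 / (1 + outlyingness), and averages the points whose depth is at least a cutoff `alpha`. It is a simplified univariate, unweighted illustration rather than the multidimensional estimator studied in the paper, and the function names are hypothetical.

```python
from statistics import mean, median


def projection_depths_1d(sample):
    """Depth 1 / (1 + |x - median| / MAD) for every point of a 1-D sample."""
    med = median(sample)
    mad = median(abs(x - med) for x in sample)
    if mad == 0:
        raise ValueError("MAD is zero; depth is undefined in this sketch")
    return [1.0 / (1.0 + abs(x - med) / mad) for x in sample]


def depth_trimmed_mean_1d(sample, alpha):
    """Average of the sample points whose depth is at least alpha."""
    depths = projection_depths_1d(sample)
    kept = [x for x, depth in zip(sample, depths) if depth >= alpha]
    if not kept:
        raise ValueError("alpha is too large; no points remain")
    return mean(kept)


data = [1, 2, 3, 4, 100]
print(mean(data))                        # the sample mean is pulled by the outlier
print(depth_trimmed_mean_1d(data, 0.3))  # the outlier is trimmed away
```

Here the outlier 100 has depth 1/98 and is discarded, while the remaining four points all have depth at least 1/3, which illustrates the robustness that the abstract quantifies through the breakdown point.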
Profile extrema for visualizing and quantifying uncertainties on excursion regions. Application to coastal flooding
We consider the problem of describing excursion sets of a real-valued
function f, i.e. the set of inputs where f is above a fixed threshold. Such
regions are hard to visualize if the input space dimension, d, is higher than
2. For a given projection matrix from the input space to a lower dimensional
subspace, we introduce profile sup (inf) functions that
associate to each point in the projection's image the sup (inf) of the function
constrained over the pre-image of this point by the considered projection.
Plots of profile extrema functions convey a simple, although intrinsically
partial, visualization of the set. We consider expensive-to-evaluate functions
where only a very limited number of evaluations, n, is available,
and we surrogate f with a posterior quantity of a Gaussian process
(GP) model. We first compute profile extrema functions for the posterior mean
given n evaluations of f. We quantify the uncertainty on such estimates by
studying the distribution of GP profile extrema with posterior
quasi-realizations obtained from an approximating process. We control such
approximation with a bound inherited from the Borell-TIS inequality. The
technique is applied to analytical functions and to a
coastal flooding test case for a site located on the French Atlantic coast.
Here f is a numerical model returning the area of flooded surface in the
coastal region given some offshore conditions. Profile extrema functions
allowed us to better understand which offshore conditions impact large flooding
events.
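A profile sup (inf) function can be approximated on a grid when the function is cheap to evaluate. The sketch below makes several assumptions that are not in the abstract: the input is two-dimensional, the projection is the coordinate projection onto the first input, the toy function and the names `profile_extrema` and `toy` are hypothetical, and no Gaussian process surrogate is involved.

```python
def profile_extrema(f, eta_grid, free_grid):
    """Grid approximation of the profile sup and inf of f(x1, x2) along x1.

    For each value eta of the first coordinate, take the max and min of f
    over the second coordinate, i.e. over the pre-image of eta.
    """
    sup_profile, inf_profile = [], []
    for eta in eta_grid:
        values = [f(eta, x2) for x2 in free_grid]
        sup_profile.append(max(values))
        inf_profile.append(min(values))
    return sup_profile, inf_profile


def toy(x1, x2):
    """Toy stand-in for an expensive model."""
    return x1 - (x2 - 0.5) ** 2


grid = [i / 10 for i in range(11)]
sup_p, inf_p = profile_extrema(toy, grid, grid)

threshold = 0.5
# Where the profile sup stays at or below the threshold, no input with that
# first coordinate belongs to the excursion set {f > threshold}.
no_excursion = [eta for eta, s in zip(grid, sup_p) if s <= threshold]
# Where the profile inf exceeds the threshold, every such input belongs to it.
all_excursion = [eta for eta, v in zip(grid, inf_p) if v > threshold]
print(no_excursion)
print(all_excursion)
```

Plotting `sup_p` and `inf_p` against `grid` together with the threshold gives the kind of partial one-dimensional view of the excursion set that the abstract describes.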