Realizing Euclidean distance matrices by sphere intersection
This paper presents the theoretical properties of an algorithm to find a realization of a (full) n × n Euclidean distance matrix in the smallest possible embedding dimension. Our algorithm runs in time linear in n and quadratic in the minimum embedding dimension, an improvement over other algorithms.
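For intuition, realizing points in the smallest embedding dimension can be sketched with classical multidimensional scaling, which recovers coordinates from the Gram matrix; this is a standard baseline, not the sphere-intersection algorithm of the paper, and the function name and tolerance are illustrative:

```python
import numpy as np

def realize_edm(D2, tol=1e-9):
    """Realize points from a squared Euclidean distance matrix via
    classical multidimensional scaling (a baseline sketch, not the
    paper's sphere-intersection algorithm)."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    G = -0.5 * J @ D2 @ J                 # Gram matrix of centered points
    w, V = np.linalg.eigh(G)              # eigenvalues in ascending order
    keep = w > tol                        # positive eigenvalues determine
    X = V[:, keep] * np.sqrt(w[keep])     # the smallest embedding dimension
    return X                              # rows are point coordinates
```

The number of eigenvalues above the tolerance is exactly the minimum embedding dimension when the input is a valid Euclidean distance matrix.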
Detecting Simultaneous Integer Relations for Several Real Vectors
An algorithm is presented which, for several given real $n$-dimensional vectors, either finds a nonzero integer vector giving a simultaneous integer relation for them, or proves that no such integer vector with norm less than a given bound exists. The cost of the algorithm is bounded in terms of the dimension and the least Euclidean norm of such integer vectors, and it matches the best complexity upper bound known for this problem. Experimental data show that the algorithm outperforms an existing algorithm in the literature. As an application, the algorithm is used to obtain a complete method for finding the minimal polynomial of an unknown complex algebraic number from its approximation, which runs even faster than the corresponding \emph{Maple} built-in function.
Comment: 10 pages
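To make the problem statement concrete, here is a toy brute-force search for an integer relation among the powers $1, \alpha, \dots, \alpha^d$ of an approximated algebraic number. The paper's algorithm achieves this with exact lattice-style computations; this exhaustive sketch (function name, coefficient bound, and tolerance are all illustrative) only works for tiny degrees:

```python
import itertools
import math

def minimal_poly_guess(alpha, max_deg=2, coeff_bound=3, tol=1e-6):
    """Toy search for an integer relation among 1, alpha, ..., alpha^d,
    i.e. an integer polynomial nearly vanishing at alpha, preferring the
    smallest Euclidean norm.  Real implementations use lattice reduction
    (HJLS/PSLQ-style); brute force here only illustrates the problem."""
    powers = [alpha ** k for k in range(max_deg + 1)]
    best = None
    rng = range(-coeff_bound, coeff_bound + 1)
    for coeffs in itertools.product(rng, repeat=max_deg + 1):
        if all(c == 0 for c in coeffs):
            continue  # the zero vector is excluded
        val = abs(sum(c * p for c, p in zip(coeffs, powers)))
        norm = math.sqrt(sum(c * c for c in coeffs))
        if val < tol and (best is None or norm < best[1]):
            best = (coeffs, norm)
    return best[0] if best else None
```

For $\alpha = \sqrt{2}$ this recovers the coefficients of $x^2 - 2$ (up to sign), the minimal polynomial of $\sqrt{2}$.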
Statistical properties of determinantal point processes in high-dimensional Euclidean spaces
The goal of this paper is to quantitatively describe some statistical properties of higher-dimensional determinantal point processes, with a primary focus on the nearest-neighbor distribution functions. Toward this end, we express these functions as determinants of finite matrices and then extrapolate to the infinite-size limit. This formulation allows for a quick and accurate numerical evaluation of these quantities for point processes in Euclidean spaces of arbitrary dimension. We also implement an algorithm due to Hough \emph{et al.} \cite{hough2006dpa} for generating configurations of determinantal point processes in arbitrary Euclidean spaces, and we utilize this algorithm in conjunction with the aforementioned numerical results to characterize the statistical properties of what we call the Fermi-sphere point process for dimensions one through four. This homogeneous, isotropic determinantal point process, discussed also in a companion paper \cite{ToScZa08}, is the high-dimensional generalization of the distribution of eigenvalues on the unit circle of a random matrix from the circular unitary ensemble (CUE). In addition to the nearest-neighbor probability distribution, we are able to calculate Voronoi cells and nearest-neighbor extrema statistics for the Fermi-sphere point process and discuss these as the dimension is varied. The results in this paper accompany and complement analytical properties of higher-dimensional determinantal point processes developed in \cite{ToScZa08}.
Comment: 42 pages, 17 figures
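The empirical side of such a study is simple to sketch: given a sampled configuration, the nearest-neighbor distances (whose distribution the paper computes exactly) can be estimated directly. The helper below is illustrative only, not the Hough et al. sampler itself:

```python
import numpy as np

def nn_distances(points):
    """Nearest-neighbor distance of each point in a configuration; the
    empirical counterpart of the nearest-neighbor distribution functions
    studied in the paper (illustrative helper, names are assumptions)."""
    diff = points[:, None, :] - points[None, :, :]
    d = np.sqrt((diff ** 2).sum(-1))   # pairwise distance matrix
    np.fill_diagonal(d, np.inf)        # exclude each point's self-distance
    return d.min(axis=1)
```

A histogram of these values over many sampled configurations approximates the nearest-neighbor probability density.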
Non-Gaussian Component Analysis using Entropy Methods
Non-Gaussian component analysis (NGCA) is a problem in multidimensional data
analysis which, since its formulation in 2006, has attracted considerable
attention in statistics and machine learning. In this problem, we have a random
variable $X$ in $n$-dimensional Euclidean space. There is an unknown subspace $E$ such that the orthogonal projection of $X$ onto $E$ is standard multidimensional Gaussian and the orthogonal projection of $X$ onto $E^{\perp}$, the orthogonal complement of $E$, is non-Gaussian, in the sense that all of its one-dimensional marginals are different from the Gaussian in a certain metric defined in terms of moments. The NGCA problem is to approximate the non-Gaussian subspace $E^{\perp}$ given samples of $X$.
Vectors in $E^{\perp}$ correspond to `interesting' directions, whereas vectors in $E$ correspond to directions in which the data is very noisy. The most interesting applications of the NGCA model arise when the magnitude of the noise is comparable to that of the true signal, a setting in which traditional noise-reduction techniques such as PCA do not apply directly.
NGCA is also related to dimension reduction and to other data analysis problems
such as ICA. NGCA-like problems have been studied in statistics for a long time
using techniques such as projection pursuit.
We give an algorithm that takes time polynomial in the dimension $n$ and has an inverse polynomial dependence on the error parameter measuring the angle distance between the non-Gaussian subspace and the subspace output by the algorithm. Our algorithm is based on relative entropy as the contrast function and fits under the projection pursuit framework. The techniques we develop for analyzing our algorithm may be of use for other related problems.
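A crude moment-based stand-in for this projection-pursuit idea can be sketched as follows: score random unit directions by a simple non-Gaussianity measure and keep the best one. The paper's algorithm uses relative entropy as the contrast; the kurtosis surrogate, candidate count, and names below are all illustrative assumptions:

```python
import numpy as np

def most_nongaussian_direction(X, n_candidates=2000, seed=0):
    """Projection-pursuit sketch: score random unit directions by
    |excess kurtosis| of the 1-D projection and return the best one.
    A moment-based surrogate for the entropy contrast in the paper."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xc = (X - X.mean(0)) / X.std(0)          # standardize coordinates
    best_dir, best_score = None, -1.0
    for _ in range(n_candidates):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)               # random unit direction
        p = Xc @ u
        p = (p - p.mean()) / p.std()         # standardize the projection
        score = abs((p ** 4).mean() - 3.0)   # |excess kurtosis|
        if score > best_score:
            best_dir, best_score = u, score
    return best_dir
```

On data that is Gaussian in one coordinate and uniform (hence non-Gaussian) in another, the returned direction aligns with the non-Gaussian coordinate.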
The Bane of Low-Dimensionality Clustering
In this paper, we give a conditional lower bound on the running time for the classic k-median and k-means clustering objectives (where n is the size of the input), even in low-dimensional Euclidean space of dimension four, assuming the Exponential Time Hypothesis (ETH). We also consider k-median (and k-means) with penalties, where each point need not be assigned to a center, in which case it must pay a penalty, and extend our lower bound to at least three-dimensional Euclidean space.
This stands in stark contrast to many other geometric problems, such as the traveling salesman problem or computing an independent set of unit spheres. While these problems benefit from the so-called (limited) blessing of dimensionality, as they can be solved in subexponential time in d dimensions, our work shows that widely-used clustering objectives admit no such speed-up, even in dimension four.
We complete the picture by considering the two-dimensional case: we show a lower bound on the time needed to solve the penalized version, and provide a matching upper bound.
The main tool we use to establish these lower bounds is the placement of
points on the moment curve, which takes its inspiration from constructions of
point sets yielding Delaunay complexes of high complexity.
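The moment-curve placement itself is easy to sketch; the full hardness reduction adds much more structure on top of this, and the function below is only an illustration of the point-placement idea:

```python
import numpy as np

def moment_curve_points(n, d=4):
    """Place n points on the moment curve t -> (t, t^2, ..., t^d),
    the construction underlying the lower-bound instances (placement
    only, not the full reduction)."""
    t = np.arange(1, n + 1, dtype=float)
    return np.vstack([t ** k for k in range(1, d + 1)]).T
```

Points on the moment curve are in general position, which is what makes their Delaunay complexes (and the clustering instances built from them) highly complex.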