121,207 research outputs found

    Realizing Euclidean distance matrices by sphere intersection

    This paper presents the theoretical properties of an algorithm to find a realization of a (full) n × n Euclidean distance matrix in the smallest possible embedding dimension. Our algorithm performs linearly in n, and quadratically in the minimum embedding dimension, which is an improvement w.r.t. other algorithms.
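As a rough illustration of what "realizing" a Euclidean distance matrix means, the sketch below recovers coordinates from a full distance matrix. It is not the sphere-intersection algorithm of the paper: it swaps in a pivoted Cholesky factorization of the Gram matrix taken relative to point 0, and the function name `realize_edm` and the tolerance `tol` are assumptions made for this example.

```python
import math

def realize_edm(D, tol=1e-9):
    """Return coordinates whose pairwise distances reproduce the matrix D.

    Assumption: D is a full, exact Euclidean distance matrix (not squared).
    Technique: pivoted Cholesky of the Gram matrix with point 0 as origin.
    """
    n = len(D)

    def gram(i, j):
        # Inner product of (p_i - p_0) and (p_j - p_0), from the law of cosines.
        return (D[0][i] ** 2 + D[0][j] ** 2 - D[i][j] ** 2) / 2.0

    X = [[] for _ in range(n)]              # one growing coordinate list per point
    resid = [gram(i, i) for i in range(n)]  # squared norm not yet explained
    for _ in range(n):
        p = max(range(n), key=lambda i: resid[i])
        if resid[p] <= tol:
            break                           # every point is already realized
        pivot = math.sqrt(resid[p])
        col = [
            (gram(i, p) - sum(a * b for a, b in zip(X[i], X[p]))) / pivot
            for i in range(n)
        ]
        for i in range(n):
            X[i].append(col[i])
            resid[i] -= col[i] ** 2
    return X

# Usage: the four corners of a unit square need exactly two coordinates.
corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
D = [[math.dist(p, q) for q in corners] for p in corners]
coords = realize_edm(D)
```

The number of coordinates per point returned by the sketch is the embedding dimension it detects; with noisy distances the tolerance would need care.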

    Detecting Simultaneous Integer Relations for Several Real Vectors

    An algorithm is presented which either finds a nonzero integer vector $\mathbf{m}$ for given $t$ real $n$-dimensional vectors $\mathbf{x}_1,\ldots,\mathbf{x}_t$ such that $\mathbf{x}_i^T\mathbf{m}=0$, or proves that no such integer vector with norm less than a given bound exists. The cost of the algorithm is at most $\mathcal{O}(n^4 + n^3 \log \lambda(X))$ exact arithmetic operations in dimension $n$ and the least Euclidean norm $\lambda(X)$ of such integer vectors. It matches the best complexity upper bound known for this problem. Experimental data show that the algorithm is better than an already existing algorithm in the literature. In application, the algorithm is used to get a complete method for finding the minimal polynomial of an unknown complex algebraic number from its approximation, which runs even faster than the corresponding \emph{Maple} built-in function. Comment: 10 pages
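To make the problem statement concrete, here is a naive exhaustive search for a simultaneous integer relation. It is only a brute-force baseline with exponential cost, not the algorithm of the paper; the name `find_simultaneous_relation`, the entry bound `bound`, and the floating-point tolerance `tol` are assumptions of this sketch.

```python
from itertools import product
import math

def find_simultaneous_relation(vectors, bound, tol=1e-9):
    """Search integer vectors m with entries in [-bound, bound] such that
    x^T m is (numerically) zero for every x in `vectors`.

    Candidates are tried in order of increasing Euclidean norm, so the first
    hit has least norm among the candidates searched. Returns None if no
    candidate in the box is a relation.
    """
    n = len(vectors[0])
    candidates = sorted(
        (m for m in product(range(-bound, bound + 1), repeat=n) if any(m)),
        key=lambda m: sum(c * c for c in m),
    )
    for m in candidates:
        if all(abs(sum(x * c for x, c in zip(v, m))) < tol for v in vectors):
            return m
    return None

# Usage: powers of alpha = sqrt(2) satisfy alpha^2 - 2 = 0, so the relation
# found on (1, alpha, alpha^2) encodes the minimal polynomial up to sign.
alpha = math.sqrt(2)
relation = find_simultaneous_relation([[1.0, alpha, alpha ** 2]], bound=2)
```

The usage mirrors the minimal-polynomial application mentioned in the abstract, in the simplest single-vector case.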

    Statistical properties of determinantal point processes in high-dimensional Euclidean spaces

    The goal of this paper is to quantitatively describe some statistical properties of higher-dimensional determinantal point processes with a primary focus on the nearest-neighbor distribution functions. Toward this end, we express these functions as determinants of $N\times N$ matrices and then extrapolate to $N\to\infty$. This formulation allows for a quick and accurate numerical evaluation of these quantities for point processes in Euclidean spaces of dimension $d$. We also implement an algorithm due to Hough \emph{et al.} \cite{hough2006dpa} for generating configurations of determinantal point processes in arbitrary Euclidean spaces, and we utilize this algorithm in conjunction with the aforementioned numerical results to characterize the statistical properties of what we call the Fermi-sphere point process for $d = 1$ to $4$. This homogeneous, isotropic determinantal point process, discussed also in a companion paper \cite{ToScZa08}, is the high-dimensional generalization of the distribution of eigenvalues on the unit circle of a random matrix from the circular unitary ensemble (CUE). In addition to the nearest-neighbor probability distribution, we are able to calculate Voronoi cells and nearest-neighbor extrema statistics for the Fermi-sphere point process and discuss these as the dimension $d$ is varied. The results in this paper accompany and complement analytical properties of higher-dimensional determinantal point processes developed in \cite{ToScZa08}. Comment: 42 pages, 17 figures

    Non-Gaussian Component Analysis using Entropy Methods

    Non-Gaussian component analysis (NGCA) is a problem in multidimensional data analysis which, since its formulation in 2006, has attracted considerable attention in statistics and machine learning. In this problem, we have a random variable $X$ in $n$-dimensional Euclidean space. There is an unknown subspace $\Gamma$ of the $n$-dimensional Euclidean space such that the orthogonal projection of $X$ onto $\Gamma$ is standard multidimensional Gaussian and the orthogonal projection of $X$ onto $\Gamma^{\perp}$, the orthogonal complement of $\Gamma$, is non-Gaussian, in the sense that all its one-dimensional marginals are different from the Gaussian in a certain metric defined in terms of moments. The NGCA problem is to approximate the non-Gaussian subspace $\Gamma^{\perp}$ given samples of $X$. Vectors in $\Gamma^{\perp}$ correspond to `interesting' directions, whereas vectors in $\Gamma$ correspond to the directions where data is very noisy. The most interesting application of the NGCA model is the case when the magnitude of the noise is comparable to that of the true signal, a setting in which traditional noise reduction techniques such as PCA don't apply directly. NGCA is also related to dimension reduction and to other data analysis problems such as ICA. NGCA-like problems have been studied in statistics for a long time using techniques such as projection pursuit. We give an algorithm that takes polynomial time in the dimension $n$ and has an inverse polynomial dependence on the error parameter measuring the angle distance between the non-Gaussian subspace and the subspace output by the algorithm. Our algorithm is based on relative entropy as the contrast function and fits under the projection pursuit framework. The techniques we develop for analyzing our algorithm may be of use for other related problems.

    The Bane of Low-Dimensionality Clustering

    In this paper, we give a conditional lower bound of $n^{\Omega(k)}$ on running time for the classic k-median and k-means clustering objectives (where n is the size of the input), even in low-dimensional Euclidean space of dimension four, assuming the Exponential Time Hypothesis (ETH). We also consider k-median (and k-means) with penalties where each point need not be assigned to a center, in which case it must pay a penalty, and extend our lower bound to at least three-dimensional Euclidean space. This stands in stark contrast to many other geometric problems such as the traveling salesman problem, or computing an independent set of unit spheres. While these problems benefit from the so-called (limited) blessing of dimensionality, as they can be solved in time $n^{O(k^{1-1/d})}$ or $2^{n^{1-1/d}}$ in d dimensions, our work shows that widely-used clustering objectives have a lower bound of $n^{\Omega(k)}$, even in dimension four. We complete the picture by considering the two-dimensional case: we show that there is no algorithm that solves the penalized version in time less than $n^{o(\sqrt{k})}$, and provide a matching upper bound of $n^{O(\sqrt{k})}$. The main tool we use to establish these lower bounds is the placement of points on the moment curve, which takes its inspiration from constructions of point sets yielding Delaunay complexes of high complexity.
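For orientation, the sketch below places points on the moment curve in dimension four and solves a discrete k-median instance by trying every k-subset of the input as centers, which takes roughly $n^{k}$ subset evaluations. It is not the paper's hardness construction: the restriction of centers to input points and the names `moment_curve` and `brute_force_k_median` are assumptions made for this example.

```python
from itertools import combinations
import math

def moment_curve(ts, d=4):
    """Map each parameter t to the point (t, t^2, ..., t^d)."""
    return [tuple(t ** p for p in range(1, d + 1)) for t in ts]

def brute_force_k_median(points, k):
    """Try every k-subset of the input as centers (discrete variant).

    Returns (cost, centers), where cost is the sum over all points of the
    Euclidean distance to the nearest chosen center.
    """
    best_cost, best_centers = math.inf, None
    for centers in combinations(points, k):
        cost = sum(min(math.dist(p, c) for c in centers) for p in points)
        if cost < best_cost:
            best_cost, best_centers = cost, centers
    return best_cost, best_centers

# Usage: five points on the moment curve in dimension four, two centers.
pts = moment_curve([0, 1, 2, 3, 4])
cost, centers = brute_force_k_median(pts, 2)
```

The enumeration over `combinations(points, k)` is what makes the running time grow like $n^{k}$; the lower bound in the abstract says that, under ETH, the exponent cannot be improved to $o(k)$ even in dimension four.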