
    The Hidden Convexity of Spectral Clustering

    In recent years, spectral clustering has become a standard method for data analysis used in a broad range of applications. In this paper we propose a new class of algorithms for multiway spectral clustering based on the optimization of a certain "contrast function" over the unit sphere. These algorithms, partly inspired by certain Independent Component Analysis techniques, are simple, easy to implement, and efficient. Geometrically, the proposed algorithms can be interpreted as hidden basis recovery by means of function optimization. We give a complete characterization of the contrast functions admissible for provable basis recovery. We show how these conditions can be interpreted as a "hidden convexity" of our optimization problem on the sphere; interestingly, we use efficient convex maximization rather than the more common convex minimization. We also show encouraging experimental results on real and simulated data. Comment: 22 pages.
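
    To make the pipeline described in the abstract concrete, here is a minimal Python sketch of multiway spectral clustering by contrast maximization over the unit sphere. It is not the paper's algorithm: the fourth-power contrast, the projected-gradient ascent loop, the deflation step, and the helper names (spectral_embedding, recover_directions, cluster_labels) are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def spectral_embedding(W, k):
    """Rows of the bottom-k eigenvectors of the normalized Laplacian of affinity matrix W."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(W.shape[0]) - D_inv_sqrt @ W @ D_inv_sqrt
    _, V = eigh(L, subset_by_index=[0, k - 1])
    return V  # shape (n, k): one embedded point per data point

def recover_directions(Y, k, steps=200, lr=0.5, seed=0):
    """Maximize a contrast F(u) = mean(g(<y_i, u>)) over the unit sphere,
    deflating after each recovered direction (illustrative choice: g(t) = t^4)."""
    rng = np.random.default_rng(seed)
    basis = []
    P = np.eye(Y.shape[1])                # projector onto the orthogonal complement
    for _ in range(k):
        u = P @ rng.standard_normal(Y.shape[1])
        u /= np.linalg.norm(u)
        for _ in range(steps):
            z = Y @ u
            grad = 4 * (Y * z[:, None] ** 3).mean(axis=0)  # gradient of mean(<y, u>^4)
            u = P @ (u + lr * grad)       # ascend, stay in the complement
            u /= np.linalg.norm(u)        # project back onto the unit sphere
        basis.append(u)
        P = P - np.outer(u, u)            # deflate: remove the recovered direction
    return np.array(basis)

def cluster_labels(Y, basis):
    """Assign each embedded point to the recovered direction it aligns with most."""
    return np.abs(Y @ basis.T).argmax(axis=1)
```

    Under these assumptions, given an affinity matrix W and a target number of clusters k, Y = spectral_embedding(W, k) followed by cluster_labels(Y, recover_directions(Y, k)) yields a k-way labeling.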

    Lower Bounds for the Average and Smoothed Number of Pareto Optima

    Smoothed analysis of multiobjective 0-1 linear optimization has drawn considerable attention recently. The number of Pareto-optimal solutions (i.e., solutions with the property that no other solution is at least as good in all coordinates and better in at least one) for multiobjective optimization problems is the central object of study. In this paper, we prove several lower bounds for the expected number of Pareto optima. Our basic result is a lower bound of $\Omega_d(n^{d-1})$ for optimization problems with $d$ objectives and $n$ variables, under fairly general conditions on the distributions of the linear objectives. Our proof relates the problem of lower-bounding the number of Pareto optima to results in geometry connected to arrangements of hyperplanes. We use our basic result to derive: (1) to our knowledge, the first lower bound for natural multiobjective optimization problems, which we illustrate for the maximum spanning tree problem with randomly chosen edge weights; our technique is sufficiently flexible to yield such lower bounds for other standard objective functions studied in this setting (such as multiobjective shortest path, TSP tour, and matching); and (2) a smoothed lower bound of $\min\{\Omega_d(n^{d-1.5}\,\phi^{(d-\log d)(1-\Theta(1/\phi))}),\ 2^{\Theta(n)}\}$ for $\phi$-semirandom distributions for a version of the 0-1 knapsack problem with $d$ profits, which improves the recent lower bound of Brunsch and Roeglin.
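
    To make the central object concrete, the brute-force sketch below enumerates the Pareto-optimal solutions of a small multiobjective 0-1 linear maximization instance. It only illustrates the definition being counted, not the paper's lower-bound constructions; the helper name pareto_optima and the random instance are assumptions.

```python
import itertools
import numpy as np

def pareto_optima(C):
    """Enumerate Pareto-optimal 0-1 solutions for d linear objectives (maximization).
    C has shape (d, n); a solution x is Pareto optimal iff no other solution is
    at least as good in every objective and strictly better in at least one."""
    d, n = C.shape
    sols = np.array(list(itertools.product([0, 1], repeat=n)))
    vals = sols @ C.T                        # objective values, shape (2^n, d)
    optima = []
    for i, v in enumerate(vals):
        dominated = np.any(np.all(vals >= v, axis=1) & np.any(vals > v, axis=1))
        if not dominated:
            optima.append(sols[i])
    return np.array(optima)

# Example: d = 2 random linear objectives over n = 8 binary variables.
rng = np.random.default_rng(1)
C = rng.standard_normal((2, 8))
print(len(pareto_optima(C)), "Pareto-optimal solutions out of", 2 ** 8)
```

    The paper's basic result says that, already under fairly general distributions on the linear objectives, the expected size of this set is $\Omega_d(n^{d-1})$.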

    Heavy-tailed Independent Component Analysis

    Independent component analysis (ICA) is the problem of efficiently recovering a matrix $A \in \mathbb{R}^{n\times n}$ from i.i.d. observations of $X = AS$, where $S \in \mathbb{R}^n$ is a random vector with mutually independent coordinates. This problem has been intensively studied, but all existing efficient algorithms with provable guarantees require that the coordinates $S_i$ have finite fourth moments. We consider the heavy-tailed ICA problem, where we do not make this assumption; in fact, we do not even require finite second moments. This problem has also received considerable attention in the applied literature. In the present work, we first give a provably efficient algorithm that works under the assumption that, for some constant $\gamma > 0$, each $S_i$ has a finite $(1+\gamma)$-moment, thus substantially weakening the moment requirement for the ICA problem to be solvable. We then give an algorithm that works under the assumption that the matrix $A$ has orthogonal columns but requires no moment assumptions. Our techniques draw ideas from convex geometry and exploit standard properties of the multivariate spherical Gaussian distribution in a novel way. Comment: 30 pages.
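
    The sketch below only sets up the generative model $X = AS$ from the abstract with heavy-tailed independent sources; the choice of Student-t sources with 1.5 degrees of freedom (finite $(1+\gamma)$-moments for $\gamma < 0.5$, infinite variance) and the helper name heavy_tailed_ica_sample are illustrative assumptions, not the paper's algorithm or analysis.

```python
import numpy as np

def heavy_tailed_ica_sample(n=3, m=100_000, df=1.5, seed=0):
    """Draw i.i.d. observations X = A S where the coordinates of S are independent
    Student-t variables with df = 1.5: they have finite (1 + gamma)-moments for
    gamma < 0.5 but infinite variance, so fourth-moment-based ICA guarantees do not apply."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))       # unknown mixing matrix (illustrative choice)
    S = rng.standard_t(df, size=(n, m))   # heavy-tailed independent sources
    X = A @ S                             # observed mixtures, one column per sample
    return X, A, S

X, A, S = heavy_tailed_ica_sample()
# Empirical fourth moments of such sources fluctuate wildly and grow with the sample,
# which is why algorithms relying on finite fourth moments break down in this regime.
print("empirical E[S_1^4] on this sample:", np.mean(S[0] ** 4))
```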