The Hidden Convexity of Spectral Clustering
In recent years, spectral clustering has become a standard method for data
analysis used in a broad range of applications. In this paper we propose a new
class of algorithms for multiway spectral clustering based on optimization of a
certain "contrast function" over the unit sphere. These algorithms, partly
inspired by certain Independent Component Analysis techniques, are simple, easy
to implement and efficient.
Geometrically, the proposed algorithms can be interpreted as hidden basis
recovery by means of function optimization. We give a complete characterization
of the contrast functions admissible for provable basis recovery. We show how
these conditions can be interpreted as a "hidden convexity" of our optimization
problem on the sphere; interestingly, we use efficient convex maximization
rather than the more common convex minimization. We also show encouraging
experimental results on real and simulated data. Comment: 22 pages
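As a rough illustration of maximizing a contrast function over the unit sphere to recover a hidden basis, here is a minimal sketch. It is not the algorithm from the paper: the quartic contrast F(u) = sum_i <u, z_i>^4, the normalized-gradient update, and all names (`contrast_gradient`, `recover_direction`, `hidden_basis`) are assumptions chosen for illustration. The orthonormal basis is used only to define the function being optimized.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def contrast_gradient(u, basis):
    # Gradient of the assumed contrast F(u) = sum_i <u, z_i>^4,
    # which equals 4 * sum_i <u, z_i>^3 * z_i.
    grad = [0.0] * len(u)
    for z in basis:
        c = dot(u, z)
        for k in range(len(u)):
            grad[k] += 4.0 * c ** 3 * z[k]
    return grad

def recover_direction(u0, basis, steps=20):
    # Repeatedly move to the normalized gradient, staying on the unit sphere.
    u = normalize(u0)
    for _ in range(steps):
        u = normalize(contrast_gradient(u, basis))
    return u

# Hypothetical hidden orthonormal basis in the plane: a rotation by theta.
theta = 0.3
hidden_basis = [
    [math.cos(theta), math.sin(theta)],
    [-math.sin(theta), math.cos(theta)],
]
u_star = recover_direction([1.0, 0.0], hidden_basis)
```

In this toy setting the iterate aligns with whichever basis vector has the largest initial correlation with the starting point, here `hidden_basis[0]`. A start that correlates equally with several basis vectors would not separate them, so a random starting point is the usual choice.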
Lower Bounds for the Average and Smoothed Number of Pareto Optima
Smoothed analysis of multiobjective 0-1 linear optimization has drawn
considerable attention recently. The number of Pareto-optimal solutions (i.e.,
solutions with the property that no other solution is at least as good in all
the coordinates and better in at least one) for multiobjective optimization
problems is the central object of study. In this paper, we prove several lower
bounds for the expected number of Pareto optima. Our basic result is a lower
bound of $\Omega_d(n^{d-1})$ for optimization problems with $d$ objectives and $n$
variables under fairly general conditions on the distributions of the linear
objectives. Our proof relates the problem of lower bounding the number of
Pareto optima to results in geometry connected to arrangements of hyperplanes.
We use our basic result to derive: (1) To our knowledge, the first lower bound
for natural multiobjective optimization problems. We illustrate this for the
maximum spanning tree problem with randomly chosen edge weights. Our technique
is sufficiently flexible to yield such lower bounds for other standard
objective functions studied in this setting (such as multiobjective shortest
path, TSP tour, and matching). (2) A smoothed lower bound of
$\min\{\Omega_d(n^{d-1.5} \phi^{(d-\log d)(1-\Theta(1/\phi))}), 2^{\Theta(n)}\}$
for a version of the 0-1 knapsack problem with $d$ profits under
$\phi$-semirandom distributions. This improves the recent lower bound of Brunsch and Roeglin
Heavy-tailed Independent Component Analysis
Independent component analysis (ICA) is the problem of efficiently recovering
a matrix $A$ from i.i.d. observations of $X = AS$,
where $S$ is a random vector with mutually independent
coordinates. This problem has been intensively studied, but all existing
efficient algorithms with provable guarantees require that the coordinates $S_i$
have finite fourth moments. We consider the heavy-tailed ICA problem,
where we make no such assumption, even about the second moment. This problem
has also received considerable attention in the applied literature. In the
present work, we first give a provably efficient algorithm that works under the
assumption that for constant $\gamma > 0$, each $S_i$ has finite
$(1+\gamma)$-moment, thus substantially weakening the moment requirement
for the ICA problem to be solvable. We then give an algorithm that
works under the assumption that the matrix $A$ has orthogonal columns but requires
no moment assumptions. Our techniques draw ideas from convex geometry and
exploit standard properties of the multivariate spherical Gaussian distribution
in a novel way. Comment: 30 pages
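To make the heavy-tailed setting concrete, the following sketch generates observations from the standard linear ICA model, written here with the assumed notation X = AS for a mixing matrix A and a source vector S with independent coordinates. The symmetrized Pareto sources with shape 1.5 are an illustrative assumption: they have finite moments of every order below 1.5 but an infinite second moment. The function names are hypothetical, and the sketch only produces data; it does not implement any recovery algorithm.

```python
import random

def sample_sources(n, alpha, rng):
    # random.paretovariate(alpha) returns values >= 1 with tail
    # P(V > t) = t ** (-alpha); a random sign makes each source symmetric.
    return [rng.choice((-1.0, 1.0)) * rng.paretovariate(alpha) for _ in range(n)]

def mix(A, s):
    # Matrix-vector product A s, with A given as a list of rows.
    return [sum(a * si for a, si in zip(row, s)) for row in A]

def sample_observations(A, alpha, count, seed=0):
    rng = random.Random(seed)
    n = len(A[0])
    sources = [sample_sources(n, alpha, rng) for _ in range(count)]
    observations = [mix(A, s) for s in sources]
    return sources, observations

# Hypothetical 2x2 mixing matrix; shape 1.5 gives an infinite second moment.
A_example = [[2.0, 1.0], [0.0, 3.0]]
sources, observations = sample_observations(A_example, alpha=1.5, count=100)
```

With sources like these, the empirical covariance does not converge as the sample grows, so a whitening step based on it cannot be relied on.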
