The Hidden Convexity of Spectral Clustering
In recent years, spectral clustering has become a standard method for data
analysis used in a broad range of applications. In this paper we propose a new
class of algorithms for multiway spectral clustering based on optimization of a
certain "contrast function" over the unit sphere. These algorithms, partly
inspired by certain Independent Component Analysis techniques, are simple, easy
to implement and efficient.
Geometrically, the proposed algorithms can be interpreted as hidden basis
recovery by means of function optimization. We give a complete characterization
of the contrast functions admissible for provable basis recovery. We show how
these conditions can be interpreted as a "hidden convexity" of our optimization
problem on the sphere; interestingly, we use efficient convex maximization
rather than the more common convex minimization. We also show encouraging
experimental results on real and simulated data.
Comment: 22 pages
The Haar Wavelet Transform of a Dendrogram: Additional Notes
We consider the wavelet transform of a finite, rooted, node-ranked, p-way
tree, focusing on the case of binary (p = 2) trees. We study a Haar wavelet
transform on this tree. Wavelet transforms allow for multiresolution analysis
through translation and dilation of a wavelet function. We explore how this
works in our tree context.
Comment: 37 pp, 1 fig. Supplementary material to "The Haar Wavelet Transform
of a Dendrogram", http://arxiv.org/abs/cs.IR/060810
Estimating parameters of a multipartite loglinear graph model via the EM algorithm
We will amalgamate the Rasch model (for rectangular binary tables) and the
newly introduced α-β models (for random undirected graphs) in the
framework of a semiparametric probabilistic graph model. Our purpose is to give
a partition of the vertices of an observed graph so that the generated
subgraphs and bipartite graphs obey these models, where their strongly
connected parameters give multiscale evaluation of the vertices at the same
time. In this way, a heterogeneous version of the stochastic block model is
built via mixtures of loglinear models and the parameters are estimated with a
special EM iteration. In the context of social networks, the clusters can be
identified with social groups and the parameters with attitudes of people of
one group towards people of the other, which attitudes depend on the cluster
memberships. The algorithm is applied to randomly generated and real-world data
- …