Learning with Algebraic Invariances, and the Invariant Kernel Trick
When solving data analysis problems it is important to integrate prior
knowledge and/or structural invariances. This paper contributes a novel
framework for incorporating algebraic invariance structure into kernels. In
particular, we show that algebraic properties such as sign symmetries in data,
phase independence, and scaling can be incorporated easily by essentially
performing the kernel trick twice. We demonstrate the usefulness of our theory
in simulations on selected applications such as sign-invariant spectral
clustering and underdetermined ICA.
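The "kernel trick twice" construction is only described abstractly in the abstract; as a minimal illustration of one of the invariances mentioned (sign symmetry), a base kernel can be averaged over the sign-flip group so that the result cannot distinguish x from -x. This group-averaging sketch is an illustrative assumption, not necessarily the paper's exact construction:

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """A standard RBF base kernel."""
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def sign_invariant_kernel(x, y, base=rbf):
    """Average the base kernel over sign flips of both arguments, so the
    result is invariant to replacing x with -x or y with -y.
    (Illustrative group averaging, not the paper's exact construction.)"""
    return 0.25 * (base(x, y) + base(-x, y) + base(x, -y) + base(-x, -y))

x = np.array([1.0, -2.0])
y = np.array([0.5, 1.5])
# Invariance check: flipping the sign of either argument changes nothing
assert np.isclose(sign_invariant_kernel(x, y), sign_invariant_kernel(-x, y))
assert np.isclose(sign_invariant_kernel(x, y), sign_invariant_kernel(x, -y))
```

A Gram matrix built from such a symmetrized kernel remains positive semidefinite, since it is an average of kernels.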
Fast Approximate Spectral Clustering for Dynamic Networks
Spectral clustering is a widely studied problem, yet its complexity is
prohibitive for dynamic graphs of even modest size. We claim that it is
possible to reuse information from past cluster assignments to expedite
computation. Our approach builds on a recent idea of sidestepping the main
bottleneck of spectral clustering, i.e., computing the graph eigenvectors, by
using fast Chebyshev graph filtering of random signals. We show that the
proposed algorithm achieves clustering assignments with quality approximating
that of spectral clustering and that it can yield significant complexity
benefits when the graph dynamics are appropriately bounded.
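The key primitive mentioned here, Chebyshev graph filtering of random signals, can be sketched as follows: an ideal low-pass step filter on the Laplacian spectrum is approximated by a Chebyshev polynomial, which is applied to random signals using only matrix-vector products (no eigendecomposition); clustering the filtered rows then approximates spectral clustering. The function below is a minimal dense-matrix sketch under these assumptions, with illustrative names and parameters, not the paper's implementation:

```python
import numpy as np

def chebyshev_filter(L, R, cutoff, order=30, lmax=2.0):
    """Approximately apply the ideal low-pass filter h(lam) = 1[lam <= cutoff]
    to the signals R (columns) using an order-`order` Chebyshev polynomial
    of the graph Laplacian L, whose spectrum is assumed to lie in [0, lmax]."""
    # Chebyshev interpolation coefficients of the step function on [0, lmax]
    n_pts = order + 1
    theta = (np.arange(n_pts) + 0.5) * np.pi / n_pts
    lam = lmax / 2.0 * (np.cos(theta) + 1.0)      # Chebyshev nodes mapped to [0, lmax]
    h = (lam <= cutoff).astype(float)
    coeffs = np.array([2.0 / n_pts * np.sum(h * np.cos(j * theta))
                       for j in range(n_pts)])
    coeffs[0] /= 2.0
    # Three-term recurrence on the shifted operator (2/lmax) L - I
    a = lmax / 2.0
    T_prev = R
    T_curr = (L @ R) / a - R
    out = coeffs[0] * T_prev + coeffs[1] * T_curr
    for j in range(2, n_pts):
        T_next = 2.0 * ((L @ T_curr) / a - T_curr) - T_prev
        out = out + coeffs[j] * T_next
        T_prev, T_curr = T_curr, T_next
    return out
```

Filtering a batch of Gaussian random signals this way yields row vectors that approximately span the low-frequency eigenspace; running k-means on those rows gives the approximate spectral clustering, and only the filtering step needs to be repeated as the graph changes.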
A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks
Recently much attention has been devoted to the construction of phylogenetic
networks which generalize phylogenetic trees in order to accommodate complex
evolutionary processes. Here we present an efficient, practical algorithm for
reconstructing level-1 phylogenetic networks - a type of network slightly more
general than a phylogenetic tree - from triplets. Our algorithm has been made
publicly available as the program LEV1ATHAN. It combines ideas from several
known theoretical algorithms for phylogenetic tree and network reconstruction
with two novel subroutines, namely an exponential-time exact algorithm and a
greedy algorithm, both of which are of independent theoretical interest. Most
importantly, LEV1ATHAN runs in polynomial time and always constructs a level-1
network. If the data is consistent with a phylogenetic tree, then the algorithm
constructs such a tree. Moreover, if the input triplet set is dense and, in
addition, is fully consistent with some level-1 network, it will find such a
network. The potential of LEV1ATHAN is explored by means of an extensive
simulation study and a biological data set. One of our conclusions is that
LEV1ATHAN is able to construct networks consistent with a high percentage of
input triplets, even when these input triplets are affected by a low to
moderate level of noise.
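LEV1ATHAN itself is not reproduced here, but the basic notion its input rests on is easy to state: a rooted triplet xy|z is displayed by (consistent with) a rooted tree exactly when the lowest common ancestor of x and y lies strictly below the lowest common ancestor of all three leaves. A small sketch of that check, assuming a child-to-parent map representation of the tree (an illustrative choice, not from the paper):

```python
def ancestors(parent, v):
    """Root path of v (including v) in a rooted tree given as a
    child -> parent map; the root maps to None."""
    path = []
    while v is not None:
        path.append(v)
        v = parent.get(v)
    return path

def lca(parent, u, v):
    """Lowest common ancestor, found by intersecting root paths."""
    on_v_path = set(ancestors(parent, v))
    for node in ancestors(parent, u):
        if node in on_v_path:
            return node
    raise ValueError("nodes are in different trees")

def displays_triplet(parent, x, y, z):
    """True iff the rooted triplet xy|z is displayed by the tree, i.e.
    lca(x, y) is a strict descendant of lca({x, y, z})."""
    return lca(parent, x, y) != lca(parent, lca(parent, x, y), z)
```

For the tree ((x,y),z), the triplet xy|z is displayed while xz|y is not, which is exactly the distinction a triplet-based reconstruction method consumes.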
Robust hierarchical k-center clustering
One of the most popular and widely used methods for data clustering is hierarchical clustering. This clustering technique has proved useful for revealing interesting structure in data in applications ranging from computational biology to computer vision. Robustness is an important feature of a clustering technique if we require the clustering to be stable against small perturbations in the input data. In most applications, a clustering output that is robust against adversarial outliers or stochastic noise is a necessary condition for the applicability and effectiveness of the clustering technique. This is even more critical in hierarchical clustering, where a small change at the bottom of the hierarchy may propagate all the way through to the top. Despite all the previous work [2, 3, 6, 8], our theoretical understanding of robust hierarchical clustering is still limited, and several hierarchical clustering algorithms are not known to satisfy such robustness properties. In this paper, we study the limits of robust hierarchical k-center clustering by introducing the concept of universal hierarchical clustering, and we provide (almost) tight lower and upper bounds for the robust hierarchical k-center clustering problem with outliers and for variants of the stochastic clustering problem. Most importantly, we present a constant-factor approximation for optimal hierarchical k-center with at most z outliers using a universal set of at most O(z^2) outliers, and we show that this result is tight. Moreover, we show the necessity of using a universal set of outliers in order to compute an approximately optimal hierarchical k-center with a different set of outliers for each k.
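For context on the non-robust baseline this abstract builds on: the classical farthest-first traversal of Gonzalez gives a simple 2-approximation for (flat) k-center, and outlier-robust variants must additionally decide which z points to discard. The sketch below shows only that classical baseline, not the paper's universal-outlier algorithm; names and parameters are illustrative:

```python
import numpy as np

def gonzalez_k_center(points, k, seed=0):
    """Classical farthest-first traversal: a 2-approximation for k-center
    without outliers. Returns the chosen center indices and the covering
    radius (max distance from any point to its nearest center)."""
    rng = np.random.default_rng(seed)
    n = len(points)
    centers = [int(rng.integers(n))]           # arbitrary first center
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))             # farthest point becomes a center
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers, float(dist.max())
```

A robust variant would, at each level of the hierarchy, be allowed to drop up to z points before measuring the radius; the paper's result says a single universal set of O(z^2) outliers can serve every level at once.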