Provable Self-Representation Based Outlier Detection in a Union of Subspaces
Many computer vision tasks involve processing large amounts of data
contaminated by outliers, which need to be detected and rejected. While outlier
detection methods based on robust statistics have existed for decades, only
recently have methods based on sparse and low-rank representation been
developed along with guarantees of correct outlier detection when the inliers
lie in one or more low-dimensional subspaces. This paper proposes a new outlier
detection method that combines tools from sparse representation with random
walks on a graph. By exploiting the property that data points can be expressed
as sparse linear combinations of each other, we obtain an asymmetric affinity
matrix among data points, which we use to construct a weighted directed graph.
By defining a suitable Markov Chain from this graph, we establish a connection
between inliers/outliers and essential/inessential states of the Markov chain,
which allows us to detect outliers by using random walks. We provide a
theoretical analysis that justifies the correctness of our method under
geometric and connectivity assumptions. Experimental results on image databases
demonstrate its superiority with respect to state-of-the-art sparse and
low-rank outlier detection methods. Comment: 16 pages. CVPR 2017 spotlight oral presentation.
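The random-walk idea in this abstract can be sketched in a few lines. The code below is an illustrative toy, not the paper's method: it uses a plain least-squares self-representation as a simplified stand-in for the paper's sparse representation, and all names are assumptions.

```python
import numpy as np

def outlier_scores(X, n_steps=200):
    """Score points by random-walk mass on a self-representation graph.

    X: (d, n) data matrix, one point per column. The plain
    least-squares self-representation here is a simplified
    stand-in for the paper's sparse representation.
    """
    d, n = X.shape
    C = np.zeros((n, n))
    for j in range(n):
        # Express x_j as a linear combination of the other points.
        idx = [i for i in range(n) if i != j]
        c, *_ = np.linalg.lstsq(X[:, idx], X[:, j], rcond=None)
        C[idx, j] = c
    # W[j, i] = |c_ij|: weight of a walk step from point j to a
    # point i that helps represent it (an asymmetric affinity).
    W = np.abs(C).T
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    # Iterate the walk; mass drains from inessential states
    # (outliers) toward essential ones (inliers).
    pi = np.full(n, 1.0 / n)
    for _ in range(n_steps):
        pi = pi @ P
    return pi  # small values flag likely outliers
```

The direction of the edges matters: a point's walk steps lead to the points that represent it, so inliers (which represent each other) form essential states while outliers are transient.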
Sampling random graph homomorphisms and applications to network data analysis
A graph homomorphism is a map between two graphs that preserves adjacency
relations. We consider the problem of sampling a random graph homomorphism from
a graph into a large network. We propose two complementary
MCMC algorithms for sampling random graph homomorphisms and establish bounds
on their mixing times and concentration of their time averages. Based on our
sampling algorithms, we propose a novel framework for network data analysis
that circumvents some of the drawbacks in methods based on independent and
neighborhood sampling. Various time averages of the MCMC trajectory give us
various computable observables, including well-known ones such as homomorphism
density and average clustering coefficient and their generalizations.
Furthermore, we show that these network observables are stable with respect to
a suitably renormalized cut distance between networks. We provide various
examples and simulations demonstrating our framework through synthetic
networks. We also apply our framework for network clustering and classification
problems using the Facebook100 dataset and Word Adjacency Networks of a set of
classic novels. Comment: 51 pages, 33 figures, 2 tables.
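A minimal Glauber-dynamics sketch of this kind of homomorphism sampler, assuming graphs are given as vertex-to-neighbor-set dicts. The path-into-cycle example, the observable, and all names are illustrative assumptions, not the paper's implementation.

```python
import random

def glauber_step(F_adj, G_adj, x):
    """One MCMC update of a homomorphism x: V(F) -> V(G).

    F_adj, G_adj: vertex -> set-of-neighbors dicts. A randomly
    chosen vertex of F has its image resampled uniformly over the
    G-vertices adjacent to the images of all its F-neighbors,
    which preserves the adjacency-preserving (homomorphism) property.
    """
    v = random.choice(list(F_adj))
    imgs = [x[u] for u in F_adj[v]]
    cand = set.intersection(*(G_adj[w] for w in imgs)) if imgs else set(G_adj)
    if cand:  # stay put if no consistent image exists
        x[v] = random.choice(sorted(cand))
    return x

# Illustrative example: map a 3-vertex path into the 5-cycle, then
# time-average an observable along the trajectory (here, how often
# the middle vertex lands on an even-numbered cycle vertex).
F_adj = {0: {1}, 1: {0, 2}, 2: {1}}
G_adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
x = {0: 0, 1: 1, 2: 2}  # a valid starting homomorphism
random.seed(0)
hits = 0
for t in range(1000):
    x = glauber_step(F_adj, G_adj, x)
    hits += x[1] % 2 == 0
print(hits / 1000)  # a computable time-average observable
```

Because each update only resamples one image consistently with its neighbors, every state visited along the chain remains a valid homomorphism, which is what makes the time averages meaningful observables.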