On Consistency of Graph-based Semi-supervised Learning
Graph-based semi-supervised learning is one of the most popular methods in
machine learning. Some of its theoretical properties such as bounds for the
generalization error and the convergence of the graph Laplacian regularizer
have been studied in the computer science and statistics literature. However, a
fundamental statistical property, the consistency of the estimator produced by this
method, has not been proved. In this article, we study the consistency problem
under a non-parametric framework. We prove the consistency of graph-based
learning when the estimated scores are constrained to equal the
observed responses for the labeled data. The sample sizes of both labeled and
unlabeled data are allowed to grow in this result. When the estimated scores
are not required to be equal to the observed responses, a tuning parameter is
used to balance the loss function and the graph Laplacian regularizer. We give
a counterexample demonstrating that the estimator for this case can be
inconsistent. The theoretical findings are supported by numerical studies.
Comment: This paper has been accepted by the 2019 IEEE 39th International Conference on
Distributed Computing Systems (ICDCS).
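The two estimators contrasted in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the Gaussian similarity graph, the squared-error loss, the unnormalized Laplacian, and the names gaussian_weights, hard_constraint_scores, soft_constraint_scores, bandwidth, and lam are all assumptions chosen for illustration. The first estimator forces the scores to equal the observed responses on labeled points; the second trades the loss against the graph Laplacian regularizer through a tuning parameter.

```python
import numpy as np


def gaussian_weights(X, bandwidth=1.0):
    """Dense Gaussian similarity matrix with a zero diagonal (assumed graph)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def hard_constraint_scores(W, y_labeled):
    """Minimize f^T L f subject to f_i = y_i on the labeled points.

    Assumes the first len(y_labeled) rows of W are the labeled points.
    """
    n = len(y_labeled)
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    # Stationarity on the unlabeled block: L_uu f_u + L_ul y = 0.
    f_unlabeled = np.linalg.solve(L[n:, n:], -L[n:, :n] @ y_labeled)
    return np.concatenate([y_labeled, f_unlabeled])


def soft_constraint_scores(W, y_labeled, lam):
    """Minimize sum over labeled i of (y_i - f_i)^2 + lam * f^T L f."""
    n, total = len(y_labeled), W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    J = np.zeros((total, total))
    J[:n, :n] = np.eye(n)  # selects the labeled coordinates
    rhs = np.zeros(total)
    rhs[:n] = y_labeled
    # Setting the gradient to zero gives (J + lam * L) f = J y.
    return np.linalg.solve(J + lam * L, rhs)
```

In this sketch the soft estimator approaches the hard-constraint solution as lam shrinks toward zero. The counterexample mentioned in the abstract concerns the soft case; this code does not reproduce it.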
Community extraction for social networks
Analysis of networks and in particular discovering communities within
networks has been a focus of recent work in several fields, with applications
ranging from citation and friendship networks to food webs and gene regulatory
networks. Most of the existing community detection methods focus on
partitioning the entire network into communities, with the expectation of many
ties within communities and few ties between. However, many networks contain
nodes that do not fit in with any of the communities, and forcing every node
into a community can distort results. Here we propose a new framework that
focuses on community extraction instead of partition, extracting one community
at a time. The main idea behind extraction is that the strength of a community
should not depend on ties between members of other communities, but only on
ties within that community and its ties to the outside world. We show that the
new extraction criterion performs well on simulated and real networks, and
establish asymptotic consistency of our method under the block model
assumption.
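The main idea of extraction can be made concrete with a small sketch. The score below is one way to encode that idea and is not necessarily the criterion used in the paper: it compares the density of ties inside a candidate set with the density of ties from the set to the rest of the network, and it ignores ties among outside nodes. The function name extraction_score and the exact normalization are assumptions for illustration.

```python
import numpy as np


def extraction_score(A, members):
    """Within-set tie density minus set-to-outside tie density.

    A is a symmetric 0/1 adjacency matrix with a zero diagonal, and members
    is an iterable of node indices forming the candidate community.
    Ties between two outside nodes never enter the score.
    """
    n = A.shape[0]
    mask = np.zeros(n, dtype=bool)
    mask[list(members)] = True
    k = int(mask.sum())
    if k == 0 or k == n:
        raise ValueError("candidate set must be a proper, non-empty subset")
    within = A[np.ix_(mask, mask)].sum()    # each internal tie counted twice
    outward = A[np.ix_(mask, ~mask)].sum()  # ties from the set to the outside
    return within / k ** 2 - outward / (k * (n - k))
```

A search procedure would then look for the set that maximizes such a score, remove it, and repeat on the remaining nodes to extract one community at a time. The search itself is left out of this sketch.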
