15,301 research outputs found
Active Discovery of Network Roles for Predicting the Classes of Network Nodes
Nodes in real world networks often have class labels, or underlying
attributes, that are related to the way in which they connect to other nodes.
Sometimes this relationship is simple, for instance nodes of the same class are
may be more likely to be connected. In other cases, however, this is not true,
and the way that nodes link in a network exhibits a different, more complex
relationship to their attributes. Here, we consider networks in which we know
how the nodes are connected, but we do not know the class labels of the nodes
or how class labels relate to the network links. We wish to identify the best
subset of nodes to label in order to learn this relationship between node
attributes and network links. We can then use this discovered relationship to
accurately predict the class labels of the rest of the network nodes.
We present a model that identifies groups of nodes with similar link
patterns, which we call network roles, using a generative blockmodel. The model
then predicts labels by learning the mapping from network roles to class labels
using a maximum margin classifier. We choose a subset of nodes to label
according to an iterative margin-based active learning strategy. By integrating
the discovery of network roles with the classifier optimisation, the active
learning process can adapt the network roles to better represent the network
for node classification. We demonstrate the model by exploring a selection of
real world networks, including a marine food web and a network of English
words. We show that, in contrast to other network classifiers, this model
achieves good classification accuracy for a range of networks with different
relationships between class labels and network links
PAC-Bayes and Domain Adaptation
We provide two main contributions in PAC-Bayesian theory for domain
adaptation where the objective is to learn, from a source distribution, a
well-performing majority vote on a different, but related, target distribution.
Firstly, we propose an improvement of the previous approach we proposed in
Germain et al. (2013), which relies on a novel distribution pseudodistance
based on a disagreement averaging, allowing us to derive a new tighter domain
adaptation bound for the target risk. While this bound stands in the spirit of
common domain adaptation works, we derive a second bound (introduced in Germain
et al., 2016) that brings a new perspective on domain adaptation by deriving an
upper bound on the target risk where the distributions' divergence-expressed as
a ratio-controls the trade-off between a source error measure and the target
voters' disagreement. We discuss and compare both results, from which we obtain
PAC-Bayesian generalization bounds. Furthermore, from the PAC-Bayesian
specialization to linear classifiers, we infer two learning algorithms, and we
evaluate them on real data.Comment: Neurocomputing, Elsevier, 2019. arXiv admin note: substantial text
overlap with arXiv:1503.0694
- …