Multi-view constrained clustering with an incomplete mapping between views
Multi-view learning algorithms typically assume a complete bipartite mapping
between the different views in order to exchange information during the
learning process. However, many applications provide only a partial mapping
between the views, creating a challenge for current methods. To address this
problem, we propose a multi-view algorithm based on constrained clustering that
can operate with an incomplete mapping. Given a set of pairwise constraints in
each view, our approach propagates these constraints using a local similarity
measure to those instances that can be mapped to the other views, allowing the
propagated constraints to be transferred across views via the partial mapping.
It uses co-EM to iteratively estimate the propagation within each view based on
the current clustering model, transfer the constraints across views, and then
update the clustering model. By alternating the learning process between views,
this approach produces a unified clustering model that is consistent with all
views. We show that this approach significantly improves clustering performance
over several other methods for transferring constraints and allows multi-view
clustering to be reliably applied when given a limited mapping between the
views. Our evaluation reveals that the propagated constraints have high
precision with respect to the true clusters in the data, explaining their
benefit to clustering performance in both single- and multi-view learning
scenarios.
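
The propagate-then-transfer step described in this abstract can be illustrated with a minimal sketch. It is not the authors' co-EM procedure: the Gaussian similarity, the product weighting, the `threshold` cutoff, and the names `rbf` and `propagate_and_transfer` are all assumptions made for illustration only.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian similarity between two points (an assumed local similarity measure)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def propagate_and_transfer(points, constraints, mapping, threshold=0.5):
    """Propagate pairwise constraints within one view, then transfer them
    to the other view through a partial mapping (hypothetical helper).

    points:      feature vectors of the source view
    constraints: list of (i, j, kind) with kind "must" or "cannot"
    mapping:     dict source index -> target index; unmapped instances are absent
    Returns {(u, v): (kind, weight)} over target-view index pairs.
    """
    transferred = {}
    mapped = sorted(mapping)
    for i, j, kind in constraints:
        for p in mapped:
            for q in mapped:
                if p == q:
                    continue
                # Weight: how close the mapped pair (p, q) lies to the constrained pair (i, j).
                w = rbf(points[p], points[i]) * rbf(points[q], points[j])
                if w >= threshold:
                    key = (mapping[p], mapping[q])
                    # Keep the strongest propagated constraint for each target pair.
                    if key not in transferred or transferred[key][1] < w:
                        transferred[key] = (kind, w)
    return transferred

# Instances 0 and 2 carry a cannot-link but have no counterpart in the other view;
# their close neighbours 1 and 3 do, so the constraint reaches the pair (10, 11).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
result = propagate_and_transfer(points, [(0, 2, "cannot")], {1: 10, 3: 11})
print(result)
```

In this toy setting the constrained instances themselves are unmapped, which is the situation the partial mapping creates; the constraint only crosses views because it was first propagated to mapped neighbours.
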
Understanding and modeling the small-world phenomenon in dynamic networks
The small-world phenomenon, first introduced in the context of static graphs, describes graphs with a high clustering coefficient and a low shortest path length. This is an intrinsic property of many real complex static networks. Recent research has shown that this structure is also observable in dynamic networks, but how it emerges remains an open problem. In this paper, we propose a model capable of capturing the small-world behavior observed in various real traces. We then study information diffusion in such small-world networks. Analytical and simulation results with an epidemic model show that the small-world structure dramatically increases the information spreading speed in dynamic networks.
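
The two quantities that define the small-world property can be computed directly on a static graph. The sketch below is an illustration only, not the dynamic-network model of the paper: the ring lattice, the single shortcut, and all function names are assumptions chosen to show how one long-range edge shortens paths.

```python
from collections import deque

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        nb = list(nbrs)
        k = len(nb)
        if k < 2:
            continue
        links = sum(1 for a in range(k) for b in range(a + 1, k) if nb[b] in adj[nb[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def average_shortest_path(adj):
    """Mean hop distance over all ordered pairs of mutually reachable nodes (BFS from each node)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def ring_lattice(n, k):
    """Undirected ring in which each node links to its k nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

lattice = ring_lattice(12, 2)
c_lattice = average_clustering(lattice)
l_lattice = average_shortest_path(lattice)

# Add one long-range shortcut between opposite nodes.
shortcut = ring_lattice(12, 2)
shortcut[0].add(6)
shortcut[6].add(0)
l_shortcut = average_shortest_path(shortcut)
print(c_lattice, l_lattice, l_shortcut)
```
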
Estimating Maximally Probable Constrained Relations by Mathematical Programming
Estimating a constrained relation is a fundamental problem in machine
learning. Special cases are classification (the problem of estimating a map
from a set of to-be-classified elements to a set of labels), clustering (the
problem of estimating an equivalence relation on a set) and ranking (the
problem of estimating a linear order on a set). We contribute a family of
probability measures on the set of all relations between two finite, non-empty
sets, which offers a joint abstraction of multi-label classification,
correlation clustering and ranking by linear ordering. Estimating (learning) a
maximally probable measure, given (a training set of) related and unrelated
pairs, is a convex optimization problem. Estimating (inferring) a maximally
probable relation, given a measure, is a 0-1 linear program. It is solved in
linear time for maps. It is NP-hard for equivalence relations and linear
orders. Practical solutions for all three cases are shown in experiments with
real data. Finally, estimating a maximally probable measure and relation
jointly is posed as a mixed-integer nonlinear program. This formulation
suggests a mathematical programming approach to semi-supervised learning.
Comment: 16 pages
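
To see why inference can be linear-time for maps, consider a simplified setting in which each pair (a, b) is related independently with probability p_ab. This independence is an assumption made here for illustration and may not match the family of measures in the paper. The log-probability of a relation x is then the sum of x_ab * log(p_ab / (1 - p_ab)) plus a constant, and the constraint that each element receives exactly one label decouples the problem into one argmax per element. The function name `most_probable_map` is hypothetical.

```python
import math

def most_probable_map(prob):
    """Return the most probable map under independent pair probabilities.

    prob[a][b] is the assumed probability, strictly between 0 and 1,
    that element a is related to label b.
    """
    def log_odds(p):
        # Coefficient of x_ab in the 0-1 linear objective.
        return math.log(p) - math.log(1.0 - p)

    # One pass over all pairs: pick the label with the largest coefficient for each element.
    return {a: max(row, key=lambda b: log_odds(row[b])) for a, row in prob.items()}

prob = {
    "x": {"cat": 0.7, "dog": 0.2},
    "y": {"cat": 0.1, "dog": 0.6},
}
labels = most_probable_map(prob)
print(labels)
```

Equivalence relations and linear orders couple the pairs through transitivity constraints, so no such per-element decomposition is available for them.
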
Subgraphs and network motifs in geometric networks
Many real-world networks describe systems in which interactions decay with
the distance between nodes. Examples include systems constrained in real space
such as transportation and communication networks, as well as systems
constrained in abstract spaces such as multivariate biological or economic
datasets and models of social networks. These networks often display network
motifs: subgraphs that recur in the network much more often than in randomized
networks. To understand the origin of the network motifs in these networks, it
is important to study the subgraphs and network motifs that arise solely from
geometric constraints. To address this, we analyze geometric network models, in
which nodes are arranged on a lattice and edges are formed with a probability
that decays with the distance between nodes. We present analytical solutions
for the numbers of all 3- and 4-node subgraphs, in both directed and
non-directed geometric networks. We also analyze geometric networks with
arbitrary degree sequences, and models with a field that biases for directed
edges in one direction. Scaling rules for subgraph numbers with
system size, lattice dimension and interaction range are given. Several
invariant measures are found, such as the ratio of feedback and feed-forward
loops, which do not depend on system size, dimension or connectivity function.
We find that network motifs in many real-world networks, including social
networks and neuronal networks, are not captured solely by these geometric
models. This is in line with recent evidence that biological network motifs
were selected as basic circuit elements with defined information-processing
functions.
Comment: 9 pages, 6 figures
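
A geometric network of the kind described can be simulated and its three-node subgraphs counted by brute force. The sketch below is not the paper's analytical solution: the one-dimensional ring lattice, the exponential connectivity function exp(-d / scale), and the function names are assumptions for illustration. Counts are of non-induced occurrences, so extra edges among the three nodes are allowed.

```python
import itertools
import math
import random

def geometric_digraph(n, scale, rng):
    """Directed edges on a ring of n nodes; each edge i -> j appears
    independently with probability exp(-d / scale), d being the ring distance."""
    edges = set()
    for i in range(n):
        for j in range(n):
            if i != j:
                d = min(abs(i - j), n - abs(i - j))
                if rng.random() < math.exp(-d / scale):
                    edges.add((i, j))
    return edges

def count_triads(n, edges):
    """Count feed-forward loops (a->b, b->c, a->c) and feedback loops (a->b, b->c, c->a)."""
    ffl = fbl = 0
    for a, b, c in itertools.permutations(range(n), 3):
        if (a, b) in edges and (b, c) in edges:
            if (a, c) in edges:
                ffl += 1
            if (c, a) in edges:
                fbl += 1  # each directed 3-cycle is seen once per rotation
    return ffl, fbl // 3

edges = geometric_digraph(30, 1.5, random.Random(0))
ffl, fbl = count_triads(30, edges)
print(len(edges), ffl, fbl)
```

Repeating the count for several values of `n` and `scale` gives a numerical way to explore how subgraph numbers and the feedback to feed-forward ratio behave under this assumed model.
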