Exhaustive and Efficient Constraint Propagation: A Semi-Supervised Learning Perspective and Its Applications
This paper presents a novel pairwise constraint propagation approach by
decomposing the challenging constraint propagation problem into a set of
independent semi-supervised learning subproblems which can be solved in
quadratic time using label propagation based on k-nearest neighbor graphs.
Considering that this time cost is proportional to the number of all possible
pairwise constraints, our approach actually provides an efficient solution for
exhaustively propagating pairwise constraints throughout the entire dataset.
The resulting exhaustive set of propagated pairwise constraints is then
used to adjust the similarity matrix for constrained spectral clustering. Beyond
the traditional constraint propagation on single-source data, our approach
is also extended to more challenging constraint propagation on multi-source
data where each pairwise constraint is defined over a pair of data points from
different sources. This multi-source constraint propagation has an important
application to cross-modal multimedia retrieval. Extensive results have shown
the superior performance of our approach. Comment: The short version of this paper appears as an oral paper in ECCV 201
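
A minimal sketch of how the decomposition described above might look, not the authors' implementation. It assumes the standard label propagation update F <- alpha*S*F + (1-alpha)*Y on a symmetric k-nearest-neighbor graph, treats each column of the constraint matrix Z (+1 must-link, -1 cannot-link, 0 unknown) as one independent subproblem, and then repeats the propagation along the rows. The function names, the Gaussian edge weights, and the parameters `alpha` and `iters` are illustrative assumptions. The code favors readability over speed and does not reproduce the quadratic running time claimed in the abstract.

```python
import math

def knn_affinity(points, k):
    """Symmetric k-NN affinity matrix with Gaussian weights (hypothetical helper)."""
    n = len(points)
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    scale = sum(map(sum, d2)) / (n * n) or 1.0  # mean squared distance as bandwidth
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = sorted((j for j in range(n) if j != i), key=lambda j: d2[i][j])[:k]
        for j in nbrs:
            # keep an edge if either endpoint lists the other as a neighbor
            W[i][j] = W[j][i] = math.exp(-d2[i][j] / scale)
    return W

def label_propagation(W, Y, alpha=0.8, iters=200):
    """Iterate F <- alpha*S*F + (1-alpha)*Y with S = D^(-1/2) W D^(-1/2)."""
    n, m = len(W), len(Y[0])
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][t] * F[t][c] for t in range(n)) + (1 - alpha) * Y[i][c]
              for c in range(m)] for i in range(n)]
    return F

def propagate_constraints(W, Z, alpha=0.8):
    """Propagate pairwise constraints along columns, then along rows."""
    Fv = label_propagation(W, Z, alpha)        # one subproblem per column of Z
    Fv_T = [list(r) for r in zip(*Fv)]
    Fh_T = label_propagation(W, Fv_T, alpha)   # same propagation over the rows
    return [list(r) for r in zip(*Fh_T)]
```

In this sketch a positive entry F[i][j] suggests a propagated must-link and a negative entry a propagated cannot-link; how the scores are then used to adjust the similarity matrix is not shown here.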
On the freezing of variables in random constraint satisfaction problems
The set of solutions of random constraint satisfaction problems (zero energy
groundstates of mean-field diluted spin glasses) undergoes several structural
phase transitions as the number of constraints is increased. This set first
breaks down into a large number of well-separated clusters. At the freezing
transition, which is in general distinct from the clustering one, some
variables (spins) take the same value in all solutions of a given cluster. In
this paper we study the critical behavior around the freezing transition, which
appears in the unfrozen phase as the divergence of the sizes of the
rearrangements induced in response to the modification of a variable. The
formalism is developed on generic constraint satisfaction problems and applied
in particular to the random satisfiability of boolean formulas and to the
coloring of random graphs. The computation is first performed in random tree
ensembles, for which we underline a connection with percolation models and with
the reconstruction problem of information theory. The validity of these results
for the original random ensembles is then discussed in the framework of the
cavity method. Comment: 32 pages, 7 figures
Citing for High Impact
The question of citation behavior has always intrigued scientists from
various disciplines. While general citation patterns have been widely studied
in the literature, we develop the notion of citation projection graphs by
investigating the citations among the publications that a given paper cites. We
investigate how patterns of citations vary between various scientific
disciplines and how such patterns reflect the scientific impact of the paper.
We find that idiosyncratic citation patterns are characteristic of low impact
papers, while narrow, discipline-focused citation patterns are common for
medium impact papers. Our results show that crossing-community, or bridging,
citation patterns are high risk and high reward, since such patterns are
characteristic of both low and high impact papers. Last, we observe that
citation networks have recently been trending toward more bridging and
interdisciplinary forms. Comment: 10 pages, 6 figures, 1 table
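
To make the definition concrete, here is a small sketch of one way to build a citation projection graph as described above: take the publications a given paper cites and keep only the citations among them. The input format (a dictionary mapping each paper to the set of papers it cites) and the function name are assumptions for illustration; the abstract does not specify a data representation.

```python
def citation_projection_graph(cites, paper):
    """Return the citation graph induced on the references of `paper`.

    `cites` maps a paper id to the set of paper ids it cites (assumed format).
    The result maps each reference to the subset of the other references it cites.
    """
    refs = set(cites.get(paper, ()))
    return {r: (set(cites.get(r, ())) & refs) - {r} for r in refs}

def edge_count(graph):
    """Number of directed citation edges in a projection graph."""
    return sum(len(targets) for targets in graph.values())
```

For example, if paper P cites A, B and C, A cites B and an outside paper X, and B cites C, the projection graph of P has the two edges A -> B and B -> C; the citation to X is dropped because P does not cite X.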
Parallel Graph Partitioning for Complex Networks
Processing large complex networks like social networks or web graphs has
recently attracted considerable interest. In order to do this in parallel, we
need to partition them into pieces of about equal size. Unfortunately, previous
parallel graph partitioners originally developed for more regular mesh-like
networks do not work well for these networks. This paper addresses this problem
by parallelizing and adapting the label propagation technique originally
developed for graph clustering. By introducing size constraints, label
propagation becomes applicable for both the coarsening and the refinement phase
of multilevel graph partitioning. We obtain very high quality by applying a
highly parallel evolutionary algorithm to the coarsened graph. The resulting
system is both more scalable and achieves higher quality than state-of-the-art
systems like ParMetis or PT-Scotch. For large complex networks the performance
differences are very large. For example, our algorithm can partition a web graph
with 3.3 billion edges in less than sixteen seconds using 512 cores of a high
performance cluster while producing a high quality partition -- none of the
competing systems can handle this graph on our system. Comment: Review article. Parallelization of our previous approach
arXiv:1402.328
Partitioning Complex Networks via Size-constrained Clustering
The most commonly used method to tackle the graph partitioning problem in
practice is the multilevel approach. During a coarsening phase, a multilevel
graph partitioning algorithm reduces the graph size by iteratively contracting
nodes and edges until the graph is small enough to be partitioned by some other
algorithm. A partition of the input graph is then constructed by successively
transferring the solution to the next finer graph and applying a local search
algorithm to improve the current solution.
In this paper, we describe a novel approach to partition graphs effectively,
especially when the networks have a highly irregular structure. More precisely,
our algorithm provides graph coarsening by iteratively contracting
size-constrained clusterings that are computed using a label propagation
algorithm. The same algorithm that provides the size-constrained clusterings
can also be used during uncoarsening as a fast and simple local search
algorithm.
Depending on the algorithm's configuration, we are able to compute partitions
of very high quality, outperforming all competitors, or partitions that are
comparable in quality to the best competitor, hMetis, while being
nearly an order of magnitude faster on average. The fastest configuration
partitions the largest graph available to us, with 3.3 billion edges, using a
single machine in about ten minutes while cutting fewer than half as many edges
as the fastest competitor, kMetis.
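
The size-constrained label propagation step mentioned above can be sketched roughly as follows. This is an illustrative reading, not the authors' code: every node starts in its own cluster and repeatedly moves to the neighboring cluster it is most strongly connected to, but only if that cluster stays within a size bound. The unit node weights, the fixed visiting order, the tie-breaking rule, and the names `adj`, `max_size` and `rounds` are all assumptions.

```python
from collections import defaultdict

def size_constrained_label_propagation(adj, max_size, rounds=5):
    """Cluster a graph by label propagation under a cluster size bound.

    `adj` maps each node to a dict {neighbor: edge_weight} (assumed format).
    Returns a dict mapping each node to its cluster label.
    """
    label = {v: v for v in adj}          # every node starts as its own cluster
    size = defaultdict(int)
    for v in adj:
        size[v] = 1                      # unit node weights (assumption)
    for _ in range(rounds):
        moved = False
        for v in adj:                    # fixed visiting order keeps the sketch deterministic
            conn = defaultdict(float)    # total edge weight from v into each neighboring cluster
            for u, w in adj[v].items():
                conn[label[u]] += w
            cur = label[v]
            best, best_w = cur, conn.get(cur, 0.0)
            for c, w in conn.items():
                # move only for a strictly stronger connection that respects the bound
                if c != cur and w > best_w and size[c] + 1 <= max_size:
                    best, best_w = c, w
            if best != cur:
                size[cur] -= 1
                size[best] += 1
                label[v] = best
                moved = True
        if not moved:
            break
    return label
```

In a multilevel scheme, each resulting cluster would be contracted into a single node to form the next coarser graph; that contraction step is not shown here.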
Planar Ultrametric Rounding for Image Segmentation
We study the problem of hierarchical clustering on planar graphs. We
formulate this in terms of an LP relaxation of ultrametric rounding. To solve
this LP efficiently we introduce a dual cutting plane scheme that uses minimum
cost perfect matching as a subroutine in order to efficiently explore the space
of planar partitions. We apply our algorithm to the problem of hierarchical
image segmentation.