Segmentation Given Partial Grouping Constraints
We consider data clustering problems where partial grouping is known a priori. We formulate such biased grouping problems as a constrained optimization problem, where structural properties of the data define the goodness of a grouping and partial grouping cues define the feasibility of a grouping. We enforce grouping smoothness and fairness on labeled data points so that sparse partial grouping information can be effectively propagated to the unlabeled data. Considering the normalized cuts criterion in particular, our formulation leads to a constrained eigenvalue problem. By generalizing the Rayleigh-Ritz theorem to projected matrices, we find the global optimum in the relaxed continuous domain by eigendecomposition, from which a near-global optimum to the discrete labeling problem can be obtained effectively. We apply our method to real image segmentation problems, where partial grouping priors can often be derived based on a crude spatial attentional map that binds places with common salient features or focuses on expected object locations. We demonstrate not only that it is possible to integrate both image structures and priors in a single grouping process, but also that objects can be segregated from the background without specific object knowledge
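The constrained eigenvalue problem mentioned in this abstract can be sketched as follows. This is a simplified single-vector illustration, not the authors' exact formulation. The symbols are assumptions introduced here: $W$ is the affinity matrix, $D$ the diagonal degree matrix, $U$ a matrix whose columns encode the partial grouping constraints, and $x$ the relaxed labeling vector.

```latex
% Relaxed normalized-cut objective with linear grouping constraints
\max_{x}\; \frac{x^{\top} W x}{x^{\top} D x}
\quad \text{subject to} \quad U^{\top} x = 0 .

% Change of variables: z = D^{1/2} x,  P = D^{-1/2} W D^{-1/2},  V = D^{-1/2} U
\max_{z}\; \frac{z^{\top} P z}{z^{\top} z}
\quad \text{subject to} \quad V^{\top} z = 0 .

% Projector onto the feasible subspace \{ z : V^{\top} z = 0 \}
Q = I - V \left( V^{\top} V \right)^{-1} V^{\top} .

% Rayleigh-Ritz on the projected matrix: among the eigenvectors of QPQ
% that satisfy Qz = z, the one with the largest eigenvalue is optimal
Q P Q \, z^{*} = \lambda^{*} z^{*}, \qquad Q z^{*} = z^{*}, \qquad
x^{*} = D^{-1/2} z^{*} .
```

Under these assumptions the constraint is absorbed into the projector $Q$, so a standard symmetric eigensolver applied to $QPQ$ gives the relaxed optimum, which would then be discretized to obtain a labeling.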
Large-scale Binary Quadratic Optimization Using Semidefinite Relaxation and Applications
In computer vision, many problems such as image segmentation, pixel
labelling, and scene parsing can be formulated as binary quadratic programs
(BQPs). For submodular problems, cut-based methods can be employed to
efficiently solve large-scale problems. However, general nonsubmodular problems
are significantly more challenging to solve. Finding a solution when the
problem is large enough to be of practical interest, however, typically
requires relaxation. Two standard relaxation methods are widely used for
solving general BQPs--spectral methods and semidefinite programming (SDP), each
with their own advantages and disadvantages. Spectral relaxation is simple and
easy to implement, but its bound is loose. Semidefinite relaxation has a
tighter bound, but its computational complexity is high, especially for large
scale problems. In this work, we present a new SDP formulation for BQPs, with
two desirable properties. First, it has a similar relaxation bound to
conventional SDP formulations. Second, compared with conventional SDP methods,
the new SDP formulation leads to a significantly more efficient and scalable
dual optimization approach, which has the same degree of complexity as spectral
methods. We then propose two solvers, namely, quasi-Newton and smoothing Newton
methods, for the dual problem. Both of them are significantly more efficient
than standard interior-point methods. In practice, the smoothing Newton solver
is faster than the quasi-Newton solver for dense or medium-sized problems,
while the quasi-Newton solver is preferable for large sparse/structured
problems. Our experiments on a few computer vision applications including
clustering, image segmentation, co-segmentation and registration show the
potential of our SDP formulation for solving large-scale BQPs.
Comment: Fixed some typos. 18 pages. Accepted to IEEE Transactions on Pattern
Analysis and Machine Intelligence
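The trade-off between the two standard relaxations described in this abstract can be made concrete with the textbook forms below. These are the conventional spectral and SDP relaxations, not the new SDP formulation the paper proposes. The matrix $A$ and the $\{-1,1\}$ variable encoding are assumptions chosen for illustration.

```latex
% Binary quadratic program (assumed form)
p^{*} = \min_{x \in \{-1, 1\}^{n}} x^{\top} A x .

% Spectral relaxation: keep only the norm of x
p_{\mathrm{spec}} = \min_{\|x\|_{2}^{2} = n} x^{\top} A x = n \, \lambda_{\min}(A) .

% Conventional SDP relaxation: lift X = x x^{\top} and drop the rank-one constraint
p_{\mathrm{sdp}} = \min_{X} \; \langle A, X \rangle
\quad \text{subject to} \quad \operatorname{diag}(X) = \mathbf{1}, \;\; X \succeq 0 .

% Every SDP-feasible X satisfies trace(X) = n and X >= 0, hence
p_{\mathrm{spec}} \;\le\; p_{\mathrm{sdp}} \;\le\; p^{*} .
```

The spectral bound needs only one extreme eigenvalue, which is why it is cheap but loose. The SDP bound is at least as tight, but it optimizes over an $n \times n$ matrix variable.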
Semi-supervised cross-entropy clustering with information bottleneck constraint
In this paper, we propose a semi-supervised clustering method, CEC-IB, that
models data with a set of Gaussian distributions and that retrieves clusters
based on a partial labeling provided by the user (partition-level side
information). By combining the ideas from cross-entropy clustering (CEC) with
those from the information bottleneck method (IB), our method trades off
three conflicting goals: the accuracy with which the data set is modeled, the
simplicity of the model, and the consistency of the clustering with side
information. Experiments demonstrate that CEC-IB has a performance comparable
to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but
is faster, more robust to noisy labels, automatically determines the optimal
number of clusters, and performs well when not all classes are present in the
side information. Moreover, in contrast to other semi-supervised models, it can
be successfully applied in discovering natural subgroups if the partition-level
side information is derived from the top levels of a hierarchical clustering
Motion segmentation by consensus
We present a method for merging multiple partitions into a single partition, by minimising the ratio of pairwise agreements and contradictions between the equivalence relations corresponding to the partitions. The number of equivalence classes is determined automatically. This method is advantageous when merging segmentations obtained independently. We propose using this consensus approach to merge segmentations of features tracked on video. Each segmentation is obtained by clustering on the basis of mean velocity during a particular time interval
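The pairwise agreements and contradictions this abstract refers to can be counted as in the minimal sketch below. The abstract does not give the exact objective or the search procedure, so the sketch only scores a candidate partition against the input partitions. The function name `pairwise_counts` and the label-list representation are hypothetical choices made for illustration.

```python
from itertools import combinations


def pairwise_counts(candidate, partitions):
    """Count pairwise agreements and contradictions.

    Each partition is a list of class labels, one per item. A pair (i, j)
    is an agreement when the candidate and an input partition both put
    i and j in the same class, or both separate them; otherwise it is a
    contradiction. (Hypothetical helper, not the paper's algorithm.)
    """
    agreements = 0
    contradictions = 0
    for labels in partitions:
        for i, j in combinations(range(len(candidate)), 2):
            together_in_candidate = candidate[i] == candidate[j]
            together_in_input = labels[i] == labels[j]
            if together_in_candidate == together_in_input:
                agreements += 1
            else:
                contradictions += 1
    return agreements, contradictions


# Three independent segmentations of four tracked features (toy data).
partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 1, 1]]
counts_a = pairwise_counts([0, 0, 1, 1], partitions)
counts_b = pairwise_counts([0, 0, 0, 1], partitions)
```

A consensus method would search over candidate partitions and prefer the one whose counts are most favourable. In this toy data the first candidate contradicts the inputs on fewer pairs than the second.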
Team Formation for Scheduling Educational Material in Massive Online Classes
Whether teaching in a classroom or a Massive Online Open Course it is crucial
to present the material in a way that benefits the audience as a whole. We
identify two important tasks to solve towards this objective: (1) group students
so that they can maximally benefit from peer interaction and (2) find an optimal
schedule of the educational material for each group. Thus, in this paper, we
solve the problem of team formation and content scheduling for education. Given
a time frame d, a set of students S with their needs to learn different
activities T, and k as the number of desired groups, we study the problem
of finding k groups of students. The goal is to teach students within time frame
d such that their potential for learning is maximized and find the best
schedule for each group. We show this problem to be NP-hard and develop a
polynomial-time algorithm for it. We show our algorithm to be effective on both
synthetic and real data sets. For our experiments, we use real data on
students' grades in a Computer Science department. As part of our contribution,
we release a semi-synthetic dataset that mimics the properties of the real
data
Semi-supervised model-based clustering with controlled clusters leakage
In this paper, we focus on finding clusters in partially categorized data
sets. We propose a semi-supervised version of the Gaussian mixture model, called
C3L, which retrieves natural subgroups of given categories. In contrast to
other semi-supervised models, C3L is parametrized by a user-defined leakage
level, which controls the maximal inconsistency between the initial categorization
and the resulting clustering. Our method can be implemented as a module in practical
expert systems to detect clusters that combine expert knowledge with the true
distribution of the data. Moreover, it can be used for improving the results of
less flexible clustering techniques, such as projection pursuit clustering. The
paper presents an extensive theoretical analysis of the model and a fast algorithm
for its optimization. Experimental results show that C3L finds a high-quality
clustering model, which can be applied in discovering meaningful groups
in partially classified data
A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects
Recently, Minimum Cost Multicut Formulations have been proposed and proven to
be successful in both motion trajectory segmentation and multi-target tracking
scenarios. Both tasks benefit from decomposing a graphical model into an
optimal number of connected components based on attractive and repulsive
pairwise terms. The two tasks are formulated on different levels of granularity
and, accordingly, leverage mostly local information for motion segmentation and
mostly high-level information for multi-target tracking. In this paper we argue
that point trajectories and their local relationships can contribute to the
high-level task of multi-target tracking and also argue that high-level cues
from object detection and tracking are helpful to solve motion segmentation. We
propose a joint graphical model for point trajectories and object detections
whose Multicuts are solutions to motion segmentation and multi-target
tracking problems at once. Results on the FBMS59 motion segmentation benchmark
as well as on pedestrian tracking sequences from the 2D MOT 2015 benchmark
demonstrate the promise of this joint approach