22,531 research outputs found
Multiclass Total Variation Clustering
Ideas from the image processing literature have recently motivated a new set
of clustering algorithms that rely on the concept of total variation. While
these algorithms perform well for bi-partitioning tasks, their recursive
extensions yield unimpressive results for multiclass clustering tasks. This
paper presents a general framework for multiclass total variation clustering
that does not rely on recursion. The results greatly outperform previous total
variation algorithms and compare well with state-of-the-art NMF approaches
A Method Based on Total Variation for Network Modularity Optimization using the MBO Scheme
The study of network structure is pervasive in sociology, biology, computer
science, and many other disciplines. One of the most important areas of network
science is the algorithmic detection of cohesive groups of nodes called
"communities". One popular approach to find communities is to maximize a
quality function known as {\em modularity} to achieve some sort of optimal
clustering of nodes. In this paper, we interpret the modularity function from a
novel perspective: we reformulate modularity optimization as a minimization
problem of an energy functional that consists of a total variation term and an
balance term. By employing numerical techniques from image processing
and compressive sensing -- such as convex splitting and the
Merriman-Bence-Osher (MBO) scheme -- we develop a variational algorithm for the
minimization problem. We present our computational results using both synthetic
benchmark networks and real data.Comment: 23 page
Network Lasso: Clustering and Optimization in Large Graphs
Convex optimization is an essential tool for modern data analysis, as it
provides a framework to formulate and solve many problems in machine learning
and data mining. However, general convex optimization solvers do not scale
well, and scalable solvers are often specialized to only work on a narrow class
of problems. Therefore, there is a need for simple, scalable algorithms that
can solve many common optimization problems. In this paper, we introduce the
\emph{network lasso}, a generalization of the group lasso to a network setting
that allows for simultaneous clustering and optimization on graphs. We develop
an algorithm based on the Alternating Direction Method of Multipliers (ADMM) to
solve this problem in a distributed and scalable manner, which allows for
guaranteed global convergence even on large graphs. We also examine a
non-convex extension of this approach. We then demonstrate that many types of
problems can be expressed in our framework. We focus on three in particular -
binary classification, predicting housing prices, and event detection in time
series data - comparing the network lasso to baseline approaches and showing
that it is both a fast and accurate method of solving large optimization
problems
Multiclass Data Segmentation using Diffuse Interface Methods on Graphs
We present two graph-based algorithms for multiclass segmentation of
high-dimensional data. The algorithms use a diffuse interface model based on
the Ginzburg-Landau functional, related to total variation compressed sensing
and image processing. A multiclass extension is introduced using the Gibbs
simplex, with the functional's double-well potential modified to handle the
multiclass case. The first algorithm minimizes the functional using a convex
splitting numerical scheme. The second algorithm is a uses a graph adaptation
of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates
between diffusion and thresholding. We demonstrate the performance of both
algorithms experimentally on synthetic data, grayscale and color images, and
several benchmark data sets such as MNIST, COIL and WebKB. We also make use of
fast numerical solvers for finding the eigenvectors and eigenvalues of the
graph Laplacian, and take advantage of the sparsity of the matrix. Experiments
indicate that the results are competitive with or better than the current
state-of-the-art multiclass segmentation algorithms.Comment: 14 page
Combinatorial Continuous Maximal Flows
Maximum flow (and minimum cut) algorithms have had a strong impact on
computer vision. In particular, graph cuts algorithms provide a mechanism for
the discrete optimization of an energy functional which has been used in a
variety of applications such as image segmentation, stereo, image stitching and
texture synthesis. Algorithms based on the classical formulation of max-flow
defined on a graph are known to exhibit metrication artefacts in the solution.
Therefore, a recent trend has been to instead employ a spatially continuous
maximum flow (or the dual min-cut problem) in these same applications to
produce solutions with no metrication errors. However, known fast continuous
max-flow algorithms have no stopping criteria or have not been proved to
converge. In this work, we revisit the continuous max-flow problem and show
that the analogous discrete formulation is different from the classical
max-flow problem. We then apply an appropriate combinatorial optimization
technique to this combinatorial continuous max-flow CCMF problem to find a
null-divergence solution that exhibits no metrication artefacts and may be
solved exactly by a fast, efficient algorithm with provable convergence.
Finally, by exhibiting the dual problem of our CCMF formulation, we clarify the
fact, already proved by Nozawa in the continuous setting, that the max-flow and
the total variation problems are not always equivalent.Comment: 26 page
Tight Continuous Relaxation of the Balanced -Cut Problem
Spectral Clustering as a relaxation of the normalized/ratio cut has become
one of the standard graph-based clustering methods. Existing methods for the
computation of multiple clusters, corresponding to a balanced -cut of the
graph, are either based on greedy techniques or heuristics which have weak
connection to the original motivation of minimizing the normalized cut. In this
paper we propose a new tight continuous relaxation for any balanced -cut
problem and show that a related recently proposed relaxation is in most cases
loose leading to poor performance in practice. For the optimization of our
tight continuous relaxation we propose a new algorithm for the difficult
sum-of-ratios minimization problem which achieves monotonic descent. Extensive
comparisons show that our method outperforms all existing approaches for ratio
cut and other balanced -cut criteria.Comment: Long version of paper accepted at NIPS 201
- …