38 research outputs found

Multiclass Semi-Supervised Learning on Graphs using Ginzburg-Landau Functional Minimization

Author: A Bertozzi
A Subramanya
AD Szlam
AL Bertozzi
D Zhou
EL Allwein
G Gilboa
GE Hinton
JA Dobrosotskaya
L Zelnik-Manor
RR Coifman
RV Kohn
TG Dietterich
Y LeCun
Y Li
YM Jung
Publication date: 06/06/2013

We present a graph-based variational algorithm for classification of high-dimensional data, generalizing the binary diffuse interface model to the case of multiple classes. Motivated by total variation techniques, the method involves minimizing an energy functional made up of three terms. The first two terms promote a stepwise continuous classification function with sharp transitions between classes, while preserving symmetry among the class labels. The third term is a data fidelity term, allowing us to incorporate prior information into the model in a semi-supervised framework. The performance of the algorithm on synthetic data, as well as on the COIL and MNIST benchmark datasets, is competitive with state-of-the-art graph-based multiclass segmentation methods. Comment: 16 pages, to appear in Springer's Lecture Notes in Computer Science volume "Pattern Recognition Applications and Methods 2013", part of series on Advances in Intelligent and Soft Computing
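
To illustrate the three-term structure the abstract describes, the sketch below evaluates the binary graph Ginzburg-Landau energy that this line of work generalizes, not the multiclass functional of the paper itself. The function name `gl_energy`, the use of the unnormalized graph Laplacian, and the parameters `eps` and `mu` are assumptions made for this example.

```python
import numpy as np

def gl_energy(W, u, u_hat, known, eps=1.0, mu=1.0):
    """Binary graph Ginzburg-Landau energy (illustrative sketch, assumed form).

    W      : (n, n) symmetric nonnegative weight matrix
    u      : length-n node values, ideally close to -1 or +1
    u_hat  : length-n known labels (only read where `known` is True)
    known  : length-n boolean mask of labeled nodes
    """
    W = np.asarray(W, dtype=float)
    u = np.asarray(u, dtype=float)
    u_hat = np.asarray(u_hat, dtype=float)
    known = np.asarray(known, dtype=bool)

    L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian
    smooth = 0.5 * eps * (u @ L @ u)        # term 1: penalizes differences across edges
    well = np.sum((u**2 - 1.0) ** 2) / (4.0 * eps)   # term 2: double-well potential
    fidelity = 0.5 * mu * np.sum((u[known] - u_hat[known]) ** 2)  # term 3: data fidelity
    return smooth + well + fidelity
```

On a two-node graph with one unit-weight edge, assigning opposite labels costs only the smoothness term, while a labeling that agrees with itself and with the known labels has zero energy.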

arXiv.org e-Print Archive

Crossref

Multiclass Data Segmentation using Diffuse Interface Methods on Graphs

Author: Bertozzi Andrea L.
Flenner Arjuna
Garcia-Cardona Cristina
Merkurjev Ekaterina
Percus Allon
Publication date: 17/01/2014

We present two graph-based algorithms for multiclass segmentation of high-dimensional data. The algorithms use a diffuse interface model based on the Ginzburg-Landau functional, related to total variation compressed sensing and image processing. A multiclass extension is introduced using the Gibbs simplex, with the functional's double-well potential modified to handle the multiclass case. The first algorithm minimizes the functional using a convex splitting numerical scheme. The second algorithm uses a graph adaptation of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates between diffusion and thresholding. We demonstrate the performance of both algorithms experimentally on synthetic data, grayscale and color images, and several benchmark data sets such as MNIST, COIL and WebKB. We also make use of fast numerical solvers for finding the eigenvectors and eigenvalues of the graph Laplacian, and take advantage of the sparsity of the matrix. Experiments indicate that the results are competitive with or better than the current state-of-the-art multiclass segmentation algorithms. Comment: 14 pages
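
The alternation between diffusion and thresholding mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a dense direct solve rather than the eigenvector-based solvers the abstract refers to, treats the fidelity term implicitly, and the function name `graph_mbo` and parameters `dt`, `mu`, `n_iter` are hypothetical.

```python
import numpy as np

def graph_mbo(W, labels, n_classes, dt=0.5, mu=10.0, n_iter=10):
    """Multiclass graph MBO sketch (assumed variant, dense and unoptimized).

    W      : (n, n) symmetric nonnegative weight matrix
    labels : length-n int array, class index for labeled nodes, -1 if unlabeled
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                 # unnormalized graph Laplacian

    known = labels >= 0
    U_hat = np.zeros((n, n_classes))
    U_hat[known, labels[known]] = 1.0              # one-hot rows for labeled nodes

    # Start unlabeled nodes at the center of the Gibbs simplex.
    U = np.full((n, n_classes), 1.0 / n_classes)
    U[known] = U_hat[known]

    fid = dt * mu * known.astype(float)            # fidelity weight per node
    A = np.eye(n) + dt * L + np.diag(fid)          # implicit diffusion + fidelity operator

    for _ in range(n_iter):
        # Diffusion step: one implicit (backward Euler) step of the heat flow
        # with a pull toward the known labels.
        U = np.linalg.solve(A, U + fid[:, None] * U_hat)
        # Thresholding step: snap each row to the nearest simplex vertex.
        winners = U.argmax(axis=1)
        U = np.zeros((n, n_classes))
        U[np.arange(n), winners] = 1.0
    return U.argmax(axis=1)
```

With one labeled node in each of two tightly connected clusters joined by a weak edge, the sketch propagates each label across its own cluster.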

arXiv.org e-Print Archive

CiteSeerX

Crossref

eScholarship - University of California

Simplified Energy Landscape for Modularity Using Total Variation

Author: Bae Egil
Bertozzi Andrea L.
Boyd Zachary
Tai Xue-Cheng
Publication date: 01/01/2018

Networks capture pairwise interactions between entities and are frequently used in applications such as social networks, food networks, and protein interaction networks, to name a few. Communities, cohesive groups of nodes, often form in these applications, and identifying them gives insight into the overall organization of the network. One common quality function used to identify community structure is modularity. In Hu et al. [SIAM J. App. Math., 73(6), 2013], it was shown that modularity optimization is equivalent to minimizing a particular nonconvex total variation (TV) based functional over a discrete domain. They solve this problem, assuming the number of communities is known, using a Merriman-Bence-Osher (MBO) scheme. We show that modularity optimization is equivalent to minimizing a convex TV-based functional over a discrete domain, again assuming the number of communities is known. Furthermore, we show that modularity has no convex relaxation satisfying certain natural conditions. We therefore find a manageable nonconvex approximation using a Ginzburg-Landau functional, which provably converges to the correct energy in the limit of a certain parameter. We then derive an MBO algorithm that has fewer hand-tuned parameters than in Hu et al. and is 7 times faster at solving the associated diffusion equation, because the underlying discretization is unconditionally stable. Our numerical tests include a hyperspectral video whose associated graph has 2.9x10^7 edges, which is roughly 37 times larger than was handled in the paper of Hu et al. Comment: 25 pages, 3 figures, 3 tables, submitted to SIAM J. App. Math.
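
For readers unfamiliar with the quality function named in the abstract, the sketch below computes the standard Newman-Girvan modularity of a given partition. It is the textbook definition, not the TV-based reformulation or the MBO algorithm of the paper, and the function name `modularity` is chosen for this example.

```python
import numpy as np

def modularity(A, communities):
    """Newman-Girvan modularity Q of a partition of an undirected graph.

    A           : (n, n) symmetric adjacency (or weight) matrix
    communities : length-n array of community ids, one per node
    """
    A = np.asarray(A, dtype=float)
    c = np.asarray(communities)
    k = A.sum(axis=1)                      # (weighted) node degrees
    two_m = k.sum()                        # twice the total edge weight
    same = c[:, None] == c[None, :]        # True where i and j share a community
    B = A - np.outer(k, k) / two_m         # observed minus expected edge weight
    return (B * same).sum() / two_m
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives Q = 5/14, while placing every node in one community gives Q = 0.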

arXiv.org e-Print Archive

University of Bergen

Crossref

eScholarship - University of California

NORA - Norwegian Open Research Archives

Uncertainty quantification in graph-based classification of high dimensional data

Author: Belkin M.
Boykov Y.
Boykov Y. Y.
Bühler T.
Dahlhaus E.
Hu H.
Kapoor A.
Madry A.
Merkurjev E.
Neal R.
Subramanya A.
Talukdar P.
Van Gennip Y.
Williams C. K.
Zelnik-Manor L.
Zhou D.
Zhu X.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2018

Classification of high dimensional data finds wide-ranging applications. In many of these applications equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of, a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based around the graph formulation of semi-supervised learning. We provide a unified framework which brings together a variety of methods which have been introduced in different communities within the mathematical sciences. We study probit classification in the graph-based setting, generalize the level-set method for Bayesian inverse problems to the classification setting, and generalize the Ginzburg-Landau optimization-based classifier to a Bayesian setting; we also show that the probit and level set approaches are natural relaxations of the harmonic function approach introduced in [Zhu et al 2003]. We introduce efficient numerical methods, suited to large data-sets, for both MCMC-based sampling as well as gradient-based MAP estimation. Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semi-supervised learning algorithms. Comment: 33 pages, 14 figures
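
The harmonic function approach that the abstract uses as a reference point can be sketched as below: labeled nodes are clamped to their values and each unlabeled node takes the weighted average of its neighbors, which amounts to one linear solve with the graph Laplacian. This is an illustration of that baseline only, not of the Bayesian models in the paper; the function name `harmonic_labels` and the dense solve are assumptions for this example.

```python
import numpy as np

def harmonic_labels(W, labeled_idx, y_labeled):
    """Harmonic-function label propagation on a graph (illustrative sketch).

    W           : (n, n) symmetric nonnegative weight matrix
    labeled_idx : indices of labeled nodes
    y_labeled   : label values (e.g. -1 / +1) at those nodes
    Returns a length-n vector; its sign gives the binary classification.
    """
    W = np.asarray(W, dtype=float)
    labeled_idx = np.asarray(labeled_idx)
    y = np.asarray(y_labeled, dtype=float)
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                     # unnormalized graph Laplacian
    unl = np.setdiff1d(np.arange(n), labeled_idx)      # unlabeled node indices

    f = np.empty(n)
    f[labeled_idx] = y                                 # clamp known labels
    # Solve L_uu f_u = -L_ul f_l so that f is harmonic on the unlabeled nodes.
    f[unl] = np.linalg.solve(L[np.ix_(unl, unl)],
                             -L[np.ix_(unl, labeled_idx)] @ y)
    return f
```

On a five-node path graph with the two end nodes labeled -1 and +1, the solution interpolates linearly between the ends, so the middle node sits exactly on the decision boundary at 0.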

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

eScholarship - University of California

Caltech Authors

Diffuse interface methods for multiclass segmentation of high-dimensional data

Author: Allon G. Percus
Anderson
Andrea L. Bertozzi
Anguita
Arjuna Flenner
Barles
Bertozzi
Cristina Garcia-Cardona
Dobrosotskaya
Ekaterina Merkurjev
Esedoğlu
Evans
Garcia-Cardona
Gilboa
Kohn
Lellmann
Merkurjev
Merriman
Mroueh
Nene
Perona
Rubinstein
Subramanya
Yuille
Publication venue: 'Elsevier BV'

Crossref