Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision
Feature selection is essential for effective visual recognition. We propose
an efficient joint classifier learning and feature selection method that
discovers sparse, compact representations of input features from a vast sea of
candidates, with an almost unsupervised formulation. Our method requires only
the following knowledge, which we call the \emph{feature sign}---whether or not
a particular feature has on average stronger values over positive samples than
over negatives. We show how this can be estimated using as few as a single
labeled training sample per class. Then, using these feature signs, we extend
an initial supervised learning problem into an (almost) unsupervised clustering
formulation that can incorporate new data without requiring ground truth
labels. Our method works both as a feature selection mechanism and as a fully
competitive classifier. It has two important properties: low computational cost
and excellent accuracy, especially in the difficult case of very limited
training data. We experiment on large-scale video recognition and show superior
speed and performance compared to established feature selection approaches such
as AdaBoost, Lasso, and greedy forward-backward selection, and to powerful
classifiers such as SVM.
Comment: arXiv admin note: text overlap with arXiv:1411.771
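The feature-sign idea is simple enough to sketch: a feature's sign is the sign of the difference between its mean over positive samples and its mean over negative samples, and features can then be flipped so all signs agree. A minimal NumPy illustration (function names are ours, not the paper's; a single labeled sample per class suffices, as the abstract notes):

```python
import numpy as np

def estimate_feature_signs(X_pos, X_neg):
    """Estimate each feature's sign: +1 if its mean over positive samples
    exceeds its mean over negatives, else -1.
    X_pos, X_neg: arrays of shape (n_samples, n_features); n_samples may be 1."""
    diff = X_pos.mean(axis=0) - X_neg.mean(axis=0)
    return np.where(diff >= 0, 1, -1)

def flip_features(X, signs):
    """Flip negatively-signed features so that, afterwards, every feature is
    on average stronger over positives than over negatives."""
    return X * signs
```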
Multiclass Data Segmentation using Diffuse Interface Methods on Graphs
We present two graph-based algorithms for multiclass segmentation of
high-dimensional data. The algorithms use a diffuse interface model based on
the Ginzburg-Landau functional, related to total variation compressed sensing
and image processing. A multiclass extension is introduced using the Gibbs
simplex, with the functional's double-well potential modified to handle the
multiclass case. The first algorithm minimizes the functional using a convex
splitting numerical scheme. The second algorithm uses a graph adaptation of
the classical Merriman-Bence-Osher (MBO) scheme, which alternates
between diffusion and thresholding. We demonstrate the performance of both
algorithms experimentally on synthetic data, grayscale and color images, and
several benchmark data sets such as MNIST, COIL and WebKB. We also make use of
fast numerical solvers for finding the eigenvectors and eigenvalues of the
graph Laplacian, and take advantage of the sparsity of the matrix. Experiments
indicate that the results are competitive with or better than the current
state-of-the-art multiclass segmentation algorithms.
Comment: 14 pages
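The diffusion-and-thresholding alternation of the graph MBO scheme can be sketched on a toy graph. This is a hedged, simplified version (unnormalized Laplacian, implicit-Euler diffusion, illustrative parameters), not the paper's solver with fast eigenvector computations and sparse matrices:

```python
import numpy as np

def graph_mbo(W, labels_init, n_classes, dt=0.1, n_iter=30):
    """Toy graph MBO: alternate (i) a short diffusion step with the graph
    Laplacian and (ii) thresholding each row of the label matrix to the
    nearest vertex of the Gibbs simplex (one-hot assignment).
    W: symmetric affinity matrix (n, n);
    labels_init: (n, n_classes) soft initial assignments."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian
    U = labels_init.astype(float)
    A = np.eye(n) + dt * L                  # implicit Euler: (I + dt*L) U_new = U
    for _ in range(n_iter):
        U = np.linalg.solve(A, U)           # diffusion
        idx = U.argmax(axis=1)              # threshold to nearest simplex vertex
        U = np.eye(n_classes)[idx]
    return U.argmax(axis=1)
```

On two weakly-connected clusters with one seeded node each, the unseeded nodes adopt the label of their strongly-connected seed.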
A Two-stage Classification Method for High-dimensional Data and Point Clouds
High-dimensional data classification is a fundamental task in machine
learning and imaging science. In this paper, we propose a two-stage multiphase
semi-supervised classification method for classifying high-dimensional data and
unstructured point clouds. To begin with, a fuzzy classification method such as
the standard support vector machine is used to generate a warm initialization.
We then apply a two-stage approach named SaT (smoothing and thresholding) to
improve the classification. In the first stage, an unconstrained convex
variational model is solved to purify and smooth the initialization; in the
second stage, the smoothed partition obtained at stage one is projected to a
binary partition. These two stages can be repeated,
with the latest result as a new initialization, to keep improving the
classification quality. We show that the convex model of the smoothing stage
has a unique solution and can be solved by a specifically designed primal-dual
algorithm whose convergence is guaranteed. We test our method and compare it
with the state-of-the-art methods on several benchmark data sets. The
experimental results demonstrate clearly that our method is superior in both
the classification accuracy and computation speed for high-dimensional data and
point clouds.
Comment: 21 pages, 4 figures
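The smoothing-and-thresholding loop can be illustrated with a simple stand-in for the smoothing stage: a graph-Laplacian-regularized least-squares problem, which, like the paper's model, is convex with a unique solution, followed by thresholding to a binary partition. The specific energy and parameters below are ours, not the paper's:

```python
import numpy as np

def sat_refine(W, u0, lam=1.0, n_rounds=3):
    """Sketch of a SaT (smoothing-and-thresholding) loop.
    Stage 1 smooths the current partition by minimizing the convex energy
    lam/2 * ||u - u0||^2 + 1/2 * u^T L u, whose unique minimizer solves the
    linear system (lam*I + L) u = lam*u0; stage 2 projects to a binary
    partition by thresholding at 0.5. The two stages repeat, each round
    using the latest binary result as the new initialization."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W              # graph Laplacian
    u = u0.astype(float)
    for _ in range(n_rounds):
        u = np.linalg.solve(lam * np.eye(n) + L, lam * u)  # smoothing
        u = (u >= 0.5).astype(float)                        # thresholding
    return u
```

On a noisy fuzzy initialization over two strongly-linked pairs of nodes, one round already recovers the clean binary partition, which further rounds leave unchanged.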
Semi-supervised segmentation of ultrasound images based on patch representation and continuous min cut.
Ultrasound segmentation is a challenging problem due to the inherent speckle and artifacts such as shadows, attenuation, and signal dropout. Existing methods need to include strong priors, such as shape priors or analytical intensity models, to succeed in the segmentation. However, such priors tend to limit these methods to a specific target or imaging setting, and they are not always applicable to pathological cases. This work introduces a semi-supervised segmentation framework for ultrasound imaging that alleviates the limitation of fully automatic segmentation: it is applicable to any kind of target and imaging setting. Our methodology uses a graph of image patches to represent the ultrasound image and a user-assisted initialization with labels, which act as soft priors. The segmentation problem is formulated as a continuous minimum cut problem and solved with an efficient optimization algorithm. We validate our segmentation framework on clinical ultrasound imaging (prostate, fetus, and tumors of the liver and eye). We obtain high similarity agreement with the ground truth provided by medical expert delineations in all applications (94% Dice values on average), and the proposed algorithm performs favorably against methods in the literature.
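The discrete counterpart of the seeded minimum-cut energy is easy to write down: pairwise costs for cutting edges of the patch graph, plus heavy penalties for violating the user seeds (the soft priors). The sketch below only evaluates that energy on a candidate labeling (the paper solves a continuous relaxation with an efficient optimizer); names and the seed weight are illustrative:

```python
import numpy as np

def cut_energy(W, labels, fg_seeds, bg_seeds, seed_weight=1e6):
    """Energy of a binary labeling on a patch graph: each edge between
    differently-labeled patches contributes its weight (counted once),
    and each violated user seed contributes a large penalty.
    W: symmetric patch-affinity matrix; labels: 0/1 per patch;
    fg_seeds/bg_seeds: indices of user-labeled foreground/background patches."""
    labels = np.asarray(labels)
    pairwise = 0.5 * np.sum(W * (labels[:, None] != labels[None, :]))
    unary = seed_weight * (sum(labels[i] != 1 for i in fg_seeds)
                           + sum(labels[i] != 0 for i in bg_seeds))
    return pairwise + unary
```

On a chain of patches with a weak middle edge, the labeling that cuts the weak edge has lower energy than one cutting a strong edge, which is exactly what the minimum-cut solver seeks.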
Sparsity in Machine Learning: An Information Selecting Perspective
Today we are living in a world awash with data. Large volumes of data are acquired, analyzed and applied to tasks through machine learning algorithms in nearly every area of science, business, and industry. For example, medical scientists analyze the gene expression data from a single specimen to learn the underlying causes of disease (e.g. cancer) and choose the best treatment; retailers can know more about customers' shopping habits from retail data to adjust their business strategies to better appeal to customers; suppliers can enhance supply chain success through supply chain systems built on knowledge sharing. However, it is also reasonable to doubt whether all the genes make contributions to a disease; whether all the data obtained from existing customers can be applied to a new customer; whether all shared knowledge in the supply network is useful to a specific supply scenario. Therefore, it is crucial to sort through the massive information provided by data and keep what we really need. This process is referred to as information selection, which keeps the information that helps improve the performance of corresponding machine learning tasks and discards information that is useless or even harmful to task performance. Sparse learning is a powerful tool to achieve information selection. In this thesis, we apply sparse learning to two major areas in machine learning -- feature selection and transfer learning.
Feature selection is a dimensionality reduction technique that selects a subset of representative features. Recently, feature selection combined with sparse learning has attracted significant attention due to its outstanding performance compared with traditional feature selection methods that ignore correlation between features. However, these methods are restricted by design to linear data transformations, a potential drawback given that the underlying correlation structures of data are often non-linear. To leverage more sophisticated embeddings than the linear model assumed by sparse learning, we propose an autoencoder-based unsupervised feature selection approach that uses a single-layer autoencoder for a joint framework of feature selection and manifold learning. Additionally, we incorporate spectral graph analysis of the projected data into the learning process to achieve local data geometry preservation from the original data space to the low-dimensional feature space.
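As a rough illustration of the idea, a linear single-layer autoencoder with an l2,1 (row-sparsity) penalty on the encoder weights can be fit by proximal gradient descent; encoder rows with small norm mark features that can be discarded. This is a simplified sketch with illustrative hyperparameters, omitting the spectral-graph term and the non-linearity of the thesis's actual model:

```python
import numpy as np

def l21_feature_selection(X, k=2, alpha=0.1, lr=0.005, n_iter=1000, seed=0):
    """Fit a linear autoencoder X ~ X @ W1 @ W2 by proximal gradient descent,
    with an l2,1 penalty alpha * sum_i ||W1[i, :]|| encouraging whole rows of
    the encoder W1 (i.e. whole input features) to vanish.
    Returns per-feature scores: the row norms of W1 (high = retained)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = 0.1 * rng.standard_normal((d, k))
    W2 = 0.1 * rng.standard_normal((k, d))
    for _ in range(n_iter):
        H = X @ W1
        R = H @ W2 - X                      # reconstruction residual
        W1 -= lr * (X.T @ (R @ W2.T)) / n   # gradient step on W1
        W2 -= lr * (H.T @ R) / n            # gradient step on W2
        # proximal step for the l2,1 penalty: shrink each row of W1 toward 0
        norms = np.linalg.norm(W1, axis=1, keepdims=True)
        W1 *= np.maximum(0.0, 1.0 - lr * alpha / np.maximum(norms, 1e-12))
    return np.linalg.norm(W1, axis=1)
```

On data where two features carry the signal and a third is near-constant noise, the noise feature's score is driven toward zero by the row shrinkage.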
Transfer learning describes a set of methods that aim at transferring knowledge from related domains to alleviate the problems caused by limited or no labeled training data in machine learning tasks. Many transfer learning techniques have been proposed to deal with different application scenarios. However, due to the differences in data distribution, feature space, label space, etc., between the source domain and target domain, it is necessary to select and transfer only relevant information from the source domain to improve the performance of the target learner. Otherwise, the target learner can be negatively impacted by weakly related knowledge from the source domain, which is referred to as negative transfer. In this thesis, we focus on two transfer learning scenarios for which limited labeled training data are available in the target domain. In the first scenario, no label information is available in the source data. In the second scenario, large amounts of labeled source data are available, but there is no overlap between the source and target label spaces. The transfer learning technique corresponding to the former case is called \emph{self-taught learning}, while that for the latter case is called \emph{few-shot learning}. We apply self-taught learning to visual, textual, and audio data. We also apply few-shot learning to wearable-sensor-based human activity data. For both cases, we propose a metric for the relevance between a target sample/class and a source sample/class, and then extract information from the related samples/classes for knowledge transfer, so that negative transfer caused by weakly related source information can be alleviated. Experimental results show that transfer learning can provide better performance with information selection.
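The relevance-based selection step can be sketched with class prototypes and cosine similarity as a hedged stand-in for the thesis's actual metric: each class is summarized by its mean feature vector, and only the most similar source classes are used for transfer, limiting negative transfer from weakly related sources:

```python
import numpy as np

def select_relevant_sources(target_proto, source_protos, top_k=2):
    """Rank source classes by cosine similarity between their prototypes
    (mean feature vectors) and the target class prototype; return the
    indices of the top_k most relevant source classes plus all similarities.
    The prototype/cosine choice is illustrative, not the thesis's metric."""
    t = target_proto / np.linalg.norm(target_proto)
    S = source_protos / np.linalg.norm(source_protos, axis=1, keepdims=True)
    sims = S @ t                                  # cosine similarities
    return np.argsort(sims)[::-1][:top_k], sims   # most relevant first
```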