Second order cone programming approaches for handling missing and uncertain data
We propose a novel second order cone programming formulation for designing robust classifiers
which can handle uncertainty in observations. Similar formulations are also derived for designing
regression functions which are robust to uncertainties in the regression setting. The proposed formulations
are independent of the underlying distribution, requiring only the existence of second order
moments. These formulations are then specialized to the case of missing values in observations
for both classification and regression problems. Experiments show that the proposed formulations
outperform imputation-based approaches.
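The key reduction in this line of work is that a chance constraint on correct classification, given only the first and second moments of an uncertain observation, becomes a second order cone constraint via a multivariate Chebyshev bound. The sketch below (my own illustration, not code from the paper; the function name and the feasibility-check framing are assumptions) shows that reduction for a single point:

```python
import numpy as np

def robust_margin_feasible(w, b, y, mu, sigma, eta=0.9):
    """Check the second order cone constraint requiring classifier (w, b)
    to classify an uncertain point correctly with probability >= eta.

    The point is known only through its mean `mu` and covariance `sigma`
    (no distributional assumption beyond second order moments). By a
    multivariate Chebyshev bound, the chance constraint reduces to the
    SOC constraint  y * (w @ mu + b) >= kappa * sqrt(w @ sigma @ w),
    with kappa = sqrt(eta / (1 - eta)).
    """
    kappa = np.sqrt(eta / (1.0 - eta))
    return y * (w @ mu + b) >= kappa * np.sqrt(w @ sigma @ w)

w = np.array([1.0, 0.0])
b = 0.0
# Mean far from the hyperplane, small covariance: robustly classified.
print(robust_margin_feasible(w, b, +1, np.array([5.0, 0.0]), 0.01 * np.eye(2)))  # prints True
# Mean near the boundary, large covariance: constraint violated.
print(robust_margin_feasible(w, b, +1, np.array([0.5, 0.0]), 4.0 * np.eye(2)))   # prints False
```

In the full formulation, one such constraint per training point is imposed while minimizing the norm of `w`, yielding an SOCP solvable by standard conic solvers.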
Learning from Distributions via Support Measure Machines
This paper presents a kernel-based discriminative learning framework on
probability measures. Rather than relying on large collections of vectorial
training examples, our framework learns using a collection of probability
distributions that have been constructed to meaningfully represent training
data. By representing these probability distributions as mean embeddings in the
reproducing kernel Hilbert space (RKHS), we are able to apply many standard
kernel-based learning techniques in straightforward fashion. To accomplish
this, we construct a generalization of the support vector machine (SVM) called
a support measure machine (SMM). Our analyses of SMMs provide several insights
into their relationship to traditional SVMs. Based on such insights, we propose
a flexible SVM (Flex-SVM) that places different kernel functions on each
training example. Experimental results on both synthetic and real-world data
demonstrate the effectiveness of our proposed framework. Comment: Advances in Neural Information Processing Systems 2
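The inner product between two mean embeddings in the RKHS has a simple empirical estimate: average the base kernel over all cross-pairs of samples from the two distributions. A minimal sketch (my own illustration with assumed function names, not the paper's code):

```python
import numpy as np

def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel matrix between row-vector sample sets x and y."""
    return np.exp(-gamma * np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1))

def embedding_kernel(sample_p, sample_q, gamma=1.0):
    """Empirical estimate of <mu_P, mu_Q> in the RKHS, i.e. the kernel
    between the mean embeddings of distributions P and Q:
    K(P, Q) = E_{x~P, y~Q} k(x, y) ~= (1/nm) sum_i sum_j k(x_i, y_j)."""
    return gaussian_kernel(sample_p, sample_q, gamma).mean()
```

Plugging this kernel between distributions into any standard kernel machine (e.g. an SVM on a precomputed Gram matrix) gives an SMM-style learner over probability measures rather than individual vectors.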
Robustness and Generalization
We derive generalization bounds for learning algorithms based on their
robustness: the property that if a testing sample is "similar" to a training
sample, then the testing error is close to the training error. This provides a
novel approach, different from the complexity or stability arguments, to study
generalization of learning algorithms. We further show that a weak notion of
robustness is both sufficient and necessary for generalizability, which implies
that robustness is a fundamental property for learning algorithms to work.
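The robustness notion here is combinatorial: partition the input space into cells and require that a test point landing in the same cell as a training point incur a similar loss. As a toy illustration (my own sketch with assumed names, using 1-D bins as the partition), one can estimate the robustness constant epsilon empirically:

```python
import numpy as np

def empirical_robustness(train_x, train_loss, test_x, test_loss, bins):
    """Estimate the robustness constant epsilon for a fixed partition:
    the largest loss gap |loss(test) - loss(train)| over train/test pairs
    whose inputs fall in the same cell (here, the same 1-D bin)."""
    train_cell = np.digitize(train_x, bins)
    test_cell = np.digitize(test_x, bins)
    eps = 0.0
    for tc, tl in zip(test_cell, test_loss):
        for rc, rl in zip(train_cell, train_loss):
            if tc == rc:
                eps = max(eps, abs(tl - rl))
    return eps
```

A small epsilon over a partition with few cells is what the paper's generalization bounds reward; refining the partition trades a smaller epsilon against a larger covering-number term.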
Early identification of mild cognitive impairment using incomplete random forest-robust support vector machine and FDG-PET imaging
Alzheimer’s disease (AD) is the most common type of dementia and will be an increasing health problem in society as the population ages. Mild cognitive impairment (MCI) is considered to be a prodromal stage of AD. The ability to identify subjects with MCI will be increasingly important as disease-modifying therapies for AD are developed. We propose a semi-supervised learning method based on robust optimization for the identification of MCI from [18F]fluorodeoxyglucose (FDG) PET scans. We extracted three groups of spatial features from the cortical and subcortical regions of each FDG-PET image volume. We measured the statistical uncertainty related to these spatial features via transformation using an incomplete random forest and formulated the MCI identification problem under a robust optimization framework. We compared our approach to other state-of-the-art methods in different learning schemes. Our method outperformed the other techniques in the ability to separate MCI from normal controls.
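The pipeline hinges on turning missing or noisy features into explicit per-feature uncertainty estimates that a robust classifier can consume. The incomplete random forest itself is beyond a short sketch; as a simplified stand-in (my own toy illustration, not the paper's method; all names are assumptions), one can quantify each missing feature's uncertainty by repeated draws from that feature's observed values:

```python
import numpy as np

def feature_uncertainty(X, n_draws=50, rng=None):
    """Toy stand-in for the uncertainty-quantification step: for each
    missing entry (NaN) in feature matrix X, repeatedly fill it with
    values drawn from that column's observed entries and record the
    mean and standard deviation of the draws. The (mean, std) pairs can
    then parameterize a robust-optimization classifier that treats
    high-variance features as less trustworthy."""
    rng = np.random.default_rng(0) if rng is None else rng
    mean = np.where(np.isnan(X), 0.0, X)
    std = np.zeros_like(X)
    for j in range(X.shape[1]):
        observed = X[~np.isnan(X[:, j]), j]
        for i in np.where(np.isnan(X[:, j]))[0]:
            draws = rng.choice(observed, size=n_draws)
            mean[i, j] = draws.mean()
            std[i, j] = draws.std()
    return mean, std
```

Observed entries keep zero uncertainty, so the downstream robust constraint reduces to an ordinary margin constraint for complete samples.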