Manifold regularized kernel logistic regression for web image annotation
With the rapid advance of Internet technology and smart devices, users often
need to manage large amounts of multimedia information using smart devices,
such as personal image and video accessing and browsing. These requirements
heavily rely on the success of image (video) annotation, and thus large scale
image annotation through innovative machine learning methods has attracted
intensive attention in recent years. One representative method is the support vector
machine (SVM). Although it works well in binary classification, SVM has a
non-smooth loss function and cannot naturally cover the multi-class case. In this
paper, we propose manifold regularized kernel logistic regression (KLR) for web
image annotation. Compared to SVM, KLR has the following advantages: (1) the
KLR has a smooth loss function; (2) the KLR produces an explicit estimate of
the probability instead of class label; and (3) the KLR can naturally be
generalized to the multi-class case. We conduct careful experiments on the MIR
FLICKR dataset and demonstrate the effectiveness of manifold regularized kernel
logistic regression for image annotation.
Comment: submitted to Neurocomputin
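The abstract above gives no formulas, so the following is only a minimal sketch of what a manifold (Laplacian) regularized kernel logistic regression could look like in the binary case. It is not the authors' implementation: the RBF kernel, the unnormalized Laplacian built from the same kernel, the plain gradient-descent solver, and all names and hyperparameters (`fit_lap_klr`, `lam_a`, `lam_i`, `gamma`) are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared Euclidean distances, clipped at 0 for numerical safety.
    d2 = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def graph_laplacian(W):
    # Unnormalized graph Laplacian L = D - W with self-loops removed.
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(1)) - W

def fit_lap_klr(X, y, labeled, lam_a=1e-2, lam_i=1e-2, gamma=1.0, steps=500):
    """Binary Laplacian-regularized KLR (illustrative sketch).

    X holds labeled and unlabeled points; y[labeled] must be in {-1, +1}.
    Objective over f = K @ alpha:
      mean log(1 + exp(-y_i f_i)) + lam_a * alpha^T K alpha
      + lam_i / n^2 * f^T L f
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    L = graph_laplacian(K)
    yl = y[labeled].astype(float)
    # Step size 1/Lip, where Lip upper-bounds the gradient's Lipschitz constant.
    s = np.linalg.norm(K, 2)
    lip = (s**2 / (4 * len(labeled)) + 2 * lam_a * s
           + 2 * lam_i / n**2 * s**2 * np.linalg.norm(L, 2))
    alpha = np.zeros(n)
    for _ in range(steps):
        f = K @ alpha
        # Gradient of the smooth logistic loss on the labeled points only.
        w = -yl / (1.0 + np.exp(yl * f[labeled]))
        g_loss = K[labeled].T @ w / len(labeled)
        g_rkhs = 2 * lam_a * f                         # from alpha^T K alpha
        g_manifold = 2 * lam_i / n**2 * (K @ (L @ f))  # from f^T L f
        alpha -= (g_loss + g_rkhs + g_manifold) / lip
    return alpha, K

def predict_proba(K, alpha):
    # KLR yields an explicit probability estimate rather than a hard label.
    return 1.0 / (1.0 + np.exp(-(K @ alpha)))
```

The sketch reflects two of the advantages listed in the abstract: the loss is smooth, so plain gradient descent applies, and the output is a probability. A multi-class variant would replace the logistic loss with a softmax loss.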
Hypergraph p-Laplacian Regularization for Remote Sensing Image Recognition
It is of great importance to preserve locality and similarity information in
semi-supervised learning (SSL) based applications. Graph based SSL and manifold
regularization based SSL including Laplacian regularization (LapR) and
Hypergraph Laplacian regularization (HLapR) are representative SSL methods and
have achieved prominent performance by exploiting the relationship of sample
distribution. However, it is still a great challenge to exactly explore and
exploit the local structure of the data distribution. In this paper, we present
an efficient and effective approximation algorithm for the Hypergraph p-Laplacian and
then propose Hypergraph p-Laplacian regularization (HpLapR) to preserve the
geometry of the probability distribution. In particular, p-Laplacian is a
nonlinear generalization of the standard graph Laplacian and Hypergraph is a
generalization of a standard graph. Therefore, the proposed HpLapR provides
more potential for exploiting and preserving the local structure. We apply HpLapR to
logistic regression and implement it for remote sensing image
recognition. We compare the proposed HpLapR to several popular manifold
regularization based SSL methods including LapR, HLapR and HpLapR on the UC-Merced
dataset. The experimental results demonstrate the superiority of the proposed
HpLapR.
Comment: 9 pages, 6 figure
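The abstract above describes the p-Laplacian as a nonlinear generalization of the standard graph Laplacian without stating it. As a hedged illustration, the commonly used smoothness functionals for an ordinary weighted graph are shown below; the symbols (`w_{ij}`, `f`, `D`, `W`, `S_p`) are assumptions introduced here, and the hypergraph version proposed in the paper is not reproduced.

```latex
% Standard graph Laplacian regularizer (p = 2), with symmetric edge weights w_{ij}:
\[
  f^{\top} L f \;=\; \frac{1}{2} \sum_{i,j} w_{ij}\,(f_i - f_j)^2 ,
  \qquad L = D - W .
\]
% Graph p-Laplacian energy, which recovers the case above at p = 2:
\[
  S_p(f) \;=\; \frac{1}{2} \sum_{i,j} w_{ij}\,\lvert f_i - f_j \rvert^{p} .
\]
```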
Graph Regularized Low Rank Representation for Aerosol Optical Depth Retrieval
In this paper, we propose a novel data-driven regression model for aerosol
optical depth (AOD) retrieval. First, we adopt a low rank representation (LRR)
model to learn a powerful representation of the spectral response. Then, graph
regularization is incorporated into the LRR model to capture the local
structure information and the nonlinear property of the remote-sensing data.
Since it is easy to acquire the rich satellite-retrieval results, we use them
as a baseline to construct the graph. Finally, the learned feature
representation is fed into a support vector machine (SVM) to retrieve AOD.
Experiments are conducted on two widely used data sets acquired by different
sensors, and the experimental results show that the proposed method can achieve
superior performance compared to the physical models and other state-of-the-art
empirical models.
Comment: 16 pages, 6 figure
Ensemble p-Laplacian Regularization for Remote Sensing Image Recognition
Recently, manifold regularized semi-supervised learning (MRSSL) has received
considerable attention because it successfully exploits the geometry of the
intrinsic data probability distribution including both labeled and unlabeled
samples to improve the performance of a learning model. As a natural nonlinear
generalization of the graph Laplacian, the p-Laplacian has been proven to have rich
theoretical foundations for better preserving the local structure. However, it is
difficult to determine the fitting graph p-Laplacian, i.e., the parameter p, which
is a critical factor for the performance of the graph p-Laplacian. Therefore, we
develop an ensemble p-Laplacian regularization (EpLapR) to fully approximate
the intrinsic manifold of the data distribution. EpLapR incorporates multiple
graphs into a regularization term in order to sufficiently explore the
complementarity of graph p-Laplacians. Specifically, we construct a fused graph
by introducing an optimization approach to assign suitable weights to different
p-value graphs. We then build a semi-supervised learning framework on the
fused graph. Extensive experiments on the UC-Merced data set demonstrate the
effectiveness and efficiency of the proposed method.
Comment: 13 pages, 7 figures. arXiv admin note: text overlap with
arXiv:1806.0810
When coding meets ranking: A joint framework based on local learning
Sparse coding, which represents a data point as a sparse reconstruction code
with regard to a dictionary, has been a popular data representation method.
Meanwhile, in database retrieval problems, learning the ranking scores from
data points plays an important role. Up to now, these two problems have always
been considered separately, assuming that data coding and ranking are two
independent and irrelevant problems. However, is there any internal
relationship between sparse coding and ranking score learning? If so, how can we
explore and make use of this internal relationship? In this paper, we try to
answer these questions by developing the first joint sparse coding and ranking
score learning algorithm. To explore the local distribution in the sparse code
space, and also to bridge coding and ranking problems, we assume that in the
neighborhood of each data point, the ranking scores can be approximated from
the corresponding sparse codes by a local linear function. By considering the
local approximation error of ranking scores, the reconstruction error and
sparsity of sparse coding, and the query information provided by the user, we
construct a unified objective function for learning of sparse codes, the
dictionary and ranking scores. We further develop an iterative algorithm to
solve this optimization problem
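For readers unfamiliar with the sparse coding step that the abstract above builds on, the following is a minimal sketch of computing a sparse code for one data point against a fixed dictionary using ISTA (iterative soft-thresholding). It covers only the standard L1-regularized coding problem, not the joint coding and ranking objective of the paper; the function names and the parameter `lam` are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, D, lam=0.1, steps=200):
    """Minimize 0.5 * ||x - D s||^2 + lam * ||s||_1 over the code s."""
    # Lipschitz constant of the gradient of the reconstruction term.
    lip = np.linalg.norm(D, 2) ** 2
    s = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = D.T @ (D @ s - x)  # gradient of the reconstruction error
        s = soft_threshold(s - grad / lip, lam / lip)
    return s
```

With an identity dictionary the solution is simply the soft-thresholded input, which gives a quick sanity check: small entries of `x` are set exactly to zero, which is the sparsity the abstract refers to.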
Auxiliary Image Regularization for Deep CNNs with Noisy Labels
Precisely labeled data sets with a sufficient number of samples are very
important for training deep convolutional neural networks (CNNs). However, many
of the available real-world data sets contain erroneously labeled samples and
those errors substantially hinder the learning of very accurate CNN models. In
this work, we consider the problem of training a deep CNN model for image
classification with mislabeled training samples - an issue that is common in
real image data sets with tags supplied by amateur users. To solve this
problem, we propose an auxiliary image regularization technique, optimized by
the stochastic Alternating Direction Method of Multipliers (ADMM) algorithm,
that automatically exploits the mutual context information among training
images and encourages the model to select reliable images to robustify the
learning process. Comprehensive experiments on benchmark data sets clearly
demonstrate that our proposed regularized CNN model is resistant to label noise in
training data.
Comment: Published as a conference paper at ICLR 201
Multi-modal Face Pose Estimation with Multi-task Manifold Deep Learning
Human face pose estimation aims at estimating the gaze direction or head
posture from 2D images. It provides important information for applications such as
communicative gesture recognition and saliency detection, and it has attracted
considerable attention recently. However, it is challenging because of complex backgrounds,
various orientations and variable face appearance visibility. Therefore, a descriptive
representation of face images and a mapping from it to poses are critical. In this
paper, we make use of multi-modal data and propose a face pose estimation
method built on a novel deep learning framework named Multi-task Manifold Deep
Learning. It is based on feature extraction with improved deep neural
networks and multi-modal mapping relationship with multi-task learning. In the
proposed deep learning based framework, Manifold Regularized Convolutional
Layers (MRCL) improve traditional convolutional layers by learning the
relationship among outputs of neurons. Besides, in the proposed mapping
relationship learning method, different modalities of face representations are
naturally combined to learn the mapping function from face images to poses. In
this way, the computed mapping model with multiple tasks is improved.
Experimental results on three challenging benchmark datasets DPOSE, HPID and
BKHPD demonstrate the outstanding performance of
Semi-supervised Learning on Graph with an Alternating Diffusion Process
Graph-based semi-supervised learning usually involves two separate stages,
constructing an affinity graph and then propagating labels for transductive
inference on the graph. It is suboptimal to solve them independently, as the
correlation between the affinity graph and labels is not fully exploited. In
this paper, we integrate the two stages into one unified framework by
formulating the graph construction as a regularized function estimation problem
similar to label propagation. We propose an alternating diffusion process to
solve the two problems simultaneously, which allows us to learn the graph and
unknown labels in an iterative fashion. With the proposed framework, we are
able to adequately leverage both the given labels and estimated labels to
construct a better graph, and effectively propagate labels on such a dynamic
graph updated simultaneously with the newly obtained labels. Extensive
experiments on various real-world datasets have demonstrated the superiority of
the proposed method compared to other state-of-the-art methods.
Comment: 7 pages, 2 figures, 2 table
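The two separate stages that the abstract above starts from (build an affinity graph, then propagate labels) can be sketched as follows. This is the classic two-stage baseline with a Gaussian affinity and closed-form propagation on the symmetrically normalized graph, not the alternating diffusion process proposed in the paper; the function names and the parameters `sigma` and `alpha` are assumptions.

```python
import numpy as np

def build_affinity(X, sigma=1.0):
    # Stage 1: dense Gaussian affinity graph without self-loops.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    return W

def propagate_labels(W, Y, alpha=0.9):
    """Stage 2: closed-form label propagation F = (I - alpha * S)^{-1} Y.

    Y is an (n, c) one-hot matrix with all-zero rows for unlabeled points.
    S = D^{-1/2} W D^{-1/2} is the symmetrically normalized affinity.
    """
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    F = np.linalg.solve(np.eye(len(W)) - alpha * S, Y)
    return F.argmax(1)  # predicted class index for every node
```

In this baseline the graph `W` is fixed before any labels are used, which is the independence between the two stages that the abstract identifies as suboptimal.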
Coarse-to-Fine Classification via Parametric and Nonparametric Models for Computer-Aided Diagnosis
Classification is one of the core problems in Computer-Aided Diagnosis (CAD),
targeting early cancer detection using 3D medical imaging interpretation.
High detection sensitivity with a desirably low false positive (FP) rate is
critical for a CAD system to be accepted as a valuable or even indispensable
tool in radiologists' workflow. Given the various sources of spurious image noise that
cause observation uncertainties, this remains a very challenging task. In this
paper, we propose a novel, two-tiered coarse-to-fine (CTF) classification
cascade framework to tackle this problem. We first obtain
classification-critical data samples (e.g., samples on the decision boundary)
extracted from the holistic data distributions using a robust parametric model
(e.g., \cite{Raykar08}); then we build a graph-embedding based nonparametric
classifier on sampled data, which can more accurately preserve or formulate the
complex classification boundary. These two steps can also be considered as
effective "sample pruning" and "feature pursuing + NN/template matching",
respectively. Our approach is validated comprehensively in CAD systems for colorectal polyp
detection and lung nodule detection, which target the top two deadliest cancers,
using hospital-scale, multi-site clinical datasets. The results show that our
method achieves overall better classification/detection performance than
existing state-of-the-art algorithms using single-layer classifiers, such as
the support vector machine variants \cite{Wang08}, boosting \cite{Slabaugh10},
logistic regression \cite{Ravesteijn10}, relevance vector machine
\cite{Raykar08}, k-nearest neighbor \cite{Murphy09} or spectral projections
on graph \cite{Cai08}
Learning Structured Ordinal Measures for Video based Face Recognition
This paper presents a structured ordinal measure method for video-based face
recognition that simultaneously learns ordinal filters and structured ordinal
features. The problem is posed as a non-convex integer programming problem that
includes two parts. The first part learns stable ordinal filters to project
video data into a large-margin ordinal space. The second seeks self-correcting
and discrete codes by balancing the projected data and a rank-one ordinal
matrix in a structured low-rank way. Unsupervised and supervised structures are
considered for the ordinal matrix. In addition, as a complement to hierarchical
structures, deep feature representations are integrated into our method to
enhance coding stability. An alternating minimization method is employed to
handle the discrete and low-rank constraints, yielding high-quality codes that
capture prior structures well. Experimental results on three commonly used face
video databases show that our method with a simple voting classifier can
achieve state-of-the-art recognition rates using fewer features and samples