3,505 research outputs found
Unsupervised Image Embedding Using Nonparametric Statistics
Embedding images into a low dimensional space has a wide range of applications: visualization, clustering, and pre-processing for supervised learning. Traditional dimension reduction algorithms assume that the examples densely populate the manifold. Image databases tend to break this assumption, having isolated islands of similar images instead. In this work, we propose a novel approach that embeds images into a low dimensional Euclidean space, while preserving local image similarities based on their scale invariant feature transform (SIFT) vectors. We make no neighborhood assumptions in our embedding. Our algorithm can also embed the images in a discrete grid, useful for many visualization tasks. We demonstrate the algorithm on images with known categories and compare our accuracy favorably to those of competing algorithms.
Coarse-to-Fine Classification via Parametric and Nonparametric Models for Computer-Aided Diagnosis
Classification is one of the core problems in Computer-Aided Diagnosis (CAD),
targeting for early cancer detection using 3D medical imaging interpretation.
High detection sensitivity with desirably low false positive (FP) rate is
critical for a CAD system to be accepted as a valuable or even indispensable
tool in radiologists' workflow. Given various spurious imagery noises which
cause observation uncertainties, this remains a very challenging task. In this
paper, we propose a novel, two-tiered coarse-to-fine (CTF) classification
cascade framework to tackle this problem. We first obtain
classification-critical data samples (e.g., samples on the decision boundary)
extracted from the holistic data distributions using a robust parametric model
(e.g., \cite{Raykar08}); then we build a graph-embedding based nonparametric
classifier on sampled data, which can more accurately preserve or formulate the
complex classification boundary. These two steps can also be considered as
effective "sample pruning" and "feature pursuing + NN/template matching",
respectively. Our approach is validated comprehensively in colorectal polyp
detection and lung nodule detection CAD systems, as the top two deadly cancers,
using hospital scale, multi-site clinical datasets. The results show that our
method achieves overall better classification/detection performance than
existing state-of-the-art algorithms using single-layer classifiers, such as
the support vector machine variants \cite{Wang08}, boosting \cite{Slabaugh10},
logistic regression \cite{Ravesteijn10}, relevance vector machine
\cite{Raykar08}, -nearest neighbor \cite{Murphy09} or spectral projections
on graph \cite{Cai08}
A spatio-spectral hybridization for edge preservation and noisy image restoration via local parametric mixtures and Lagrangian relaxation
This paper investigates a fully unsupervised statistical method for edge
preserving image restoration and compression using a spatial decomposition
scheme. Smoothed maximum likelihood is used for local estimation of edge pixels
from mixture parametric models of local templates. For the complementary smooth
part the traditional L2-variational problem is solved in the Fourier domain
with Thin Plate Spline (TPS) regularization. It is well known that naive
Fourier compression of the whole image fails to restore a piece-wise smooth
noisy image satisfactorily due to Gibbs phenomenon. Images are interpreted as
relative frequency histograms of samples from bi-variate densities where the
sample sizes might be unknown. The set of discontinuities is assumed to be
completely unsupervised Lebesgue-null, compact subset of the plane in the
continuous formulation of the problem. Proposed spatial decomposition uses a
widely used topological concept, partition of unity. The decision on edge pixel
neighborhoods are made based on the multiple testing procedure of Holms.
Statistical summary of the final output is decomposed into two layers of
information extraction, one for the subset of edge pixels and the other for the
smooth region. Robustness is also demonstrated by applying the technique on
noisy degradation of clean images.Comment: 29 Pages, 13 figure
Deep Transfer Learning with Joint Adaptation Networks
Deep networks have been successfully applied to learn transferable features
for adapting models from a source domain to a different target domain. In this
paper, we present joint adaptation networks (JAN), which learn a transfer
network by aligning the joint distributions of multiple domain-specific layers
across domains based on a joint maximum mean discrepancy (JMMD) criterion.
Adversarial training strategy is adopted to maximize JMMD such that the
distributions of the source and target domains are made more distinguishable.
Learning can be performed by stochastic gradient descent with the gradients
computed by back-propagation in linear-time. Experiments testify that our model
yields state of the art results on standard datasets.Comment: 34th International Conference on Machine Learnin
Kernels on Sample Sets via Nonparametric Divergence Estimates
Most machine learning algorithms, such as classification or regression, treat
the individual data point as the object of interest. Here we consider extending
machine learning algorithms to operate on groups of data points. We suggest
treating a group of data points as an i.i.d. sample set from an underlying
feature distribution for that group. Our approach employs kernel machines with
a kernel on i.i.d. sample sets of vectors. We define certain kernel functions
on pairs of distributions, and then use a nonparametric estimator to
consistently estimate those functions based on sample sets. The projection of
the estimated Gram matrix to the cone of symmetric positive semi-definite
matrices enables us to use kernel machines for classification, regression,
anomaly detection, and low-dimensional embedding in the space of distributions.
We present several numerical experiments both on real and simulated datasets to
demonstrate the advantages of our new approach.Comment: Substantially updated version as submitted to T-PAMI. 15 pages
including appendi
Deep Mean Maps
The use of distributions and high-level features from deep architecture has
become commonplace in modern computer vision. Both of these methodologies have
separately achieved a great deal of success in many computer vision tasks.
However, there has been little work attempting to leverage the power of these
to methodologies jointly. To this end, this paper presents the Deep Mean Maps
(DMMs) framework, a novel family of methods to non-parametrically represent
distributions of features in convolutional neural network models.
DMMs are able to both classify images using the distribution of top-level
features, and to tune the top-level features for performing this task. We show
how to implement DMMs using a special mean map layer composed of typical CNN
operations, making both forward and backward propagation simple.
We illustrate the efficacy of DMMs at analyzing distributional patterns in
image data in a synthetic data experiment. We also show that we extending
existing deep architectures with DMMs improves the performance of existing CNNs
on several challenging real-world datasets
Unsupervised Detection of Distinctive Regions on 3D Shapes
This paper presents a novel approach to learn and detect distinctive regions
on 3D shapes. Unlike previous works, which require labeled data, our method is
unsupervised. We conduct the analysis on point sets sampled from 3D shapes,
then formulate and train a deep neural network for an unsupervised shape
clustering task to learn local and global features for distinguishing shapes
with respect to a given shape set. To drive the network to learn in an
unsupervised manner, we design a clustering-based nonparametric softmax
classifier with an iterative re-clustering of shapes, and an adapted
contrastive loss for enhancing the feature embedding quality and stabilizing
the learning process. By then, we encourage the network to learn the point
distinctiveness on the input shapes. We extensively evaluate various aspects of
our approach and present its applications for distinctiveness-guided shape
retrieval, sampling, and view selection in 3D scenes.Comment: Accepted by ACM TO
Learning Robust Visual-Semantic Embeddings
Many of the existing methods for learning joint embedding of images and text
use only supervised information from paired images and its textual attributes.
Taking advantage of the recent success of unsupervised learning in deep neural
networks, we propose an end-to-end learning framework that is able to extract
more robust multi-modal representations across domains. The proposed method
combines representation learning models (i.e., auto-encoders) together with
cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
joint embeddings for semantic and visual features. A novel technique of
unsupervised-data adaptation inference is introduced to construct more
comprehensive embeddings for both labeled and unlabeled data. We evaluate our
method on Animals with Attributes and Caltech-UCSD Birds 200-2011 dataset with
a wide range of applications, including zero and few-shot image recognition and
retrieval, from inductive to transductive settings. Empirically, we show that
our framework improves over the current state of the art on many of the
considered tasks.Comment: 12 page
Geodesic Clustering in Deep Generative Models
Deep generative models are tremendously successful in learning
low-dimensional latent representations that well-describe the data. These
representations, however, tend to much distort relationships between points,
i.e. pairwise distances tend to not reflect semantic similarities well. This
renders unsupervised tasks, such as clustering, difficult when working with the
latent representations. We demonstrate that taking the geometry of the
generative model into account is sufficient to make simple clustering
algorithms work well over latent representations. Leaning on the recent finding
that deep generative models constitute stochastically immersed Riemannian
manifolds, we propose an efficient algorithm for computing geodesics (shortest
paths) and computing distances in the latent space, while taking its distortion
into account. We further propose a new architecture for modeling uncertainty in
variational autoencoders, which is essential for understanding the geometry of
deep generative models. Experiments show that the geodesic distance is very
likely to reflect the internal structure of the data
Scene Parsing with Global Context Embedding
We present a scene parsing method that utilizes global context information
based on both the parametric and non- parametric models. Compared to previous
methods that only exploit the local relationship between objects, we train a
context network based on scene similarities to generate feature representations
for global contexts. In addition, these learned features are utilized to
generate global and spatial priors for explicit classes inference. We then
design modules to embed the feature representations and the priors into the
segmentation network as additional global context cues. We show that the
proposed method can eliminate false positives that are not compatible with the
global context representations. Experiments on both the MIT ADE20K and PASCAL
Context datasets show that the proposed method performs favorably against
existing methods.Comment: Accepted in ICCV'17. Code available at
https://github.com/hfslyc/GCPNe
- …