Lipschitz Parametrization of Probabilistic Graphical Models
We show that the log-likelihood of several probabilistic graphical models is
Lipschitz continuous with respect to the lp-norm of the parameters. We discuss
several implications of Lipschitz parametrization. We present an upper bound of
the Kullback-Leibler divergence that allows understanding methods that penalize
the lp-norm of differences of parameters as the minimization of that upper
bound. The expected log-likelihood is lower bounded by the negative lp-norm,
which allows understanding the generalization ability of probabilistic models.
The exponential of the negative lp-norm is involved in the lower bound of the
Bayes error rate, which shows that it is reasonable to use parameters as
features in algorithms that rely on metric spaces (e.g. classification,
dimensionality reduction, clustering). Our results do not rely on specific
algorithms for learning the structure or parameters. We show preliminary
results for activity recognition and temporal segmentation.
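As a toy numerical check of the Lipschitz property discussed above (my own illustration, not one of the paper's graphical models), the Bernoulli log-likelihood in its natural parametrization is 1-Lipschitz, since its derivative x - sigmoid(eta) is bounded by 1 in absolute value:

```python
import numpy as np

rng = np.random.default_rng(0)

def loglik(eta, x):
    # Bernoulli log-likelihood with natural parameter eta:
    # log p(x | eta) = x * eta - log(1 + exp(eta))
    return x * eta - np.logaddexp(0.0, eta)

# |d/d-eta loglik| = |x - sigmoid(eta)| <= 1, so the log-likelihood is
# 1-Lipschitz in eta; check the bound numerically on random draws.
for _ in range(1000):
    e1, e2 = rng.normal(size=2) * 5.0
    x = rng.integers(0, 2)
    assert abs(loglik(e1, x) - loglik(e2, x)) <= abs(e1 - e2) + 1e-9
```

The same one-parameter argument is what a bound on the gradient norm delivers in general: a bounded score function makes the log-likelihood Lipschitz in the parameters.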
Classification using log Gaussian Cox processes
McCullagh and Yang (2006) suggest a family of classification algorithms based
on Cox processes. We further investigate the log Gaussian variant which has a
number of appealing properties. Conditioned on the covariates, the distribution
over labels is given by a type of conditional Markov random field. In the
supervised case, computation of the predictive probability of a single test
point scales linearly with the number of training points and the multiclass
generalization is straightforward. We show new links between the supervised
method and classical nonparametric methods. We give a detailed analysis of the
pairwise graph representable Markov random field, which we use to extend the
model to semi-supervised learning problems, and propose an inference method
based on graph min-cuts. We give the first experimental analysis on supervised
and semi-supervised datasets and show good empirical performance. Comment: 17 pages, 6 figures.
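The graph min-cut inference mentioned above can be sketched for binary semi-supervised labeling: labeled nodes are clamped to a source/sink with large capacity, pairwise similarities become edge capacities, and the s-t minimum cut partitions the unlabeled nodes. This is a self-contained toy (function names and the example graph are hypothetical, not the paper's implementation):

```python
from collections import defaultdict, deque

def min_cut_labels(n, edges, pos, neg, big=10**6):
    """Binary labels for nodes 0..n-1 from an s-t minimum cut.

    edges: (u, v, capacity) similarity edges; pos/neg: labeled node sets.
    """
    S, T = n, n + 1
    cap = defaultdict(int)
    adj = defaultdict(set)

    def add(u, v, c):
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)

    for u, v, c in edges:          # symmetric similarity edges
        add(u, v, c)
        add(v, u, c)
    for p in pos:                  # clamp labeled nodes to source/sink
        add(S, p, big)
    for q in neg:
        add(q, T, big)

    # Edmonds-Karp max flow: augment along BFS shortest paths.
    while True:
        parent = {S: None}
        dq = deque([S])
        while dq and T not in parent:
            u = dq.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    dq.append(v)
        if T not in parent:
            break
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        f = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= f
            cap[(v, u)] += f

    # Nodes reachable from S in the residual graph get the positive label.
    seen = {S}
    dq = deque([S])
    while dq:
        u = dq.popleft()
        for v in adj[u]:
            if v not in seen and cap[(u, v)] > 0:
                seen.add(v)
                dq.append(v)
    return [1 if i in seen else 0 for i in range(n)]
```

On a chain 0-1-2-3-4 with a weak link between nodes 2 and 3, labeling node 0 positive and node 4 negative cuts at the bottleneck, so nodes 0-2 inherit the positive label.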
Generative Adversarial Networks and Conditional Random Fields for Hyperspectral Image Classification
In this paper, we address the hyperspectral image (HSI) classification task
with a generative adversarial network and conditional random field
(GAN-CRF)-based framework, which integrates semi-supervised deep learning with
a probabilistic graphical model, and make three contributions. First, we design
four types of convolutional and transposed convolutional layers that consider
the characteristics of HSIs to help with extracting discriminative features
from limited numbers of labeled HSI samples. Second, we construct
semi-supervised GANs to alleviate the shortage of training samples by adding
labels to them and implicitly reconstructing real HSI data distribution through
adversarial training. Third, we build dense conditional random fields (CRFs) on
top of the random variables that are initialized to the softmax predictions of
the trained GANs and are conditioned on HSIs to refine classification maps.
This semi-supervised framework leverages the merits of discriminative and
generative models through a game-theoretical approach. Moreover, even though we
used very small numbers of labeled training HSI samples from the two most
challenging and extensively studied datasets, the experimental results
demonstrated that spectral-spatial GAN-CRF (SS-GAN-CRF) models achieved
top-ranking accuracy for semi-supervised HSI classification. Comment: Accepted by IEEE T-CY
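The CRF refinement step above can be illustrated with a much simpler grid CRF: a mean-field sketch (my own toy with a Potts-style neighbour term, not the paper's dense CRF with Gaussian pairwise kernels) that smooths per-pixel softmax maps toward neighbour agreement:

```python
import numpy as np

def meanfield_refine(unary, n_iters=10, w=1.0):
    """Refine per-pixel softmax maps with a Potts-style mean-field CRF.

    unary: (H, W, K) softmax probabilities from a classifier.
    Neighbouring pixels are encouraged to share labels (weight w).
    """
    logu = np.log(unary + 1e-12)
    q = unary.copy()
    for _ in range(n_iters):
        # message: sum of neighbours' current beliefs on a 4-connected grid
        msg = np.zeros_like(q)
        msg[1:, :] += q[:-1, :]
        msg[:-1, :] += q[1:, :]
        msg[:, 1:] += q[:, :-1]
        msg[:, :-1] += q[:, 1:]
        s = logu + w * msg
        s -= s.max(axis=-1, keepdims=True)   # stable softmax
        q = np.exp(s)
        q /= q.sum(axis=-1, keepdims=True)
    return q
```

A lone noisy pixel whose softmax mildly prefers the wrong class is pulled back to the class of its neighbours, which is exactly the map-cleaning effect the abstract describes.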
The Nataf-Beta Random Field Classifier: An Extension of the Beta Conjugate Prior to Classification Problems
This paper presents the Nataf-Beta Random Field Classifier, a discriminative
approach that extends the applicability of the Beta conjugate prior to
classification problems. The approach's key feature is to model the probability
of a class conditional on attribute values as a random field whose marginals
are Beta distributed, and where the parameters of marginals are themselves
described by random fields. Although the classification accuracy of the
proposed approach does not statistically outperform the best accuracies
reported in the literature, it ranks among the top tier for the six benchmark
datasets tested. The Nataf-Beta Random Field Classifier is suited as a
general-purpose classification approach for real-continuous and real-integer attribute
value problems. Comment: 17 pages, 4 figures. Submitted for publication in the Journal of Machine Learning Research.
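For reference, the classical Beta-Bernoulli conjugate update that the paper extends (a textbook sketch, not the Nataf-Beta model itself): the Beta prior's parameters simply accumulate observed successes and failures.

```python
def beta_update(alpha, beta, successes, failures):
    # Beta(alpha, beta) prior + Bernoulli counts -> Beta posterior
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    # posterior mean of the success probability
    return alpha / (alpha + beta)

# uniform Beta(1, 1) prior, 7 successes and 3 failures
a, b = beta_update(1.0, 1.0, successes=7, failures=3)
# posterior is Beta(8, 4), with mean 8 / 12 = 2/3
```

The Nataf-Beta construction keeps Beta marginals of this kind but couples them spatially through a random field, which is what turns the conjugate prior into a classifier.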
A Large Scale Spatio-temporal Binomial Regression Model for Estimating Seroprevalence Trends
This paper develops a large-scale Bayesian spatio-temporal binomial
regression model for the purpose of investigating regional trends in antibody
prevalence to Borrelia burgdorferi, the causative agent of Lyme disease. The
proposed model uses Gaussian predictive processes to estimate the spatially
varying trends and a conditional autoregressive model to account for
spatio-temporal dependence. Careful consideration is made to develop a novel
framework that is scalable to large spatio-temporal data. The proposed model is
used to analyze approximately 16 million Borrelia burgdorferi test results
collected on dogs located throughout the conterminous United States over a
sixty month period. This analysis identifies several regions of increasing
canine risk. Specifically, this analysis reveals evidence that Lyme disease is
getting worse in some endemic regions and that it could potentially be
spreading to other non-endemic areas. Further, given the zoonotic nature of
this vector-borne disease, this analysis could potentially reveal areas of
increasing human risk. Comment: 19 pages without figures. All figures are available as ancillary files.
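The binomial regression core of such a model (stripped of the spatial random effects and the Bayesian machinery) can be sketched as a logit-link GLM fit by gradient ascent; the function and data below are hypothetical illustrations, not the paper's model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_binomial_glm(X, y, n, lr=0.1, iters=5000):
    """Fit y_i ~ Binomial(n_i, sigmoid(x_i @ w)) by maximum likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ w)
        grad = X.T @ (y - n * p)      # score of the binomial log-likelihood
        w += lr * grad / len(y)
    return w

# synthetic check: with noise-free counts the MLE recovers the true slope
X = np.array([[-1.0], [0.0], [1.0], [2.0]])
n = 20
y = n * sigmoid(X @ np.array([1.5]))
w_hat = fit_binomial_glm(X, y, n)
```

Replacing the fixed coefficient with spatially varying effects (via predictive processes and a conditional autoregressive prior, as the abstract describes) is what makes the full model scale to regional trend estimation.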
On the initial shear field of the cosmic web
The initial shear field, characterized by a primordial perturbation
potential, plays a crucial role in the formation of large scale structures.
Hence, considerable analytic work has been based on the joint distribution of
its eigenvalues, associated with Gaussian statistics. In addition, directly
related morphological quantities such as ellipticity or prolateness are
essential tools in understanding the formation and structural properties of
halos, voids, sheets and filaments, their relation with the local environment,
and the geometrical and dynamical classification of the cosmic web. To date,
most analytic work has been focused on Doroshkevich's unconditional formulae
for the eigenvalues of the linear tidal field, which neglect the fact that
halos (voids) may correspond to maxima (minima) of the density field. I present
here new formulae for the constrained eigenvalues of the initial shear field
associated with Gaussian statistics, which include the fact that those
eigenvalues are related to regions where the source of the displacement is
positive (negative): this is achieved by requiring the Hessian matrix of the
displacement field to be positive (negative) definite. The new conditional
formulae naturally reduce to Doroshkevich's unconditional relations, in the
limit of no correlation between the potential and the density fields. As a
direct application, I derive the individual conditional distributions of
eigenvalues and point out the connection with previous literature. Finally, I
outline other possible theoretically- or observationally-oriented uses, ranging
from studies of halo and void triaxial formation and the development of
structure-finding algorithms for the morphology and topology of the cosmic web,
to an accurate mapping of the gravitational potential environment of galaxies
from current and future generation galaxy redshift surveys. Comment: 12 pages, 3 figures. MNRAS in press.
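The eigenvalue-based classification of the cosmic web mentioned above can be sketched with the standard counting rule: the number of shear-tensor eigenvalues above a collapse threshold fixes the morphology. The code below is an illustration of that rule only (the tensors are arbitrary, not draws from the paper's constrained distributions):

```python
import numpy as np

def web_type(T, threshold=0.0):
    # classify a symmetric 3x3 shear/tidal tensor by the number of
    # eigenvalues above the collapse threshold:
    # 3 -> halo (knot), 2 -> filament, 1 -> sheet, 0 -> void
    n = int(np.sum(np.linalg.eigvalsh(T) > threshold))
    return ["void", "sheet", "filament", "halo"][n]
```

For instance, a tensor with all eigenvalues positive (collapse along all three axes) classifies as a halo, while one with all eigenvalues negative classifies as a void; the paper's conditional formulae constrain exactly these sign patterns through the definiteness of the Hessian of the displacement.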
Exponential Families for Conditional Random Fields
In this paper we define conditional random fields in reproducing kernel Hilbert
spaces and show connections to Gaussian process classification. More
specifically, we prove decomposition results for undirected graphical models
and we give constructions for kernels. Finally, we present efficient means of
solving the optimization problem using reduced-rank decompositions and we show
how stationarity can be exploited efficiently in the optimization
process. Comment: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI 2004).
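A reduced-rank decomposition of a kernel matrix, in the spirit the abstract describes, can be sketched with the generic Nyström approximation (an illustrative stand-in, not the paper's construction):

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # squared-exponential kernel matrix between row sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(X, m, gamma=0.5):
    """Rank-m Nystrom approximation K ~ C W^+ C^T, using the first
    m rows of X as landmark points."""
    C = rbf_kernel(X, X[:m], gamma)   # n x m cross-kernel
    W = C[:m]                         # m x m landmark kernel
    return C @ np.linalg.pinv(W) @ C.T
```

With m much smaller than n, all downstream solves cost O(n m^2) instead of O(n^3); with m = n landmarks the approximation reproduces the full kernel matrix exactly.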
Machine learning based hyperspectral image analysis: A survey
Hyperspectral sensors enable the study of the chemical properties of scene
materials remotely for the purpose of identification, detection, and chemical
composition analysis of objects in the environment. Hence, hyperspectral images
captured from earth observing satellites and aircraft have been increasingly
important in agriculture, environmental monitoring, urban planning, mining, and
defense. Machine learning algorithms, due to their outstanding predictive power,
have become a key tool for modern hyperspectral image analysis. Therefore, a
solid understanding of machine learning techniques has become essential for
remote sensing researchers and practitioners. This paper reviews and compares
recent machine learning-based hyperspectral image analysis methods published in
literature. We organize the methods by the image analysis task and by the type
of machine learning algorithm, and present a two-way mapping between the image
analysis tasks and the types of machine learning algorithms that can be applied
to them. The paper is comprehensive in coverage of both hyperspectral image
analysis tasks and machine learning algorithms. The image analysis tasks
considered are land cover classification, target detection, unmixing, and
physical parameter estimation. The machine learning algorithms covered are
Gaussian models, linear regression, logistic regression, support vector
machines, Gaussian mixture models, latent linear models, sparse linear models,
ensemble learning, directed graphical models, undirected graphical models,
clustering, Gaussian processes, Dirichlet
processes, and deep learning. We also discuss the open challenges in the field
of hyperspectral image analysis and explore possible future directions.
Multilevel Discretized Random Field Models with "Spin" Correlations for the Simulation of Environmental Spatial Data
A problem of practical significance is the analysis of large, spatially
distributed data sets. The problem is more challenging for variables that
follow non-Gaussian distributions. We show that the spatial correlations
between variables can be captured by interactions between "spins". The spins
represent multilevel discretizations of the initial field with respect to a
number of pre-defined thresholds. The spatial dependence between the "spins" is
imposed by means of short-range interactions. We present two approaches,
inspired by the Ising and Potts models, that generate conditional simulations
from samples with missing data. The simulations of the "spin system" are forced
to respect locally the sample values and the system statistics globally. We
compare the two approaches in terms of their ability to reproduce the sample
statistical properties, to predict data at unsampled locations, as well as in
terms of their computational complexity. We discuss the impact of relevant
simulation parameters, such as the domain size, the number of discretization
levels, and the initial conditions. Comment: 20 pages, 8 figures. Presented at the Sigma Phi 2008 conference, http://www2.polito.it/eventi/sigmaphi2008
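The Ising-inspired conditional simulation described above can be sketched with a Gibbs sampler that clamps observed sites to the sample data and resamples the rest; the grid, parameters, and function name are hypothetical illustrations:

```python
import numpy as np

rng = np.random.default_rng(2)

def conditional_ising(spins, observed, beta=1.0, sweeps=50):
    """Gibbs-sample an Ising model on a grid, keeping observed sites fixed.

    spins:    (H, W) array of +/-1 initial states
    observed: (H, W) boolean mask; True sites stay clamped to the data
    """
    H, W = spins.shape
    s = spins.copy()
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                if observed[i, j]:
                    continue          # respect the sample values locally
                nb = 0.0              # sum of nearest-neighbour spins
                if i > 0:
                    nb += s[i - 1, j]
                if i < H - 1:
                    nb += s[i + 1, j]
                if j > 0:
                    nb += s[i, j - 1]
                if j < W - 1:
                    nb += s[i, j + 1]
                p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * nb))
                s[i, j] = 1 if rng.random() < p_up else -1
    return s
```

Clamping the border of a grid to +1 and sampling at a high coupling fills the interior with mostly +1 spins, which is the "respect the data locally, match the statistics globally" behaviour the abstract describes; the multilevel discretization of the paper stacks several such spin layers, one per threshold.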
Semi-supervised learning for structured regression on partially observed attributed graphs
Conditional probabilistic graphical models provide a powerful framework for
structured regression in spatio-temporal datasets with complex correlation
patterns. However, in real-life applications a large fraction of observations
is often missing, which can severely limit the representational power of these
models. In this paper we propose a Marginalized Gaussian Conditional Random
Fields (m-GCRF) structured regression model for dealing with missing labels in
partially observed temporal attributed graphs. This method is aimed at learning
with both labeled and unlabeled parts and effectively predicting future values
in a graph. The method is even capable of learning from nodes for which the
response variable is never observed in history, which poses problems for many
state-of-the-art models that can handle missing data. The proposed model is
characterized for various missingness mechanisms on 500 synthetic graphs. The
benefits of the new method are also demonstrated on a challenging application
for predicting precipitation based on partial observations of climate variables
in a temporal graph that spans the entire continental US. We also show that the
method can be useful for optimizing the costs of data collection in climate
applications via active reduction of the number of weather stations to
consider. In experiments on these real-world and synthetic datasets we show
that the proposed model is consistently more accurate than alternative
semi-supervised structured models, as well as models that either use imputation
to deal with missing values or simply ignore them altogether. Comment: Proceedings of the 2015 SIAM International Conference on Data Mining (SDM 2015), Vancouver, Canada, April 30 - May 02, 2015.
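At prediction time, handling missing labels in a Gaussian CRF reduces to standard Gaussian conditioning: the unobserved coordinates are inferred from the observed ones. A generic sketch of that step (not the m-GCRF learning procedure itself):

```python
import numpy as np

def condition_gaussian(mu, cov, obs_idx, obs_val):
    """Posterior mean of the unobserved coordinates of N(mu, cov),
    given exact observations of the coordinates in obs_idx."""
    n = len(mu)
    hid = [i for i in range(n) if i not in obs_idx]
    S_ho = cov[np.ix_(hid, obs_idx)]            # hidden-observed block
    S_oo = cov[np.ix_(obs_idx, obs_idx)]        # observed-observed block
    return mu[hid] + S_ho @ np.linalg.solve(S_oo, obs_val - mu[obs_idx])
```

With two strongly correlated labels, observing one pulls the estimate of the other toward the observed value in proportion to the covariance, which is how information propagates from labeled to unlabeled nodes in the graph.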