37,116 research outputs found
Gibbs Max-margin Topic Models with Data Augmentation
Max-margin learning is a powerful approach to building classifiers and
structured output predictors. Recent work on max-margin supervised topic models
has successfully integrated it with Bayesian topic models to discover
discriminative latent semantic structures and make accurate predictions for
unseen testing data. However, the resulting learning problems are usually hard
to solve because of the non-smoothness of the margin loss. Existing approaches
to building max-margin supervised topic models rely on an iterative procedure
to solve multiple latent SVM subproblems with additional mean-field assumptions
on the desired posterior distributions. This paper presents an alternative
approach by defining a new max-margin loss. Namely, we present Gibbs max-margin
supervised topic models, a latent variable Gibbs classifier to discover hidden
topic representations for various tasks, including classification, regression
and multi-task learning. Gibbs max-margin supervised topic models minimize an
expected margin loss, which is an upper bound of the existing margin loss
derived from an expected prediction rule. By introducing augmented variables
and integrating out the Dirichlet variables analytically by conjugacy, we
develop simple Gibbs sampling algorithms with no restricting assumptions and no
need to solve SVM subproblems. Furthermore, each step of the
"augment-and-collapse" Gibbs sampling algorithms has an analytical conditional
distribution, from which samples can be easily drawn. Experimental results
demonstrate significant improvements on time efficiency. The classification
performance is also significantly improved over competitors on binary,
multi-class and multi-label classification tasks.Comment: 35 page
PAC-Bayes and Domain Adaptation
We provide two main contributions in PAC-Bayesian theory for domain
adaptation where the objective is to learn, from a source distribution, a
well-performing majority vote on a different, but related, target distribution.
Firstly, we propose an improvement of the previous approach we proposed in
Germain et al. (2013), which relies on a novel distribution pseudodistance
based on a disagreement averaging, allowing us to derive a new tighter domain
adaptation bound for the target risk. While this bound stands in the spirit of
common domain adaptation works, we derive a second bound (introduced in Germain
et al., 2016) that brings a new perspective on domain adaptation by deriving an
upper bound on the target risk where the distributions' divergence-expressed as
a ratio-controls the trade-off between a source error measure and the target
voters' disagreement. We discuss and compare both results, from which we obtain
PAC-Bayesian generalization bounds. Furthermore, from the PAC-Bayesian
specialization to linear classifiers, we infer two learning algorithms, and we
evaluate them on real data.Comment: Neurocomputing, Elsevier, 2019. arXiv admin note: substantial text
overlap with arXiv:1503.0694
Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care
In the field of quality of health care measurement, one approach to assessing
patient sickness at admission involves a logistic regression of mortality
within 30 days of admission on a fairly large number of sickness indicators (on
the order of 100) to construct a sickness scale, employing classical variable
selection methods to find an ``optimal'' subset of 10--20 indicators. Such
``benefit-only'' methods ignore the considerable differences among the sickness
indicators in cost of data collection, an issue that is crucial when admission
sickness is used to drive programs (now implemented or under consideration in
several countries, including the U.S. and U.K.) that attempt to identify
substandard hospitals by comparing observed and expected mortality rates (given
admission sickness). When both data-collection cost and accuracy of prediction
of 30-day mortality are considered, a large variable-selection problem arises
in which costly variables that do not predict well enough should be omitted
from the final scale. In this paper (a) we develop a method for solving this
problem based on posterior model odds, arising from a prior distribution that
(1) accounts for the cost of each variable and (2) results in a set of
posterior model probabilities that corresponds to a generalized cost-adjusted
version of the Bayesian information criterion (BIC), and (b) we compare this
method with a decision-theoretic cost-benefit approach based on maximizing
expected utility. We use reversible-jump Markov chain Monte Carlo (RJMCMC)
methods to search the model space, and we check the stability of our findings
with two variants of the MCMC model composition () algorithm.Comment: Published in at http://dx.doi.org/10.1214/08-AOAS207 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Sequential Design for Ranking Response Surfaces
We propose and analyze sequential design methods for the problem of ranking
several response surfaces. Namely, given response surfaces over a
continuous input space , the aim is to efficiently find the index of
the minimal response across the entire . The response surfaces are not
known and have to be noisily sampled one-at-a-time. This setting is motivated
by stochastic control applications and requires joint experimental design both
in space and response-index dimensions. To generate sequential design
heuristics we investigate stepwise uncertainty reduction approaches, as well as
sampling based on posterior classification complexity. We also make connections
between our continuous-input formulation and the discrete framework of pure
regret in multi-armed bandits. To model the response surfaces we utilize
kriging surrogates. Several numerical examples using both synthetic data and an
epidemics control problem are provided to illustrate our approach and the
efficacy of respective adaptive designs.Comment: 26 pages, 7 figures (updated several sections and figures
Bayesian functional linear regression with sparse step functions
The functional linear regression model is a common tool to determine the
relationship between a scalar outcome and a functional predictor seen as a
function of time. This paper focuses on the Bayesian estimation of the support
of the coefficient function. To this aim we propose a parsimonious and adaptive
decomposition of the coefficient function as a step function, and a model
including a prior distribution that we name Bayesian functional Linear
regression with Sparse Step functions (Bliss). The aim of the method is to
recover areas of time which influences the most the outcome. A Bayes estimator
of the support is built with a specific loss function, as well as two Bayes
estimators of the coefficient function, a first one which is smooth and a
second one which is a step function. The performance of the proposed
methodology is analysed on various synthetic datasets and is illustrated on a
black P\'erigord truffle dataset to study the influence of rainfall on the
production
Robust estimation of risks from small samples
Data-driven risk analysis involves the inference of probability distributions
from measured or simulated data. In the case of a highly reliable system, such
as the electricity grid, the amount of relevant data is often exceedingly
limited, but the impact of estimation errors may be very large. This paper
presents a robust nonparametric Bayesian method to infer possible underlying
distributions. The method obtains rigorous error bounds even for small samples
taken from ill-behaved distributions. The approach taken has a natural
interpretation in terms of the intervals between ordered observations, where
allocation of probability mass across intervals is well-specified, but the
location of that mass within each interval is unconstrained. This formulation
gives rise to a straightforward computational resampling method: Bayesian
Interval Sampling. In a comparison with common alternative approaches, it is
shown to satisfy strict error bounds even for ill-behaved distributions.Comment: 13 pages, 3 figures; supplementary information provided. A revised
version of this manuscript has been accepted for publication in Philosophical
Transactions of the Royal Society A: Mathematical, Physical and Engineering
Science
- …