
    Gibbs Max-margin Topic Models with Data Augmentation

    Max-margin learning is a powerful approach to building classifiers and structured output predictors. Recent work on max-margin supervised topic models has successfully integrated it with Bayesian topic models to discover discriminative latent semantic structures and make accurate predictions for unseen testing data. However, the resulting learning problems are usually hard to solve because of the non-smoothness of the margin loss. Existing approaches to building max-margin supervised topic models rely on an iterative procedure to solve multiple latent SVM subproblems with additional mean-field assumptions on the desired posterior distributions. This paper presents an alternative approach by defining a new max-margin loss. Namely, we present Gibbs max-margin supervised topic models, a latent variable Gibbs classifier to discover hidden topic representations for various tasks, including classification, regression and multi-task learning. Gibbs max-margin supervised topic models minimize an expected margin loss, which is an upper bound of the existing margin loss derived from an expected prediction rule. By introducing augmented variables and integrating out the Dirichlet variables analytically by conjugacy, we develop simple Gibbs sampling algorithms with no restrictive assumptions and no need to solve SVM subproblems. Furthermore, each step of the "augment-and-collapse" Gibbs sampling algorithms has an analytical conditional distribution, from which samples can be easily drawn. Experimental results demonstrate significant improvements in time efficiency. The classification performance is also significantly improved over competitors on binary, multi-class and multi-label classification tasks.
    Comment: 35 pages
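    As a rough illustration of the upper-bound relation mentioned in the abstract (the notation here is schematic and not necessarily the paper's): if f(η, z) denotes a latent prediction rule and the margin loss is a hinge, convexity of the hinge gives, via Jensen's inequality,
        % Schematic only: the hinge loss of the expected prediction rule is
        % bounded above by the expected hinge loss, since the hinge is convex.
        \[
        \max\!\bigl(0,\ \ell - y\,\mathbb{E}_q[f(\eta, z)]\bigr)
        \ \le\
        \mathbb{E}_q\!\bigl[\max\bigl(0,\ \ell - y\, f(\eta, z)\bigr)\bigr],
        \]
    so minimizing the right-hand side (an expected margin loss) also controls the margin loss of the expected prediction rule, which is the quantity the earlier formulations work with.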

    PAC-Bayes and Domain Adaptation

    We provide two main contributions in PAC-Bayesian theory for domain adaptation, where the objective is to learn, from a source distribution, a well-performing majority vote on a different, but related, target distribution. Firstly, we propose an improvement of the approach we previously proposed in Germain et al. (2013), which relies on a novel pseudodistance between distributions based on disagreement averaging, allowing us to derive a new, tighter domain adaptation bound for the target risk. While this bound stands in the spirit of common domain adaptation works, we derive a second bound (introduced in Germain et al., 2016) that brings a new perspective on domain adaptation by giving an upper bound on the target risk in which the distributions' divergence, expressed as a ratio, controls the trade-off between a source error measure and the target voters' disagreement. We discuss and compare both results, from which we obtain PAC-Bayesian generalization bounds. Furthermore, from the PAC-Bayesian specialization to linear classifiers, we infer two learning algorithms and evaluate them on real data.
    Comment: Neurocomputing, Elsevier, 2019. arXiv admin note: substantial text overlap with arXiv:1503.0694
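    As a hedged sketch of the kind of quantity involved, the snippet below estimates a disagreement-based gap between unlabeled source and target samples for a weighted set of voters. The voters, weights and data are placeholder choices of ours, not the constructions or learning algorithms of Germain et al.
        import numpy as np

        # Illustrative only: random linear voters and a uniform posterior rho;
        # the paper's pseudodistance and learned posteriors are more involved.
        rng = np.random.default_rng(0)
        d, n_voters = 5, 20
        W = rng.normal(size=(n_voters, d))         # one weight vector per voter
        rho = np.full(n_voters, 1.0 / n_voters)    # uniform weights over voters

        def expected_disagreement(X, rho):
            """E_{h,h'~rho} E_x[ h(x) != h'(x) ] on a sample X (votes in {-1,+1})."""
            V = np.sign(X @ W.T)                   # (n_points, n_voters) votes
            m = V @ rho                            # rho-weighted margin per point
            return np.mean((1.0 - m ** 2) / 2.0)   # pairwise disagreement probability

        X_source = rng.normal(size=(500, d))
        X_target = rng.normal(loc=0.5, size=(500, d))   # shifted, "related" domain

        # A disagreement-based gap between the two domains (illustrative form).
        gap = abs(expected_disagreement(X_target, rho)
                  - expected_disagreement(X_source, rho))
        print(f"disagreement gap between domains: {gap:.4f}")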

    Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care

    In the field of quality of health care measurement, one approach to assessing patient sickness at admission involves a logistic regression of mortality within 30 days of admission on a fairly large number of sickness indicators (on the order of 100) to construct a sickness scale, employing classical variable selection methods to find an "optimal" subset of 10–20 indicators. Such "benefit-only" methods ignore the considerable differences among the sickness indicators in cost of data collection, an issue that is crucial when admission sickness is used to drive programs (now implemented or under consideration in several countries, including the U.S. and U.K.) that attempt to identify substandard hospitals by comparing observed and expected mortality rates (given admission sickness). When both data-collection cost and accuracy of prediction of 30-day mortality are considered, a large variable-selection problem arises in which costly variables that do not predict well enough should be omitted from the final scale. In this paper (a) we develop a method for solving this problem based on posterior model odds, arising from a prior distribution that (1) accounts for the cost of each variable and (2) results in a set of posterior model probabilities that corresponds to a generalized cost-adjusted version of the Bayesian information criterion (BIC), and (b) we compare this method with a decision-theoretic cost-benefit approach based on maximizing expected utility. We use reversible-jump Markov chain Monte Carlo (RJMCMC) methods to search the model space, and we check the stability of our findings with two variants of the MCMC model composition ($MC^3$) algorithm.
    Comment: Published at http://dx.doi.org/10.1214/08-AOAS207 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
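    One schematic way to fold per-variable data-collection costs into a BIC-style penalty is shown below; the exact cost-adjusted criterion in the paper follows from its prior on models, so the form here (with a tuning weight λ and costs c_j, both our notation) is only indicative.
        \[
        \mathrm{BIC}_{\gamma} \;=\; -2\log \hat{L}_{\gamma} + k_{\gamma}\log n,
        \qquad
        \mathrm{BIC}^{\mathrm{cost}}_{\gamma} \;=\; -2\log \hat{L}_{\gamma}
        + \sum_{j \in \gamma}\bigl(\log n + \lambda\, c_j\bigr),
        \]
    where γ indexes a subset of indicators, k_γ is its size, c_j is the cost of collecting variable j, and λ ≥ 0 trades predictive accuracy against cost; λ = 0 recovers the ordinary BIC.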

    Sequential Design for Ranking Response Surfaces

    We propose and analyze sequential design methods for the problem of ranking several response surfaces. Namely, given $L \ge 2$ response surfaces over a continuous input space $\mathcal{X}$, the aim is to efficiently find the index of the minimal response across the entire $\mathcal{X}$. The response surfaces are not known and have to be noisily sampled one at a time. This setting is motivated by stochastic control applications and requires joint experimental design in both the spatial and response-index dimensions. To generate sequential design heuristics we investigate stepwise uncertainty reduction approaches, as well as sampling based on posterior classification complexity. We also make connections between our continuous-input formulation and the discrete framework of pure regret in multi-armed bandits. To model the response surfaces we utilize kriging surrogates. Several numerical examples using both synthetic data and an epidemics control problem are provided to illustrate our approach and the efficacy of the respective adaptive designs.
    Comment: 26 pages, 7 figures (updated several sections and figures
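    The sketch below, a minimal stand-in rather than the paper's designs, shows the overall loop: fit a kriging (Gaussian process) surrogate per surface, pick the next (surface, location) pair by a simple uncertainty criterion, and read off the estimated argmin index on a grid. The test surfaces, acquisition rule, budget and scikit-learn usage are all assumptions of ours.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        # Two toy response surfaces on [0, 1], observed with Gaussian noise.
        rng = np.random.default_rng(0)
        surfaces = [lambda x: np.sin(3.0 * x), lambda x: 0.3 + 0.5 * np.cos(2.0 * x)]
        noise_sd = 0.1
        grid = np.linspace(0.0, 1.0, 101).reshape(-1, 1)

        # Small initial design for each surface.
        X = [rng.uniform(0.0, 1.0, 5).reshape(-1, 1) for _ in surfaces]
        Y = [f(x).ravel() + noise_sd * rng.normal(size=len(x)) for f, x in zip(surfaces, X)]

        def fit_surrogate(x, y):
            kernel = RBF(length_scale=0.2) + WhiteKernel(noise_level=noise_sd ** 2)
            return GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(x, y)

        for _ in range(30):
            gps = [fit_surrogate(x, y) for x, y in zip(X, Y)]
            preds = [gp.predict(grid, return_std=True) for gp in gps]
            # Simplified acquisition: sample the surface and location with the
            # largest posterior standard deviation (not the paper's criteria).
            idx = int(np.argmax([sd.max() for _, sd in preds]))
            x_new = grid[int(np.argmax(preds[idx][1]))].reshape(1, 1)
            y_new = surfaces[idx](x_new).ravel() + noise_sd * rng.normal(size=1)
            X[idx] = np.vstack([X[idx], x_new])
            Y[idx] = np.concatenate([Y[idx], y_new])

        # Estimated index of the minimal response at every grid point.
        gps = [fit_surrogate(x, y) for x, y in zip(X, Y)]
        means = np.column_stack([gp.predict(grid) for gp in gps])
        print(np.argmin(means, axis=1))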

    Bayesian functional linear regression with sparse step functions

    The functional linear regression model is a common tool to determine the relationship between a scalar outcome and a functional predictor viewed as a function of time. This paper focuses on the Bayesian estimation of the support of the coefficient function. To this aim, we propose a parsimonious and adaptive decomposition of the coefficient function as a step function, and a model including a prior distribution that we name Bayesian functional Linear regression with Sparse Step functions (Bliss). The aim of the method is to recover the periods of time which most influence the outcome. A Bayes estimator of the support is built with a specific loss function, along with two Bayes estimators of the coefficient function: a smooth one and a step-function one. The performance of the proposed methodology is analysed on various synthetic datasets and illustrated on a black Périgord truffle dataset to study the influence of rainfall on production.
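    In schematic form (our notation, not necessarily the paper's), the model combines the usual functional linear regression with a step-function coefficient supported on a few intervals:
        \[
        y_i \;=\; \mu + \int_0^T x_i(t)\,\beta(t)\,\mathrm{d}t + \varepsilon_i,
        \qquad
        \beta(t) \;=\; \sum_{k=1}^{K} b_k\,\mathbf{1}\{t \in I_k\},
        \]
    so the integral reduces to \(\sum_k b_k \int_{I_k} x_i(t)\,\mathrm{d}t\), and recovering the support amounts to identifying the few intervals $I_k$ with nonzero $b_k$.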

    Robust estimation of risks from small samples

    Data-driven risk analysis involves the inference of probability distributions from measured or simulated data. In the case of a highly reliable system, such as the electricity grid, the amount of relevant data is often exceedingly limited, but the impact of estimation errors may be very large. This paper presents a robust nonparametric Bayesian method to infer possible underlying distributions. The method obtains rigorous error bounds even for small samples taken from ill-behaved distributions. The approach taken has a natural interpretation in terms of the intervals between ordered observations, where allocation of probability mass across intervals is well-specified, but the location of that mass within each interval is unconstrained. This formulation gives rise to a straightforward computational resampling method: Bayesian Interval Sampling. In a comparison with common alternative approaches, it is shown to satisfy strict error bounds even for ill-behaved distributions.
    Comment: 13 pages, 3 figures; supplementary information provided. A revised version of this manuscript has been accepted for publication in Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Science
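    A minimal sketch of the interval idea is given below, under assumptions of ours (known support bounds, a Dirichlet(1, ..., 1) allocation of mass across the intervals, and the mean as the risk measure); it is meant to convey the mechanism, not to reproduce the published Bayesian Interval Sampling algorithm.
        import numpy as np

        rng = np.random.default_rng(0)
        data = np.sort(rng.lognormal(size=12))      # small, heavy-tailed sample
        lo, hi = 0.0, 100.0                         # assumed support bounds
        edges = np.concatenate([[lo], data, [hi]])  # interval endpoints

        def upper_mean_draws(n_draws=10_000):
            """Posterior draws of a worst-case (upper) mean.

            Mass across the len(data)+1 intervals is Dirichlet-distributed;
            its location within each interval is unconstrained, so the worst
            case for the mean puts all of it at the right endpoint.
            """
            weights = rng.dirichlet(np.ones(len(edges) - 1), size=n_draws)
            return weights @ edges[1:]

        draws = upper_mean_draws()
        print("95% credible upper bound on the mean:", np.quantile(draws, 0.95))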