3,482 research outputs found

    Effect fusion using model-based clustering

    Get PDF
    In social and economic studies many of the collected variables are measured on a nominal scale, often with a large number of categories. The definition of categories is usually not unambiguous and different classification schemes using either a finer or a coarser grid are possible. Categorisation has an impact when such a variable is included as covariate in a regression model: a too fine grid will result in imprecise estimates of the corresponding effects, whereas with a too coarse grid important effects will be missed, resulting in biased effect estimates and poor predictive performance. To achieve automatic grouping of levels with essentially the same effect, we adopt a Bayesian approach and specify the prior on the level effects as a location mixture of spiky normal components. Fusion of level effects is induced by a prior on the mixture weights which encourages empty components. Model-based clustering of the effects during MCMC sampling allows to simultaneously detect categories which have essentially the same effect size and identify variables with no effect at all. The properties of this approach are investigated in simulation studies. Finally, the method is applied to analyse effects of high-dimensional categorical predictors on income in Austria

    Anomaly Detection and Removal Using Non-Stationary Gaussian Processes

    Full text link
    This paper proposes a novel Gaussian process approach to fault removal in time-series data. Fault removal does not delete the faulty signal data but, instead, massages the fault from the data. We assume that only one fault occurs at any one time and model the signal by two separate non-parametric Gaussian process models for both the physical phenomenon and the fault. In order to facilitate fault removal we introduce the Markov Region Link kernel for handling non-stationary Gaussian processes. This kernel is piece-wise stationary but guarantees that functions generated by it and their derivatives (when required) are everywhere continuous. We apply this kernel to the removal of drift and bias errors in faulty sensor data and also to the recovery of EOG artifact corrupted EEG signals.Comment: 9 pages, 14 figure

    Non-convex regularization in remote sensing

    Get PDF
    In this paper, we study the effect of different regularizers and their implications in high dimensional image classification and sparse linear unmixing. Although kernelization or sparse methods are globally accepted solutions for processing data in high dimensions, we present here a study on the impact of the form of regularization used and its parametrization. We consider regularization via traditional squared (2) and sparsity-promoting (1) norms, as well as more unconventional nonconvex regularizers (p and Log Sum Penalty). We compare their properties and advantages on several classification and linear unmixing tasks and provide advices on the choice of the best regularizer for the problem at hand. Finally, we also provide a fully functional toolbox for the community.Comment: 11 pages, 11 figure

    Enhancing models and measurements of traffic-related air pollutants for health studies using dispersion modeling and Bayesian data fusion

    Get PDF
    Research Report 202 describes a study led by Dr. Stuart Batterman at the University of Michigan, Ann Arbor and colleagues. The investigators evaluated the ability to predict traffic-related air pollution using a variety of methods and models, including a line source air pollution dispersion model and sophisticated spatiotemporal Bayesian data fusion methods. Exposure assessment for traffic-related air pollution is challenging because the pollutants are a complex mixture and vary greatly over space and time. Because extensive direct monitoring is difficult and expensive, a number of modeling approaches have been developed, but each model has its own limitations and errors. Dr. Batterman and colleagues sought to improve model estimations by applying and systematically comparing the performance of different statistical models. The study made extensive use of data collected in the Near-road EXposures and effects of Urban air pollutants Study (NEXUS), a cohort study designed to examine the relationship between near-roadway pollutant exposures and respiratory outcomes in children with asthma who live close to major roadways in Detroit, Michigan
    • …
    corecore