    Relabeling and Summarizing Posterior Distributions in Signal Decomposition Problems when the Number of Components is Unknown

    This paper addresses the problems of relabeling and summarizing posterior distributions that typically arise, in a Bayesian framework, when dealing with signal decomposition problems with an unknown number of components. Such posterior distributions are defined over a union of subspaces of differing dimensionality and can be sampled from using modern Monte Carlo techniques, for instance the increasingly popular reversible jump MCMC (RJ-MCMC) method. No generic approach is available, however, to summarize the resulting variable-dimensional samples and extract component-specific parameters from them. We propose a novel approach to this problem, named Variable-dimensional Approximate Posterior for Relabeling and Summarizing (VAPoRS), which consists in approximating the posterior distribution of interest by a "simple" (but still variable-dimensional) parametric distribution. The distance between the two distributions is measured using the Kullback-Leibler divergence, and a Stochastic EM-type algorithm, driven by the RJ-MCMC sampler, is proposed to estimate the parameters. Two signal decomposition problems are considered to show the capability of VAPoRS both for relabeling and for summarizing variable-dimensional posterior distributions: the classical problem of detecting and estimating sinusoids in white Gaussian noise on the one hand, and a particle counting problem motivated by the Pierre Auger project in astrophysics on the other hand.
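
    The fitting criterion described in this abstract can be written out as a short sketch. The symbols are assumptions introduced here for illustration: \pi denotes the posterior, q_\eta the parametric approximation with parameter \eta, and the integral runs over the whole variable-dimensional space. The direction of the divergence (posterior first) is also an assumption, since the abstract does not state it.

```latex
\[
  \hat{\eta} \in \arg\min_{\eta}\; D_{\mathrm{KL}}\!\left(\pi \,\|\, q_{\eta}\right),
  \qquad
  D_{\mathrm{KL}}\!\left(\pi \,\|\, q_{\eta}\right)
  = \int \pi(x)\,\log\frac{\pi(x)}{q_{\eta}(x)}\,\mathrm{d}x
  = \mathrm{const} - \mathbb{E}_{\pi}\!\left[\log q_{\eta}(x)\right].
\]
```

    Under this assumed direction, the term that does not depend on \eta can be dropped, so minimizing the divergence amounts to maximizing the expected log-density of q_\eta under the posterior. That expectation can be approximated by an average over the samples produced by the RJ-MCMC sampler.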

    Summarizing Posterior Distributions in Signal Decomposition Problems when the Number of Components is Unknown

    This paper addresses the problem of summarizing the posterior distributions that typically arise, in a Bayesian framework, when dealing with signal decomposition problems with an unknown number of components. Such posterior distributions are defined over a union of subspaces of differing dimensionality and can be sampled from using modern Monte Carlo techniques, for instance the increasingly popular reversible jump MCMC (RJ-MCMC) method. No generic approach is available, however, to summarize the resulting variable-dimensional samples and extract component-specific parameters from them. We propose a novel approach to this problem, which consists in approximating the complex posterior of interest by a "simple" (but still variable-dimensional) parametric distribution. The distance between the two distributions is measured using the Kullback-Leibler divergence, and a Stochastic EM-type algorithm, driven by the RJ-MCMC sampler, is proposed to estimate the parameters. The proposed algorithm is illustrated on the fundamental signal processing example of joint detection and estimation of sinusoids in white Gaussian noise.

    Adaptive MCMC with online relabeling

    When targeting a distribution that is artificially invariant under some permutations, Markov chain Monte Carlo (MCMC) algorithms face the label-switching problem, rendering marginal inference particularly cumbersome. Such a situation arises, for example, in the Bayesian analysis of finite mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM), which self-calibrates its proposal distribution using an online estimate of the covariance matrix of the target, are no exception. To address the label-switching issue, relabeling algorithms associate a permutation with each MCMC sample, trying to obtain reasonable marginals. In the case of adaptive Metropolis (Bernoulli 7 (2001) 223-242), an online relabeling strategy is required. This paper is devoted to the AMOR algorithm, a provably consistent variant of AM that can cope with the label-switching problem. The idea is to nest relabeling steps within the MCMC algorithm based on the estimation of a single covariance matrix that is used both for adapting the covariance of the proposal distribution in the Metropolis algorithm step and for online relabeling. We compare the behavior of AMOR to similar relabeling methods. In the case of compactly supported target distributions, we prove a strong law of large numbers for AMOR and its ergodicity. To our knowledge, these are the first results on the consistency of an online relabeling algorithm. The proof underlines latent relations between relabeling and vector quantization. Comment: Published at http://dx.doi.org/10.3150/13-BEJ578 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
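
    The relabeling step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a quadratic (Mahalanobis-type) criterion: among all permutations of a sample, keep the one closest to a running mean in the metric induced by a running covariance estimate. The abstract does not spell out the criterion, so this is an assumption, and the names `relabel`, `mean` and `cov` are hypothetical. Each label carries a single scalar parameter here, and the exhaustive search is only practical for a handful of components.

```python
import itertools

import numpy as np


def relabel(sample, mean, cov):
    """Return the permutation of `sample` closest to `mean`.

    Closeness is measured by the quadratic form induced by the inverse
    of `cov` (an assumed criterion, not taken from the abstract).
    """
    precision = np.linalg.inv(cov)
    best, best_cost = None, np.inf
    # Brute-force search over all label permutations.
    for perm in itertools.permutations(range(len(sample))):
        candidate = sample[list(perm)]
        diff = candidate - mean
        cost = diff @ precision @ diff
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best


# A sample whose two labels have switched relative to the running mean.
switched = np.array([5.2, -0.1])
running_mean = np.array([0.0, 5.0])
running_cov = np.eye(2)
fixed = relabel(switched, running_mean, running_cov)
```

    In an online scheme, the relabeled sample would then be used to update the running mean and covariance, so the same covariance estimate serves both the proposal adaptation and the relabeling, as the abstract describes.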

    A bayesian approach to model-based clustering for panel probit models

    Consideration of latent heterogeneity is of special importance in nonlinear models for correctly gauging the effect of explanatory variables on the dependent variable. This paper adopts the stratified model-based clustering approach for modeling latent heterogeneity in panel probit models. Within a Bayesian framework, an estimation algorithm dealing with the inherent label switching problem is provided. Determination of the number of clusters is based on the marginal likelihood and out-of-sample criteria. The ability to decide on the correct number of clusters is assessed within a simulation study, indicating high accuracy for both approaches. Different concepts of marginal effects, incorporating latent heterogeneity to different degrees, arise within the considered model setup and are directly at hand within Bayesian estimation via MCMC methodology. An empirical illustration of the developed methodology indicates that consideration of latent heterogeneity via latent clusters provides the preferred model specification compared to a pooled and a random coefficient specification. Keywords: Bayesian Estimation, MCMC Methods, Panel Probit Model, Mixture Modelling

    The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions

    Discovering interaction effects on a response of interest is a fundamental problem faced in biology, medicine, economics, and many other scientific disciplines. In theory, Bayesian methods for discovering pairwise interactions enjoy many benefits such as coherent uncertainty quantification, the ability to incorporate background knowledge, and desirable shrinkage properties. In practice, however, Bayesian methods are often computationally intractable for even moderate-dimensional problems. Our key insight is that many hierarchical models of practical interest admit a particular Gaussian process (GP) representation; the GP allows us to capture the posterior with a vector of O(p) kernel hyper-parameters rather than O(p^2) interactions and main effects. With the implicit representation, we can run Markov chain Monte Carlo (MCMC) over model hyper-parameters in time and memory linear in p per iteration. We focus on sparsity-inducing models and show on datasets with a variety of covariate behaviors that our method: (1) reduces runtime by orders of magnitude over naive applications of MCMC, (2) provides lower Type I and Type II error relative to state-of-the-art LASSO-based approaches, and (3) offers improved computational scaling in high dimensions relative to existing Bayesian and LASSO-based approaches. Comment: Accepted at ICML 2019. 20 pages, 4 figures, 3 tables
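
    The following sketch shows why a kernel can stand in for explicit interaction terms. It uses the generic degree-2 polynomial kernel identity and is not the specific kernel constructed in the paper. The function names `explicit_features` and `poly2_kernel` are hypothetical. The explicit feature map has O(p^2) entries, while the kernel value is obtained from a single O(p) inner product.

```python
import itertools

import numpy as np


def explicit_features(x):
    """Bias, main effects, squares and all pairwise interactions of x."""
    p = len(x)
    feats = [1.0]
    feats += list(np.sqrt(2.0) * x)   # main effects
    feats += list(x ** 2)             # squared terms
    feats += [np.sqrt(2.0) * x[i] * x[j]
              for i, j in itertools.combinations(range(p), 2)]  # interactions
    return np.array(feats)


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel, evaluated in O(p) time."""
    return (1.0 + x @ z) ** 2


rng = np.random.default_rng(0)
p = 5
x, z = rng.normal(size=p), rng.normal(size=p)

explicit_value = explicit_features(x) @ explicit_features(z)
implicit_value = poly2_kernel(x, z)
```

    Both values agree up to floating-point error, even though the explicit route builds 1 + 2p + p(p-1)/2 features and the kernel route never materializes them.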

    A Bayesian Approach for Model-Based Clustering of Several Binary Dissimilarity Matrices: The dmbc Package in R

    We introduce the new package dmbc that implements a Bayesian algorithm for clustering a set of binary dissimilarity matrices within a model-based framework. Specifically, we consider the case when S matrices are available, each describing the dissimilarities among the same n objects, possibly expressed by S subjects (judges), or measured under different experimental conditions, or with reference to different characteristics of the objects themselves. In particular, we focus on binary dissimilarities, taking values 0 or 1 depending on whether or not two objects are deemed dissimilar. We are interested in analyzing such data using multidimensional scaling (MDS). Unlike standard MDS algorithms, our goal is to cluster the dissimilarity matrices and, simultaneously, to extract an MDS configuration specific to each cluster. To this end, we develop a fully Bayesian three-way MDS approach, where the elements of each dissimilarity matrix are modeled as a mixture of Bernoulli random vectors. The parameter estimates and the MDS configurations are derived using a hybrid Metropolis-Gibbs Markov Chain Monte Carlo algorithm. We also propose a BIC-like criterion for jointly selecting the optimal number of clusters and latent space dimensions. We illustrate our approach using both synthetic data and a publicly available data set taken from the literature. For the sake of efficiency, the core computations in the package are implemented in C/C++. The package also allows the simulation of multiple chains through the support of the parallel package.

    EEG and MEG data analysis in SPM8.

    SPM is free and open-source software written in MATLAB (The MathWorks, Inc.). In addition to standard M/EEG preprocessing, we presently offer three main analysis tools: (i) statistical analysis of scalp-maps, time-frequency images, and volumetric 3D source reconstruction images based on the general linear model, with correction for multiple comparisons using random field theory; (ii) Bayesian M/EEG source reconstruction, including support for group studies, simultaneous EEG and MEG, and fMRI priors; (iii) dynamic causal modelling (DCM), an approach combining neural modelling with data analysis for which there are several variants dealing with evoked responses, steady state responses (power spectra and cross-spectra), induced responses, and phase coupling. SPM8 is integrated with the FieldTrip toolbox, making it possible for users to combine a variety of standard analysis methods with new schemes implemented in SPM and to build custom analysis tools using powerful graphical user interface (GUI) and batching tools.

    Adaptive algorithms for real-world transactional data mining.

    The accurate identification of the right customer to target with the right product at the right time, through the right channel, to satisfy the customer’s evolving needs, is a key performance driver and enhancer for businesses. Data mining is an analytic process designed to explore usually large amounts of data (typically business or market related) in search of consistent patterns and/or systematic relationships between variables for the purpose of generating explanatory/predictive data models from the detected patterns. It provides an effective and established mechanism for accurate identification and classification of customers. Data models derived from the data mining process can aid in effectively recognizing the status and preference of customers, individually and as a group. Such data models can be incorporated into the business market segmentation, customer targeting and channelling decisions with the goal of maximizing the total customer lifetime profit. However, due to costs, privacy and/or data protection reasons, the customer data available for data mining is often restricted to verified and validated data (in most cases, only the business-owned transactional data is available). Transactional data is a valuable resource for generating such data models. Transactional data can be electronically collected and readily made available for data mining in large quantity at minimum extra cost. Transactional data is, however, inherently sparse and skewed. These inherent characteristics of transactional data give rise to the poor performance of data models built using customer data based on transactional data. Data models for identifying, describing, and classifying customers, constructed using evolving transactional data, thus need to effectively handle the inherent sparseness and skewness of evolving transactional data in order to be efficient and accurate.
Using real-world transactional data, this thesis presents the findings and results from the investigation of data mining algorithms for analysing, describing, identifying and classifying customers with evolving needs. In particular, methods for handling the issues of scalability, uncertainty and adaptation whilst mining evolving transactional data are analysed and presented. A novel application of a new framework for integrating transactional data binning and classification techniques is presented alongside an effective prototype selection algorithm for efficient transactional data model building. A new change mining architecture for monitoring, detecting and visualizing the change in customer behaviour using transactional data is proposed and discussed as an effective means for analysing and understanding the change in customer buying behaviour over time. Finally, the challenging problem of discerning between the change in the customer profile (which may necessitate the effective change of the customer’s label) and the change in performance of the model(s) (which may necessitate changing or adapting the model(s)) is introduced and discussed by way of a novel flexible and efficient architecture for classifier model adaptation and customer profiles class relabeling

    Essays in Statistics

    This thesis is comprised of several contributions to the field of mathematical statistics, particularly with regard to computational issues of Bayesian statistics and functional data analysis. The first two chapters are concerned with computational Bayesian approaches that allow one to generate samples from an approximation to the posterior distribution in settings where the likelihood function of some statistical model of interest is unknown. This has led to a class of Approximate Bayesian Computation (ABC) methods whose performance depends on the ability to effectively summarize the information content of the data sample by a lower-dimensional vector of summary statistics. Ideally, these statistics are sufficient for the parameter of interest. However, it is difficult to establish sufficiency in a straightforward way if the likelihood of the model is unavailable. In Chapter 1 we propose an indirect approach to select sufficient summary statistics for ABC methods that borrows its intuition from the indirect estimation literature in econometrics. More precisely, we introduce an auxiliary statistical model that is large enough to contain the structural model of interest. Summary statistics are then identified in this auxiliary model and mapped to the structural model of interest. We show sufficiency of these statistics for Indirect ABC methods based on parameter estimates (ABC-IP), likelihood functions (ABC-IL) and scores (ABC-IS) of the auxiliary model. A detailed simulation study investigates the performance of each proposal and compares it to a traditional, moment-based ABC approach. In particular, the ABC-IL and ABC-IS algorithms are shown to perform better than both standard ABC and the ABC-IP methods. In Chapter 2 we extend the notion of Indirect ABC methods by proposing an efficient way of weighting the individual entries of the vector of summary statistics obtained from the score-based Indirect ABC approach (ABC-IS).
In particular, the weighting matrix is given by the inverse of the asymptotic covariance matrix of the score vector of the auxiliary model and allows us to appropriately assess the distance between the true posterior distribution and the approximation based on the ABC-IS method. We illustrate the performance gain in a simulation study. An empirical application then implements the weighted ABC-IS method to the problem of estimating a continuous-time stochastic volatility model based on non-Gaussian Ornstein-Uhlenbeck processes. We show how a suitable auxiliary model can be constructed and confirm estimation results from concurring Bayesian estimation approaches suggested in the literature. In Chapter 3 we consider the problem of sampling from high-dimensional probability distributions that exhibit multiple, well-separated modes. Such distributions arise frequently, for instance, in the Bayesian estimation of macroeconomic DSGE models. Standard Markov Chain Monte Carlo (MCMC) methods, such as the Metropolis-Hastings algorithm, are prone to get trapped in local neighborhoods of the target distribution thus severely limiting the use of these methods in more complex models. We suggest the use of a Sequential Markov Chain Monte Carlo approach to overcome these difficulties and investigate its finite sample properties. The results show that Sequential MCMC methods clearly outperform standard MCMC approaches in a multimodal setting and can recover both the location as well as the mixture weights in a 12-dimensional mixture model. Moreover, we provide a detailed comparison of the effects different choices of tuning parameters have on the approximation to the true sampling distribution. These results can serve as valuable guidelines when applying this method to more complex economic models, such as the (Bayesian) estimation of Dynamic Stochastic General Equilibrium models. Chapters 4 and 5 study the statistical problem of prediction from a functional perspective. 
In many statistical applications, data is becoming available at ever increasing frequencies and it has thus become natural to think of discrete observations as realizations of a continuous function, say over the course of one day. However, as functions are generally speaking infinite-dimensional objects, the statistical analysis of such functional data is intrinsically different from standard multivariate techniques. In Chapter 4 we consider prediction in functional additive models of first-order autoregressive type for a time series of functional observations. This is a generalization of functional linear models that are commonly considered in the literature and has two advantages to be applied in a functional time series setting. First, it allows us to introduce a very general notion of time dependencies for functional data in this modeling framework. Particularly, it is rooted at the correlation structure of functional principal component scores and even allows for long memory behavior in the score series across the time dimension. Second, prediction in this modeling framework is straightforwardly implemented as it only concerns conditional means of scalar random variables and we suggest a k-nearest neighbors classification scheme. The theoretical contributions of this paper are twofold. In a first step, we verify the applicability of the functional principal components analysis under our notion of time dependence and obtain precise rates of convergence for the mean function and the covariance operator associated with the observed sample of functions. In a second step, we derive precise rates of convergence of the mean squared error for the proposed predictor, taking into account both the effect of truncating the infinite series expansion at some finite integer L as well as the effect of estimating the covariance operator and associated eigenelements based on a sample of N curves. 
In Chapter 5 we investigate the performance of functional models in a forecasting study of ground-level ozone-concentration surfaces over the geographical domain of Germany. Our perspective thus differs from the literature on spatially distributed functional processes (which are considered to be (univariate) functions of time that show spatial dependence) in that we consider smooth surfaces defined over some spatial domain that are sampled consecutively over time. In particular, we treat discrete observations that are sampled both over a spatial domain and over time as noisy realizations of some time series of smooth bivariate functions. In a first step we therefore discuss how smooth functions can be reconstructed from such noisy measurements through a finite element spline smoother that is defined over some triangulation of the spatial domain. In a second step we consider two forecasting approaches to functional time series. The first one is a functional linear model of first-order auto-regressive type, whereas the second considers the non-parametric extension to functional additive models discussed in Chapter 4. Both approaches are applied to predicting ground-level ozone concentration measured over the spatial domain of Germany and are shown to yield similar predictions

    A Surrogate Model of Gravitational Waveforms from Numerical Relativity Simulations of Precessing Binary Black Hole Mergers

    We present the first surrogate model for gravitational waveforms from the coalescence of precessing binary black holes. We call this surrogate model NRSur4d2s. Our methodology significantly extends recently introduced reduced-order and surrogate modeling techniques, and is capable of directly modeling numerical relativity waveforms without introducing phenomenological assumptions or approximations to general relativity. Motivated by GW150914, LIGO's first detection of gravitational waves from merging black holes, the model is built from a set of $276$ numerical relativity (NR) simulations with mass ratios $q \leq 2$, dimensionless spin magnitudes up to $0.8$, and the restriction that the initial spin of the smaller black hole lies along the axis of orbital angular momentum. It produces waveforms which begin $\sim 30$ gravitational wave cycles before merger and continue through ringdown, and which contain the effects of precession as well as all $\ell \in \{2, 3\}$ spin-weighted spherical-harmonic modes. We perform cross-validation studies to compare the model to NR waveforms not used to build the model, and find a better agreement within the parameter range of the model than other, state-of-the-art precessing waveform models, with typical mismatches of $10^{-3}$. We also construct a frequency domain surrogate model (called NRSur4d2s_FDROM) which can be evaluated in $50\,\mathrm{ms}$ and is suitable for performing parameter estimation studies on gravitational wave detections similar to GW150914. Comment: 34 pages, 26 figures