1,041 research outputs found

    Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures

    In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not computationally feasible. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as in a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.

    Encrypted statistical machine learning: new privacy preserving methods

    We present two new statistical machine learning methods designed to learn on fully homomorphic encrypted (FHE) data. The introduction of FHE schemes following Gentry (2009) opens up the prospect of privacy preserving statistical machine learning analysis and modelling of encrypted data without compromising security constraints. We propose tailored algorithms for applying extremely random forests, involving a new cryptographic stochastic fraction estimator, and naïve Bayes, involving a semi-parametric model for the class decision boundary, and show how they can be used to learn and predict from encrypted data. We demonstrate that these techniques perform competitively on a variety of classification data sets and provide detailed information about the computational practicalities of these and other FHE methods. Comment: 39 pages

    Two-sample Bayesian Nonparametric Hypothesis Testing

    In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $\mathbf{y}^{(1)} \stackrel{iid}{\sim} F^{(1)}$ and $\mathbf{y}^{(2)} \stackrel{iid}{\sim} F^{(2)}$, with $F^{(1)}, F^{(2)}$ unknown, we wish to evaluate the evidence for the null hypothesis $H_0: F^{(1)} \equiv F^{(2)}$ versus the alternative $H_1: F^{(1)} \neq F^{(2)}$. Our method is based upon a nonparametric Pólya tree prior centered either subjectively or using an empirical procedure. We show that the Pólya tree prior leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null $\mathrm{Pr}(H_0 \mid \{\mathbf{y}^{(1)}, \mathbf{y}^{(2)}\})$. Comment: Published at http://dx.doi.org/10.1214/14-BA914 in the Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society of Bayesian Analysis (http://bayesian.org/)
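    To make the idea of an analytic marginal likelihood concrete, the following is a minimal sketch of a Pólya tree two-sample test, not the authors' implementation. It assumes the data have already been mapped to [0, 1) through the CDF of the centering distribution, that the tree is truncated at a finite depth, and that the Beta parameters at level k are c·k² (a common default, assumed here). The function names and the defaults `depth=8` and `c=1.0` are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bin_counts(sample, level):
    """Count points in each of the 2**level dyadic subintervals of [0, 1)."""
    n_bins = 2 ** level
    counts = [0] * n_bins
    for u in sample:
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    return counts


def log_bayes_factor(u1, u2, depth=8, c=1.0):
    """Log Bayes factor of H0 (shared tree) against H1 (separate trees)."""
    log_bf = 0.0
    for level in range(1, depth + 1):
        a = c * level ** 2  # assumed Beta(a, a) prior at this level
        n1 = bin_counts(u1, level)
        n2 = bin_counts(u2, level)
        # Bins (j, j + 1) are the left and right children of one parent node.
        for j in range(0, 2 ** level, 2):
            l1, r1 = n1[j], n1[j + 1]
            l2, r2 = n2[j], n2[j + 1]
            log_bf += (log_beta(a + l1 + l2, a + r1 + r2) + log_beta(a, a)
                       - log_beta(a + l1, a + r1) - log_beta(a + l2, a + r2))
    return log_bf


def prob_null(u1, u2, prior_null=0.5, depth=8, c=1.0):
    """Posterior probability of H0 given both samples."""
    log_odds = (log_bayes_factor(u1, u2, depth, c)
                + math.log(prior_null / (1.0 - prior_null)))
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)  # avoids overflow for very negative log odds
    return e / (1.0 + e)
```

    Nodes that contain observations from only one sample contribute nothing to the Bayes factor, so in this sketch the evidence comes entirely from regions where both samples can be compared.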

    Population-Based Reversible Jump Markov Chain Monte Carlo

    In this paper we present an extension of population-based Markov chain Monte Carlo (MCMC) to the trans-dimensional case. One of the main challenges in MCMC-based inference is that of simulating from high-dimensional and trans-dimensional target measures. In such cases, MCMC methods may not adequately traverse the support of the target, and the simulation results will then be unreliable. We develop population methods to deal with such problems, and give a result proving the uniform ergodicity of these population algorithms under mild assumptions. This result is used to demonstrate the superiority, in terms of convergence rate, of a population transition kernel over a reversible jump sampler for a Bayesian variable selection problem. We also give an example of a population algorithm for a Bayesian multivariate mixture model with an unknown number of components. This is applied to gene expression data of 1000 data points in six dimensions, and it is demonstrated that our algorithm outperforms some competing Markov chain samplers.

    Power-efficiency enhanced thermally tunable Bragg grating for silica-on-silicon photonics

    A thermally tunable Bragg grating device has been fabricated in a silica-on-silicon integrated optical chip, incorporating a suspended microbeam that improves power efficiency. A waveguide and Bragg grating are defined through the middle of the microbeam via direct ultraviolet writing. A tuning range of 0.4 nm (50 GHz) is demonstrated at the telecommunication wavelength of 1550 nm. The power efficiency of wavelength tuning is measured at 45 pm/mW, which is a factor of 90 better than reported values for similar bulk thermally tuned silica-on-silicon planar devices. The response time to a step change in heating is longer by a similar factor, as expected for a highly power-efficient device. With this response, the device would be suitable for trimming applications and wherever low modulation frequencies are acceptable. The fabrication procedure involves a deep micromilling process, as well as wet etching and metal deposition. A four-point-probe-based temperature measurement was also carried out to ascertain the temperature reached during tuning; it found an average volume temperature of 48 °C, corresponding to 0.4 nm of tuning. The role of stress-induced buckling in device fabrication is also discussed.
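    The figures quoted in this abstract can be cross-checked with a short calculation. The sketch below converts the 0.4 nm tuning range at 1550 nm into a frequency shift and estimates the heater power implied by the 45 pm/mW efficiency. The power estimate assumes the tuning is linear in heater power over the whole range, which the abstract does not state, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def frequency_shift_hz(d_lambda_m, lambda_m):
    """Magnitude of the frequency shift for a small wavelength shift."""
    return C * d_lambda_m / lambda_m ** 2


def heater_power_mw(tuning_pm, efficiency_pm_per_mw):
    """Heater power for a given tuning, assuming a linear response."""
    return tuning_pm / efficiency_pm_per_mw


df = frequency_shift_hz(0.4e-9, 1550e-9)  # 0.4 nm at 1550 nm
power = heater_power_mw(400.0, 45.0)      # 0.4 nm = 400 pm at 45 pm/mW
print(f"{df / 1e9:.1f} GHz, {power:.1f} mW")
```

    Under these assumptions the shift is about 49.9 GHz, consistent with the quoted 50 GHz, and the full tuning range corresponds to roughly 8.9 mW of heater power.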