894 research outputs found

    Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures

    In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not computationally feasible. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as in a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.

    Encrypted statistical machine learning: new privacy preserving methods

    We present two new statistical machine learning methods designed to learn on fully homomorphic encrypted (FHE) data. The introduction of FHE schemes following Gentry (2009) opens up the prospect of privacy preserving statistical machine learning analysis and modelling of encrypted data without compromising security constraints. We propose tailored algorithms for applying extremely random forests, involving a new cryptographic stochastic fraction estimator, and naïve Bayes, involving a semi-parametric model for the class decision boundary, and show how they can be used to learn and predict from encrypted data. We demonstrate that these techniques perform competitively on a variety of classification data sets and provide detailed information about the computational practicalities of these and other FHE methods. Comment: 39 pages

    Population-Based Reversible Jump Markov Chain Monte Carlo

    In this paper we present an extension of population-based Markov chain Monte Carlo (MCMC) to the trans-dimensional case. One of the main challenges in MCMC-based inference is that of simulating from high- and trans-dimensional target measures. In such cases, MCMC methods may not adequately traverse the support of the target, and the simulation results will be unreliable. We develop population methods to deal with such problems, and give a result proving the uniform ergodicity of these population algorithms under mild assumptions. This result is used to demonstrate the superiority, in terms of convergence rate, of a population transition kernel over a reversible jump sampler for a Bayesian variable selection problem. We also give an example of a population algorithm for a Bayesian multivariate mixture model with an unknown number of components. This is applied to gene expression data of 1000 data points in six dimensions, and it is demonstrated that our algorithm outperforms some competing Markov chain samplers.

    Power-efficiency enhanced thermally tunable Bragg grating for silica-on-silicon photonics

    A thermally tunable Bragg grating device has been fabricated in a silica-on-silicon integrated optical chip, incorporating a suspended microbeam that improves power efficiency. A waveguide and Bragg grating are defined through the middle of the microbeam via direct ultraviolet writing. A tuning range of 0.4 nm (50 GHz) is demonstrated at the telecommunication wavelength of 1550 nm. Power consumption during wavelength tuning is measured at 45 pm/mW, which is a factor of 90 better than reported values for similar bulk thermally tuned silica-on-silicon planar devices. The response time to a step change in heating is longer by a similar factor, as expected for a highly power-efficient device. The fabrication procedure involves a deep micromilling process, as well as wet etching and metal deposition. With this response, the device would be suitable for trimming applications and wherever low modulation frequencies are acceptable. A four-point-probe-based temperature measurement was also carried out to ascertain the temperature reached during tuning; it found an average volume temperature of 48 °C, corresponding to 0.4 nm of tuning. The role of stress-induced buckling in device fabrication is included.
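The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is not taken from the paper: it assumes the standard small-shift relation Δν ≈ c·Δλ/λ² to convert the 0.4 nm tuning range at 1550 nm into a frequency range, and it assumes that the quoted 45 pm/mW holds linearly over the whole range to estimate the heater power needed. The function names are hypothetical.

```python
# Rough consistency check of the quoted tuning figures (illustrative only).

C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_shift_to_frequency(delta_lambda_m: float, center_lambda_m: float) -> float:
    """Approximate frequency shift (Hz) for a small wavelength shift.

    Uses |delta_nu| ~= c * delta_lambda / lambda^2, valid when the shift is
    much smaller than the centre wavelength.
    """
    return C * delta_lambda_m / center_lambda_m ** 2


def heater_power_for_shift(delta_lambda_pm: float, efficiency_pm_per_mw: float) -> float:
    """Heater power (mW) for a given shift, assuming a linear pm/mW response."""
    return delta_lambda_pm / efficiency_pm_per_mw


# Values quoted in the abstract: 0.4 nm range at 1550 nm, 45 pm/mW.
delta_nu_ghz = wavelength_shift_to_frequency(0.4e-9, 1550e-9) / 1e9
power_mw = heater_power_for_shift(400.0, 45.0)

print(f"Frequency range: {delta_nu_ghz:.1f} GHz")
print(f"Estimated heater power for full range: {power_mw:.1f} mW")
```

Under these assumptions the frequency range comes out just under 50 GHz, in line with the abstract, and the full 0.4 nm range corresponds to roughly 9 mW of heating.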

    Serum Fatty Acid Binding Protein 4 (FABP4) Predicts Pre-eclampsia in Women with Type 1 Diabetes

    OBJECTIVE To examine the association between fatty acid binding protein 4 (FABP4) and pre-eclampsia risk in women with type 1 diabetes. RESEARCH DESIGN AND METHODS Serum FABP4 was measured in 710 women from the Diabetes and Pre-eclampsia Intervention Trial (DAPIT) in early pregnancy and in the second trimester (median 14 and 26 weeks’ gestation, respectively). RESULTS FABP4 was significantly elevated in early pregnancy (geometric mean 15.8 ng/mL [interquartile range 11.6–21.4] vs. 12.7 ng/mL [interquartile range 9.6–17]; P < 0.001) and the second trimester (18.8 ng/mL [interquartile range 13.6–25.8] vs. 14.6 ng/mL [interquartile range 10.8–19.7]; P < 0.001) in women in whom pre-eclampsia later developed. Elevated second-trimester FABP4 level was independently associated with pre-eclampsia (odds ratio 2.87 [95% CI 1.24–6.68], P = 0.03). The addition of FABP4 to established risk factors significantly improved net reclassification improvement at both time points and integrated discrimination improvement in the second trimester. CONCLUSIONS Increased second-trimester FABP4 independently predicted pre-eclampsia and significantly improved reclassification and discrimination. FABP4 shows potential as a novel biomarker for pre-eclampsia prediction in women with type 1 diabetes.

    Variance decomposition of protein profiles from antibody arrays using a longitudinal twin model

    BACKGROUND The advent of affinity-based proteomics technologies for global protein profiling provides the prospect of finding new molecular biomarkers for common, multifactorial disorders. The molecular phenotypes obtained from studies on such platforms are driven by multiple sources, including genetic, environmental, and experimental components. In characterizing the contribution of different sources of variation to the measured phenotypes, the aim is to facilitate the design and interpretation of future biomedical studies employing exploratory and multiplexed technologies. Thus, biometrical genetic modelling of twin or other family data can be used to decompose the variation underlying a phenotype into biological and experimental components. RESULTS Using antibody suspension bead arrays and antibodies from the Human Protein Atlas, we study unfractionated serum from a longitudinal study on 154 twins. In this study, we provide a detailed description of how the variation in a molecular phenotype, in terms of protein profile, can be decomposed into familial (i.e. genetic and common environmental), individual environmental, short-term biological, and experimental components. The results show that across 69 antibodies analyzed in the study, the median proportion of the total variation explained by familial sources is 12% (IQR 1-22%), and the median proportion of the total variation attributable to experimental sources is 63% (IQR 53-72%). CONCLUSION The variability analysis of antibody arrays highlights the importance of considering variability components and their relative contributions when designing and evaluating studies for biomarker discovery with exploratory, high-throughput and multiplexed methods.