10 research outputs found

    Residual Component Analysis

    Get PDF
    Probabilistic principal component analysis (PPCA) seeks a low dimensional representation of a data set in the presence of independent spherical Gaussian noise, Sigma = (sigma^2)*I. The maximum likelihood solution for the model is an eigenvalue problem on the sample covariance matrix. In this paper we consider the situation where the data variance is already partially explained by other factors, e.g. covariates of interest, or temporal correlations leaving some residual variance. We decompose the residual variance into its components through a generalized eigenvalue problem, which we call residual component analysis (RCA). We show that canonical covariates analysis (CCA) is a special case of our algorithm and explore a range of new algorithms that arise from the framework. We illustrate the ideas on a gene expression time series data set and the recovery of human pose from silhouette

    Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties

    Get PDF
    Machine learning techniques have been successfully applied to super-resolution tasks on natural images where visually pleasing results are sufficient. However in many scientific domains this is not adequate and estimations of errors and uncertainties are crucial. To address this issue we propose a Bayesian framework that decomposes uncertainties into epistemic and aleatoric uncertainties. We test the validity of our approach by super-resolving images of the Sun's magnetic field and by generating maps measuring the range of possible high resolution explanations compatible with a given low resolution magnetogram

    Single-Frame Super-Resolution of Solar Magnetograms: Investigating Physics-Based Metrics \& Losses

    Get PDF
    Breakthroughs in our understanding of physical phenomena have traditionally followed improvements in instrumentation. Studies of the magnetic field of the Sun, and its influence on the solar dynamo and space weather events, have benefited from improvements in resolution and measurement frequency of new instruments. However, in order to fully understand the solar cycle, high-quality data across time-scales longer than the typical lifespan of a solar instrument are required. At the moment, discrepancies between measurement surveys prevent the combined use of all available data. In this work, we show that machine learning can help bridge the gap between measurement surveys by learning to \textbf{super-resolve} low-resolution magnetic field images and \textbf{translate} between characteristics of contemporary instruments in orbit. We also introduce the notion of physics-based metrics and losses for super-resolution to preserve underlying physics and constrain the solution space of possible super-resolution outputs

    A simple approach to ranking differentially expressed gene expression time courses through Gaussian process regression.

    Get PDF
    BACKGROUND: The analysis of gene expression from time series underpins many biological studies. Two basic forms of analysis recur for data of this type: removing inactive (quiet) genes from the study and determining which genes are differentially expressed. Often these analysis stages are applied disregarding the fact that the data is drawn from a time series. In this paper we propose a simple model for accounting for the underlying temporal nature of the data based on a Gaussian process. RESULTS: We review Gaussian process (GP) regression for estimating the continuous trajectories underlying in gene expression time-series. We present a simple approach which can be used to filter quiet genes, or for the case of time series in the form of expression ratios, quantify differential expression. We assess via ROC curves the rankings produced by our regression framework and compare them to a recently proposed hierarchical Bayesian model for the analysis of gene expression time-series (BATS). We compare on both simulated and experimental data showing that the proposed approach considerably outperforms the current state of the art. CONCLUSIONS: Gaussian processes offer an attractive trade-off between efficiency and usability for the analysis of microarray time series. The Gaussian process framework offers a natural way of handling biological replicates and missing values and provides confidence intervals along the estimated curves of gene expression. Therefore, we believe Gaussian processes should be a standard tool in the analysis of gene expression time series

    gprege Quick Guide

    No full text
    The gprege package implements our methodology of Gaussian process regression models for the analysis of microarray time series. The package can be used to filter quiet genes and quantify/rank differential expression in time-series expression ratios. The purpose of this quick guide is to present a few examples of using gprege. 2 Citing gprege Citing gprege in publications will involve citing the methodology paper [Kalaitzis and Lawrence, 2011] that the software is based on as well as citing the software package itself. 3 Introductory example analysis- TP63 microarray data In this section we introduce the main functions of the gprege package by repeating some of the analysis from the BMC Bioinformatics paper [Kalaitzis and Lawrence, 2011]. 3.1 Installing the gprege package The recommended way to install gprege is to use the biocLite function available from the bioconductor website. Installing in this way should ensure that all appropriate dependencies are met
    corecore