2,027 research outputs found
Mouse obesity network reconstruction with a variational Bayes algorithm to employ aggressive false positive control
<p>Abstract</p> <p>Background</p> <p>We propose a novel variational Bayes network reconstruction algorithm to extract the most relevant disease factors from high-throughput genomic data-sets. Our algorithm is the only scalable method for regularized network recovery that employs Bayesian model averaging and that can internally estimate an appropriate level of sparsity to ensure few false positives enter the model without the need for cross-validation or a model selection criterion. We use our algorithm to characterize the effect of genetic markers and liver gene expression traits on mouse obesity related phenotypes, including weight, cholesterol, glucose, and free fatty acid levels, in an experiment previously used for discovery and validation of network connections: an F2 intercross between the C57BL/6 J and C3H/HeJ mouse strains, where apolipoprotein E is null on the background.</p> <p>Results</p> <p>We identified eleven genes, Gch1, Zfp69, Dlgap1, Gna14, Yy1, Gabarapl1, Folr2, Fdft1, Cnr2, Slc24a3, and Ccl19, and a quantitative trait locus directly connected to weight, glucose, cholesterol, or free fatty acid levels in our network. None of these genes were identified by other network analyses of this mouse intercross data-set, but all have been previously associated with obesity or related pathologies in independent studies. In addition, through both simulations and data analysis we demonstrate that our algorithm achieves superior performance in terms of power and type I error control than other network recovery algorithms that use the lasso and have bounds on type I error control.</p> <p>Conclusions</p> <p>Our final network contains 118 previously associated and novel genes affecting weight, cholesterol, glucose, and free fatty acid levels that are excellent obesity risk candidates.</p
Variational Downscaling, Fusion and Assimilation of Hydrometeorological States via Regularized Estimation
Improved estimation of hydrometeorological states from down-sampled
observations and background model forecasts in a noisy environment, has been a
subject of growing research in the past decades. Here, we introduce a unified
framework that ties together the problems of downscaling, data fusion and data
assimilation as ill-posed inverse problems. This framework seeks solutions
beyond the classic least squares estimation paradigms by imposing proper
regularization, which are constraints consistent with the degree of smoothness
and probabilistic structure of the underlying state. We review relevant
regularization methods in derivative space and extend classic formulations of
the aforementioned problems with particular emphasis on hydrologic and
atmospheric applications. Informed by the statistical characteristics of the
state variable of interest, the central results of the paper suggest that
proper regularization can lead to a more accurate and stable recovery of the
true state and hence more skillful forecasts. In particular, using the Tikhonov
and Huber regularization in the derivative space, the promise of the proposed
framework is demonstrated in static downscaling and fusion of synthetic
multi-sensor precipitation data, while a data assimilation numerical experiment
is presented using the heat equation in a variational setting
Recommended from our members
Generative Modeling and Inference in Directed and Undirected Neural Networks
Generative modeling and inference are two broad categories in unsupervised learning whose goal is to answer the following questions, respectively: 1. Given a dataset, how do we (either implicitly or explicitly) model the underlying probability distribution from which the data came and draw samples from that distribution? 2. How can we learn an underlying abstract representation of the data? In this dissertation we provide three studies that each in a different way improve upon specific generative modeling and inference techniques. First, we develop a state-of-the-art estimator of a generic probability distribution's partition function, or normalizing constant, during simulated tempering. We then apply our estimator to the specific case of training undirected probabilistic graphical models and find our method able to track log-likelihoods during training at essentially no extra computational cost. We then shift our focus to variational inference in directed probabilistic graphical models (Bayesian networks) for generative modeling and inference. First, we generalize the aggregate prior distribution to decouple the variational and generative models to provide the model with greater flexibility and find improvements in the model's log-likelihood of test data as well as a better latent representation. Finally, we study the variational loss function and argue under a typical architecture the data-dependent term of the gradient decays to zero as the latent space dimensionality increases. We use this result to propose a simple modification to random weight initialization and show in certain models the modification gives rise to substantial improvement in training convergence time. Together, these results improve quantitative performance of popular generative modeling and inference models in addition to furthering our understanding of them
- …