Post-transcriptional knowledge in pathway analysis increases the accuracy of phenotypes classification
Motivation: Prediction of phenotypes from high-dimensional data is a crucial
task in precision biology and medicine. Many technologies employ genomic
biomarkers to characterize phenotypes. However, such elements are not
sufficient to explain the underlying biology. To improve this, pathway analysis
techniques have been proposed. Nevertheless, such methods have shown a lack of
accuracy in phenotype classification. Results: Here we propose a novel
methodology called MITHrIL (Mirna enrIched paTHway Impact anaLysis) for the
analysis of signaling pathways, which builds on the work of Tarca et
al., 2009. MITHrIL extends pathways by adding missing regulatory elements, such
as microRNAs, and their interactions with genes. The method takes as input the
expression values of genes and/or microRNAs and returns a list of pathways
sorted according to their deregulation degree, together with the corresponding
statistical significance (p-values). Our analysis shows that MITHrIL
outperforms its competitors even in the worst case. In addition, our method is
able to correctly classify sets of tumor samples drawn from TCGA. Availability:
MITHrIL is freely available at the following URL:
http://alpha.dmi.unict.it/mithril
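The input/output contract described in this abstract (expression values in, pathways ranked by deregulation with p-values out) can be illustrated with a minimal sketch. This is not MITHrIL's algorithm: the score below is a plain mean absolute log fold change, the p-value comes from a simple gene-sampling permutation test, and every name and number (`rank_pathways`, the toy pathways, the toy fold changes) is hypothetical.

```python
import random

def pathway_score(log_fc, members):
    """Mean absolute log fold change over the members that have data."""
    values = [abs(log_fc[g]) for g in members if g in log_fc]
    return sum(values) / len(values) if values else 0.0

def rank_pathways(log_fc, pathways, n_perm=1000, seed=0):
    """Return (name, score, p_value) tuples sorted by decreasing score.

    The p-value is the fraction of random gene sets of the same size
    whose score is at least the observed one (with +1 smoothing).
    """
    rng = random.Random(seed)
    genes = sorted(log_fc)
    results = []
    for name, members in pathways.items():
        present = [g for g in members if g in log_fc]
        observed = pathway_score(log_fc, present)
        hits = 0
        for _ in range(n_perm):
            sample = rng.sample(genes, len(present))
            if pathway_score(log_fc, sample) >= observed:
                hits += 1
        results.append((name, observed, (hits + 1) / (n_perm + 1)))
    return sorted(results, key=lambda r: r[1], reverse=True)

# Toy input: log fold changes for genes and one microRNA (made-up values).
log_fc = {"TP53": -2.1, "MYC": 1.8, "miR-21": 2.5,
          "GAPDH": 0.1, "ACTB": -0.05, "EGFR": 0.3}
pathways = {
    "toy_signaling": ["TP53", "MYC", "miR-21"],
    "toy_housekeeping": ["GAPDH", "ACTB"],
}

ranked = rank_pathways(log_fc, pathways, n_perm=200)
for name, score, p_value in ranked:
    print(f"{name}\tscore={score:.3f}\tp={p_value:.3f}")
```

The microRNA is treated here as just another pathway member, which is a simplification of the enrichment step the abstract describes.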
How to understand the cell by breaking it: network analysis of gene perturbation screens
Modern high-throughput gene perturbation screens are key technologies at the
forefront of genetic research. Combined with rich phenotypic descriptors they
enable researchers to observe detailed cellular reactions to experimental
perturbations on a genome-wide scale. This review surveys the current
state-of-the-art in analyzing perturbation screens from a network point of
view. We describe approaches to make the step from the parts list to the wiring
diagram by using phenotypes for network inference and integrating them with
complementary data sources. The first part of the review describes methods to
analyze one- or low-dimensional phenotypes like viability or reporter activity;
the second part concentrates on high-dimensional phenotypes showing global
changes in cell morphology, transcriptome or proteome.
Comment: Review based on ISMB 2009 tutorial; after two rounds of revisio
Bayesian Model Selection in Complex Linear Systems, as Illustrated in Genetic Association Studies
Motivated by examples from genetic association studies, this paper considers
the model selection problem in a general complex linear model system and in a
Bayesian framework. We discuss formulating model selection problems and
incorporating context-dependent a priori information through different
levels of prior specifications. We also derive analytic Bayes factors and their
approximations to facilitate model selection and discuss their theoretical and
computational properties. We demonstrate our Bayesian approach based on an
implemented Markov Chain Monte Carlo (MCMC) algorithm in simulations and a real
data application of mapping tissue-specific eQTLs. Our novel results on Bayes
factors provide a general framework to perform efficient model comparisons in
complex linear model systems.
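For reference, the standard definition of a Bayes factor between two candidate models, the quantity this abstract refers to, can be written as follows. The notation ($M_0$, $M_1$, $y$, $\theta_k$) is generic and is not taken from the paper, whose analytic expressions are not reproduced here.

```latex
\[
\mathrm{BF}_{10} = \frac{p(y \mid M_1)}{p(y \mid M_0)},
\qquad
p(y \mid M_k) = \int p(y \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, d\theta_k ,
\]
\[
\frac{P(M_1 \mid y)}{P(M_0 \mid y)}
  = \mathrm{BF}_{10} \times \frac{P(M_1)}{P(M_0)} .
\]
```

The second line shows where prior information enters: the Bayes factor converts prior model odds into posterior model odds.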
Unsupervised empirical Bayesian multiple testing with external covariates
In an empirical Bayesian setting, we provide a new multiple testing method,
useful when an additional covariate is available, that influences the
probability of each null hypothesis being true. We measure the posterior
significance of each test conditionally on the covariate and the data, leading
to greater power. Using covariate-based prior information in an unsupervised
fashion, we produce a list of significant hypotheses which differs in length
and order from the list obtained by methods not taking covariate-information
into account. Covariate-modulated posterior probabilities of each null
hypothesis are estimated using a fast approximate algorithm. The new method is
applied to expression quantitative trait loci (eQTL) data.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS158 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
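The notion of a covariate-modulated posterior null probability in the abstract above can be sketched with a generic two-group mixture in which the prior probability of the null depends on the covariate. This is an illustration under assumed, fixed components (standard normal null, a unit-variance normal alternative, a logistic prior in the covariate). The paper estimates these quantities from the data with its own approximate algorithm, which is not reproduced here, and all function names and parameter values below are hypothetical.

```python
import math

def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))

def prior_null(x, a=2.0, b=-3.0):
    """Assumed prior P(null | covariate x): logistic, decreasing in x for b < 0."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def posterior_null(z, x, alt_mean=3.0):
    """P(null | z, x) = pi0(x) f0(z) / (pi0(x) f0(z) + (1 - pi0(x)) f1(z))."""
    pi0 = prior_null(x)
    f0 = normal_pdf(z)                  # null density, assumed N(0, 1)
    f1 = normal_pdf(z, mean=alt_mean)   # alternative density, assumed N(alt_mean, 1)
    return pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)

# Same test statistic, two covariate values: the covariate shifts the posterior.
for x in (0.0, 1.0):
    print(f"x={x:.1f}  P(null | z=2.0, x) = {posterior_null(2.0, x):.3f}")
```

Ranking hypotheses by this posterior rather than by the test statistic alone is what allows the resulting list to differ in both length and order from a covariate-free analysis.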
An integrative, multi-scale, genome-wide model reveals the phenotypic landscape of Escherichia coli.
Given the vast behavioral repertoire and biological complexity of even the simplest organisms, accurately predicting phenotypes in novel environments and unveiling their biological organization is a challenging endeavor. Here, we present an integrative modeling methodology that unifies under a common framework the various biological processes and their interactions across multiple layers. We trained this methodology on an extensive normalized compendium for the gram-negative bacterium Escherichia coli, which incorporates gene expression data for genetic and environmental perturbations, transcriptional regulation, signal transduction, and metabolic pathways, as well as growth measurements. Comparison with measured growth and high-throughput data demonstrates the enhanced ability of the integrative model to predict phenotypic outcomes in various environmental and genetic conditions, even in cases where their underlying functions are under-represented in the training set. This work paves the way toward integrative techniques that extract knowledge from a variety of biological data to achieve more than the sum of their parts in the context of prediction, analysis, and redesign of biological systems.
Automated design of bacterial genome sequences
Background:
Organisms have evolved ways of regulating transcription to better adapt to varying environments. Could the current functional genomics data and models support the possibility of engineering a genome with completely rearranged gene organization while the cell maintains its behavior under environmental challenges? How would we proceed to design a full nucleotide sequence for such genomes?
Results:
As a first step towards answering such questions, recent work showed that it is possible to design alternative transcriptomic models that show the same behavior under environmental variations as the wild-type model. A second step requires evidence that a nucleotide sequence can be provided for a genome encoding such a transcriptional model. We used computational design techniques to design a rewired global transcriptional regulation of Escherichia coli that still shows a transcriptomic response similar to that of the wild type. Afterwards, we “compiled” the transcriptional networks into nucleotide sequences to obtain the final genome sequence. Our computational evolution procedure ensures that we can maintain the genotype-phenotype mapping during the rewiring of the regulatory network. We found that it is theoretically possible to reorganize the E. coli genome into 86% fewer regulated operons. Such refactored genomes consist of operons that contain sets of genes sharing around 60% of their biological functions and, if evolved under highly variable environmental conditions, have regulatory networks that respond more than 20% faster to multiple external perturbations.
Conclusions:
This work provides the first algorithm for producing a genome sequence encoding a rewired transcriptional regulation with wild-type behavior under alternative environments.