A factor model to analyze heterogeneity in gene expression
Background: Microarray technology allows the simultaneous analysis of thousands of genes within a single experiment. Significance analyses of transcriptomic data ignore the gene dependence structure. This leads to correlation among test statistics, which compromises strong control of the false discovery proportion. A recent method called FAMT captures the gene dependence in factors in order to improve high-dimensional multiple testing procedures. In the subsequent analyses aiming at a functional characterization of the differentially expressed genes, our study shows how these factors can be used both to identify the components of expression heterogeneity and to give more insight into the underlying biological processes.
Results: The use of factors to characterize simple patterns of heterogeneity is first demonstrated on illustrative gene expression data sets. An expression data set primarily generated to map QTL for fatness in chickens is then analyzed. Contrary to the analysis based on the raw data, relevant functional information about a QTL region is revealed by factor-adjustment of the gene expressions. Additionally, the interpretation of the independent factors regarding known information about both experimental design and genes shows that some factors may have different and complex origins.
Conclusions: As biological information and technological biases are identified in what was before simply considered as statistical noise, analyzing heterogeneity in gene expression yields a new point of view on transcriptomic data.
Between-groups within-gene heterogeneity of residual variances in microarray gene expression data
Background: The analysis of microarray gene expression data typically tries to identify differential gene expression patterns in terms of differences of the mathematical expectation between groups of arrays (e.g. treatments or biological conditions). Nevertheless, the differential expression pattern could also be characterized by group-specific dispersion patterns, although little is known about this phenomenon in microarray data. Commonly, a homogeneous gene-specific residual variance is assumed in hierarchical mixed models for gene expression data, although it could result in substantial biases if this assumption is not true.
Results: In this manuscript, a hierarchical mixed model with within-gene heterogeneous residual variances is proposed to analyze gene expression data from non-competitive hybridized microarrays. Moreover, a straightforward Bayes factor is adapted to easily check within-gene (between groups) heterogeneity of residual variances when samples are grouped in two different treatments. This Bayes factor only requires the analysis of the complex model (hierarchical mixed model with between-groups heterogeneous residual variances for all analyzed genes) and gene-specific Bayes factors are provided from the output of a simple Markov chain Monte Carlo sampling.
Conclusion: This statistical development opens new research possibilities within the gene expression framework, where heterogeneity in residual variability could be viewed as an alternative and plausible characterization of differential expression patterns.
A statistical framework for joint eQTL analysis in multiple tissues
Mapping expression Quantitative Trait Loci (eQTLs) represents a powerful and
widely-adopted approach to identifying putative regulatory variants and linking
them to specific genes. Up to now eQTL studies have been conducted in a
relatively narrow range of tissues or cell types. However, understanding the
biology of organismal phenotypes will involve understanding regulation in
multiple tissues, and ongoing studies are collecting eQTL data in dozens of
cell types. Here we present a statistical framework for powerfully detecting
eQTLs in multiple tissues or cell types (or, more generally, multiple
subgroups). The framework explicitly models the potential for each eQTL to be
active in some tissues and inactive in others. By modeling the sharing of
active eQTLs among tissues this framework increases power to detect eQTLs that
are present in more than one tissue compared with "tissue-by-tissue" analyses
that examine each tissue separately. Conversely, by modeling the inactivity of
eQTLs in some tissues, the framework allows the proportion of eQTLs shared
across different tissues to be formally estimated as parameters of a model,
addressing the difficulties of accounting for incomplete power when comparing
overlaps of eQTLs identified by tissue-by-tissue analyses. Applying our
framework to re-analyze data from transformed B cells, T cells and fibroblasts
we find that it substantially increases power compared with tissue-by-tissue
analysis, identifying 63% more genes with eQTLs (at FDR=0.05). Further, the
results suggest that, in contrast to previous analyses of the same data, the
majority of eQTLs detectable in these data are shared among all three tissues.
Comment: Submitted to PLoS Genetics
Simulating Brain Tumor Heterogeneity with a Multiscale Agent-Based Model: Linking Molecular Signatures, Phenotypes and Expansion Rate
We have extended our previously developed 3D multi-scale agent-based brain
tumor model to simulate cancer heterogeneity and to analyze its impact across
the scales of interest. While our algorithm continues to employ an epidermal
growth factor receptor (EGFR) gene-protein interaction network to determine the
cells' phenotype, it now adds an explicit treatment of tumor cell adhesion
related to the model's biochemical microenvironment. We simulate a simplified
tumor progression pathway that leads to the emergence of five distinct glioma
cell clones with different EGFR density and cell 'search precisions'. The in
silico results show that microscopic tumor heterogeneity can impact the tumor
system's multicellular growth patterns. Our findings further confirm that EGFR
density results in the more aggressive clonal populations switching earlier
from proliferation-dominated to a more migratory phenotype. Moreover, analyzing
the dynamic molecular profile that triggers the phenotypic switch between
proliferation and migration, our in silico oncogenomics data display spatial
and temporal diversity in documenting the regional impact of tumorigenesis, and
thus support the added value of multi-site and repeated assessments in vitro
and in vivo. Potential implications from this in silico work for experimental
and computational studies are discussed.
Comment: 37 pages, 10 figures
Gene expression in large pedigrees: analytic approaches.
Background: We currently have the ability to quantify transcript abundance of messenger RNA (mRNA), genome-wide, using microarray technologies. Analyzing genotype, phenotype and expression data from 20 pedigrees, the members of our Genetic Analysis Workshop (GAW) 19 gene expression group published 9 papers, tackling some timely and important problems and questions. To study the complexity and interrelationships of genetics and gene expression, we used established statistical tools, developed newer statistical tools, and developed and applied extensions to these tools.
Methods: To study gene expression correlations in the pedigree members (without incorporating genotype or trait data into the analysis), 2 papers used principal components analysis, weighted gene coexpression network analysis, meta-analyses, gene enrichment analyses, and linear mixed models. To explore the relationship between genetics and gene expression, 2 papers studied expression quantitative trait locus allelic heterogeneity through conditional association analyses, and epistasis through interaction analyses. A third paper assessed the feasibility of applying allele-specific binding to filter potential regulatory single-nucleotide polymorphisms (SNPs). Analytic approaches included linear mixed models based on measured genotypes in pedigrees, permutation tests, and covariance kernels. To incorporate both genotype and phenotype data with gene expression, 4 groups employed linear mixed models, nonparametric weighted U statistics, structural equation modeling, Bayesian unified frameworks, and multiple regression.
Results and discussion: Regarding the analysis of pedigree data, we found that gene expression is familial, indicating that at least 1 factor for pedigree membership or multiple factors for the degree of relationship should be included in analyses, and we developed a method to adjust for familiality prior to conducting weighted co-expression gene network analysis. For SNP association and conditional analyses, we found FaST-LMM (Factored Spectrally Transformed Linear Mixed Model) and SOLAR-MGA (Sequential Oligogenic Linkage Analysis Routines - Major Gene Analysis) have similar type 1 and type 2 errors and can be used almost interchangeably. To improve the power and precision of association tests, prior knowledge of DNase-I hypersensitivity sites or other relevant biological annotations can be incorporated into the analyses. On a biological level, eQTL (expression quantitative trait loci) are genetically complex, exhibiting both allelic heterogeneity and epistasis. Including both genotype and phenotype data together with measurements of gene expression was found to be generally advantageous in terms of generating improved levels of significance and in providing more interpretable biological models.
Conclusions: Pedigrees can be used to conduct analyses of and enhance gene expression studies.
Estimating sample-specific regulatory networks
Biological systems are driven by intricate interactions among the complex
array of molecules that comprise the cell. Many methods have been developed to
reconstruct network models of those interactions. These methods often draw on
large numbers of samples with measured gene expression profiles to infer
connections between genes (or gene products). The result is an aggregate
network model representing a single estimate for the likelihood of each
interaction, or "edge," in the network. While informative, aggregate models
fail to capture the heterogeneity that is represented in any population. Here
we propose a method to reverse engineer sample-specific networks from aggregate
network models. We demonstrate the accuracy and applicability of our approach
in several data sets, including simulated data, microarray expression data from
synchronized yeast cells, and RNA-seq data collected from human lymphoblastoid
cell lines. We show that these sample-specific networks can be used to study
changes in network topology across time and to characterize shifts in gene
regulation that may not be apparent in expression data. We believe the ability
to generate sample-specific networks will greatly facilitate the application of
network methods to the increasingly large, complex, and heterogeneous
multi-omic data sets that are currently being generated, and ultimately support
the emerging field of precision network medicine.
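
The abstract above does not state the estimator, so the following is only a minimal sketch of one way a sample-specific network could be derived from an aggregate one. It assumes (these are assumptions, not details from the abstract) that the aggregate network is a Pearson correlation matrix over all samples, and that the network for sample q is obtained by contrasting the aggregate network with the one rebuilt without q, scaled by the number of samples. The function names `pearson`, `aggregate_network`, and `sample_specific_network` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def aggregate_network(expr):
    """Assumed aggregate model: one correlation per gene pair.

    expr is a list of genes, each a list of per-sample expression values.
    Returns a dict mapping (i, j) with i < j to an edge weight.
    """
    genes = len(expr)
    return {
        (i, j): pearson(expr[i], expr[j])
        for i in range(genes)
        for j in range(i + 1, genes)
    }


def sample_specific_network(expr, q):
    """Assumed leave-one-out estimate of the network for sample index q.

    edge_q = n * (edge_all - edge_without_q) + edge_without_q
    """
    n = len(expr[0])
    full = aggregate_network(expr)
    reduced = [[v for s, v in enumerate(gene) if s != q] for gene in expr]
    without_q = aggregate_network(reduced)
    return {
        edge: n * (full[edge] - without_q[edge]) + without_q[edge]
        for edge in full
    }


# Toy data: 3 genes measured in 5 samples (illustrative values only).
toy_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 6.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
toy_networks = [sample_specific_network(toy_expr, q) for q in range(5)]
```

Each entry of `toy_networks` has the same edges as the aggregate network, so edge weights can be compared across samples. Unlike correlations, the per-sample weights are not confined to the range -1 to 1.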
Synchronization and entrainment of coupled circadian oscillators
Circadian rhythms in mammals are controlled by the neurons located in the
suprachiasmatic nucleus of the hypothalamus. In physiological conditions, the
system of neurons is very efficiently entrained by the 24-hour light-dark
cycle. Most of the studies carried out so far emphasize the crucial role of the
periodicity imposed by the light-dark cycle in neuronal synchronization.
Nevertheless, heterogeneity as a natural and permanent ingredient of these
cellular interactions seemingly plays a major role in these biochemical
processes. In this paper we use a model that considers the neurons of the
suprachiasmatic nucleus as chemically-coupled modified Goodwin oscillators, and
introduce non-negligible heterogeneity in the periods of all neurons in the
form of quenched noise. The system response to the light-dark cycle periodicity
is studied as a function of the interneuronal coupling strength, external
forcing amplitude and neuronal heterogeneity. Our results indicate that the
right amount of heterogeneity helps the extended system to respond globally in
a more coherent way to the external forcing. Our proposed mechanism for
neuronal synchronization under external periodic forcing is based on
heterogeneity-induced oscillator death, with damped oscillators being more
entrainable by the external forcing than the self-oscillating neurons with
different periods.
Comment: 17 pages, 7 figures