83 research outputs found
Somatic sex-specific transcriptome differences in Drosophila revealed by whole transcriptome sequencing
<p>Abstract</p> <p>Background</p> <p>Understanding animal development and physiology at a molecular-biological level has been advanced by the ability to determine at high resolution the repertoire of mRNA molecules by whole transcriptome resequencing. This includes the ability to detect and quantify rare abundance transcripts and isoform-specific mRNA variants produced from a gene.</p> <p>The sex hierarchy consists of a pre-mRNA splicing cascade that directs the production of sex-specific transcription factors that specify nearly all sexual dimorphism. We have used deep RNA sequencing to gain insight into how the Drosophila sex hierarchy generates somatic sex differences, by examining gene and transcript isoform expression differences between the sexes in adult head tissues.</p> <p>Results</p> <p>Here we find 1,381 genes that differ in overall expression levels and 1,370 isoform-specific transcripts that differ between males and females. Additionally, we find 512 genes not regulated downstream of <it>transformer </it>that are significantly more highly expressed in males than females. These 512 genes are enriched on the × chromosome and reside adjacent to dosage compensation complex entry sites, which taken together suggests that their residence on the × chromosome might be sufficient to confer male-biased expression. There are no transcription unit structural features, from a set of features, that are robustly significantly different in the genes with significant sex differences in the ratio of isoform-specific transcripts, as compared to random isoform-specific transcripts, suggesting that there is no single molecular mechanism that generates isoform-specific transcript differences between the sexes, even though the sex hierarchy is known to include three pre-mRNA splicing factors.</p> <p>Conclusions</p> <p>We identify thousands of genes that show sex-specific differences in overall gene expression levels, and identify hundreds of additional genes that have differences in the abundance of isoform-specific transcripts. No transcription unit structural feature was robustly enriched in the sex-differentially expressed transcript isoforms. Additionally, we found that many genes with male-biased expression were enriched on the × chromosome and reside adjacent to dosage compensation entry sites, suggesting that differences in sex chromosome composition contributes to dimorphism in gene expression. Taken together, this study provides new insight into the molecular underpinnings of sexual differentiation.</p
Detection of Perturbation Phases and Developmental Stages in Organisms from DNA Microarray Time Series Data
Available DNA microarray time series that record gene expression along the developmental stages of multicellular eukaryotes, or in unicellular organisms subject to external perturbations such as stress and diauxie, are analyzed. By pairwise comparison of the gene expression profiles on the basis of a translation-invariant and scale-invariant distance measure corresponding to least-rectangle regression, it is shown that peaks in the average distance values are noticeable and are localized around specific time points. These points systematically coincide with the transition points between developmental phases or just follow the external perturbations. This approach can thus be used to identify automatically, from microarray time series alone, the presence of external perturbations or the succession of developmental stages in arbitrary cell systems. Moreover, our results show that there is a striking similarity between the gene expression responses to these a priori very different phenomena. In contrast, the cell cycle does not involve a perturbation-like phase, but rather continuous gene expression remodeling. Similar analyses were conducted using three other standard distance measures, showing that the one we introduced was superior. Based on these findings, we set up an adapted clustering method that uses this distance measure and classifies the genes on the basis of their expression profiles within each developmental stage or between perturbation phases
Constructing non-stationary Dynamic Bayesian Networks with a flexible lag choosing mechanism
<p>Abstract</p> <p>Background</p> <p>Dynamic Bayesian Networks (DBNs) are widely used in regulatory network structure inference with gene expression data. Current methods assumed that the underlying stochastic processes that generate the gene expression data are stationary. The assumption is not realistic in certain applications where the intrinsic regulatory networks are subject to changes for adapting to internal or external stimuli.</p> <p>Results</p> <p>In this paper we investigate a novel non-stationary DBNs method with a potential regulator detection technique and a flexible lag choosing mechanism. We apply the approach for the gene regulatory network inference on three non-stationary time series data. For the Macrophages and Arabidopsis data sets with the reference networks, our method shows better network structure prediction accuracy. For the Drosophila data set, our approach converges faster and shows a better prediction accuracy on transition times. In addition, our reconstructed regulatory networks on the Drosophila data not only share a lot of similarities with the predictions of the work of other researchers but also provide many new structural information for further investigation.</p> <p>Conclusions</p> <p>Compared with recent proposed non-stationary DBNs methods, our approach has better structure prediction accuracy By detecting potential regulators, our method reduces the size of the search space, hence may speed up the convergence of MCMC sampling.</p
Mining biological information from 3D short time-series gene expression data: the OPTricluster algorithm
<p>Abstract</p> <p>Background</p> <p>Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space.</p> <p>Results</p> <p>We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (<it>Plasmodium chabaudi</it>), systemic acquired resistance in <it>Arabidopsis thaliana</it>, similarities and differences between inner and outer cotyledon in <it>Brassica napus </it>during seed development, and to <it>Brassica napus </it>whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples.</p> <p>Conclusions</p> <p>Our analysis showed that OPTricluster generally outperforms other well known clustering algorithms such as the TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data.</p
DroID: the Drosophila Interactions Database, a comprehensive resource for annotated gene and protein interactions
<p>Abstract</p> <p>Background</p> <p>Charting the interactions among genes and among their protein products is essential for understanding biological systems. A flood of interaction data is emerging from high throughput technologies, computational approaches, and literature mining methods. Quick and efficient access to this data has become a critical issue for biologists. Several excellent multi-organism databases for gene and protein interactions are available, yet most of these have understandable difficulty maintaining comprehensive information for any one organism. No single database, for example, includes all available interactions, integrated gene expression data, and comprehensive and searchable gene information for the important model organism, <it>Drosophila melanogaster</it>.</p> <p>Description</p> <p>DroID, the <it>Drosophila </it>Interactions Database, is a comprehensive interactions database designed specifically for <it>Drosophila</it>. DroID houses published physical protein interactions, genetic interactions, and computationally predicted interactions, including interologs based on data for other model organisms and humans. All interactions are annotated with original experimental data and source information. DroID can be searched and filtered based on interaction information or a comprehensive set of gene attributes from Flybase. DroID also contains gene expression and expression correlation data that can be searched and used to filter datasets, for example, to focus a study on sub-networks of co-expressed genes. To address the inherent noise in interaction data, DroID employs an updatable confidence scoring system that assigns a score to each physical interaction based on the likelihood that it represents a biologically significant link.</p> <p>Conclusion</p> <p>DroID is the most comprehensive interactions database available for <it>Drosophila</it>. To facilitate downstream analyses, interactions are annotated with original experimental information, gene expression data, and confidence scores. All data in DroID are freely available and can be searched, explored, and downloaded through three different interfaces, including a text based web site, a Java applet with dynamic graphing capabilities (IM Browser), and a Cytoscape plug-in. DroID is available at <url>http://www.droidb.org</url>.</p
Automatic Annotation of Spatial Expression Patterns via Sparse Bayesian Factor Models
Advances in reporters for gene expression have made it possible to document and quantify expression patterns in 2D–4D. In contrast to microarrays, which provide data for many genes but averaged and/or at low resolution, images reveal the high spatial dynamics of gene expression. Developing computational methods to compare, annotate, and model gene expression based on images is imperative, considering that available data are rapidly increasing. We have developed a sparse Bayesian factor analysis model in which the observed expression diversity of among a large set of high-dimensional images is modeled by a small number of hidden common factors. We apply this approach on embryonic expression patterns from a Drosophila RNA in situ image database, and show that the automatically inferred factors provide for a meaningful decomposition and represent common co-regulation or biological functions. The low-dimensional set of factor mixing weights is further used as features by a classifier to annotate expression patterns with functional categories. On human-curated annotations, our sparse approach reaches similar or better classification of expression patterns at different developmental stages, when compared to other automatic image annotation methods using thousands of hard-to-interpret features. Our study therefore outlines a general framework for large microscopy data sets, in which both the generative model itself, as well as its application for analysis tasks such as automated annotation, can provide insight into biological questions
Gene expression throughout a vertebrate's embryogenesis
Abstract Background Describing the patterns of gene expression during embryonic development has broadened our understanding of the processes and patterns that define morphogenesis. Yet gene expression patterns have not been described throughout vertebrate embryogenesis. This study presents statistical analyses of gene expression during all 40 developmental stages in the teleost Fundulus heteroclitus using four biological replicates per stage. Results Patterns of gene expression for 7,000 genes appear to be important as they recapitulate developmental timing. Among the 45% of genes with significant expression differences between pairs of temporally adjacent stages, significant differences in gene expression vary from as few as five to more than 660. Five adjacent stages have disproportionately more significant changes in gene expression (> 200 genes) relative to other stages: four to eight and eight to sixteen cell stages, onset of circulation, pre and post-hatch, and during complete yolk absorption. The fewest differences among adjacent stages occur during gastrulation. Yet, at stage 16, (pre-mid-gastrulation) the largest number of genes has peak expression. This stage has an over representation of genes in oxidative respiration and protein expression (ribosomes, translational genes and proteases). Unexpectedly, among all ribosomal genes, both strong positive and negative correlations occur. Similar correlated patterns of expression occur among all significant genes. Conclusions These data provide statistical support for the temporal dynamics of developmental gene expression during all stages of vertebrate development
Semi-supervised learning for the identification of syn-expressed genes from fused microarray and in situ image data
Background:
Gene expression measurements during the development of the fly Drosophila melanogaster are routinely used to find functional modules of temporally co-expressed genes. Complimentary large data sets of in situ RNA hybridization images for different stages of the fly embryo elucidate the spatial expression patterns.
Results:
Using a semi-supervised approach, constrained clustering with mixture models, we can find clusters of genes exhibiting spatio-temporal similarities in expression, or syn-expression. The temporal gene expression measurements are taken as primary data for which pairwise constraints are computed in an automated fashion from raw in situ images without the need for manual annotation. We investigate the influence of these pairwise constraints in the clustering and discuss the biological relevance of our results.
Conclusion:
Spatial information contributes to a detailed, biological meaningful analysis of temporal gene expression data. Semi-supervised learning provides a flexible, robust and efficient framework for integrating data sources of differing quality and abundance
Global Gene Expression Analysis of Murine Limb Development
Detailed information about stage-specific changes in gene expression is crucial for understanding the gene regulatory networks underlying development and the various signal transduction pathways contributing to morphogenesis. Here we describe the global gene expression dynamics during early murine limb development, when cartilage, tendons, muscle, joints, vasculature and nerves are specified and the musculoskeletal system of limbs is established. We used whole-genome microarrays to identify genes with differential expression at 5 stages of limb development (E9.5 to 13.5), during fore- and hind-limb patterning. We found that the onset of limb formation is characterized by an up-regulation of transcription factors, which is followed by a massive activation of genes during E10.5 and E11.5 which levels off at later time points. Among the 3520 genes identified as significantly up-regulated in the limb, we find ∼30% to be novel, dramatically expanding the repertoire of candidate genes likely to function in the limb. Hierarchical and stage-specific clustering identified expression profiles that are likely to correlate with functional programs during limb development and further characterization of these transcripts will provide new insights into specific tissue patterning processes. Here, we provide for the first time a comprehensive analysis of developmentally regulated genes during murine limb development, and provide some novel insights into the expression dynamics governing limb morphogenesis
Molecular evidence for increased regulatory conservation during metamorphosis, and against deleterious cascading effects of hybrid breakdown in Drosophila
<p>Abstract</p> <p>Background</p> <p>Speculation regarding the importance of changes in gene regulation in determining major phylogenetic patterns continues to accrue, despite a lack of broad-scale comparative studies examining how patterns of gene expression vary during development. Comparative transcriptional profiling of adult interspecific hybrids and their parental species has uncovered widespread divergence of the mechanisms controlling gene regulation, revealing incompatibilities that are masked in comparisons between the pure species. However, this has prompted the suggestion that misexpression in adult hybrids results from the downstream cascading effects of a subset of genes improperly regulated in early development.</p> <p>Results</p> <p>We sought to determine how gene expression diverges over development, as well as test the cascade hypothesis, by profiling expression in males of <it>Drosophila melanogaster</it>, <it>D. sechellia</it>, and <it>D. simulans</it>, as well as the <it>D. simulans </it>(♀) × <it>D. sechellia </it>(♂) male F1 hybrids, at four different developmental time points (3rd instar larval, early pupal, late pupal, and newly-emerged adult). Contrary to the cascade model of misexpression, we find that there is considerable stage-specific autonomy of regulatory breakdown in hybrids, with the larval and adult stages showing significantly more hybrid misexpression as compared to the pupal stage. However, comparisons between pure species indicate that genes expressed during earlier stages of development tend to be more conserved in terms of their level of expression than those expressed during later stages, suggesting that while Von Baer's famous law applies at both the level of nucleotide sequence and expression, it may not apply necessarily to the underlying overall regulatory network, which appears to diverge over the course of ontogeny and which can only be ascertained by combining divergent genomes in species hybrids.</p> <p>Conclusion</p> <p>Our results suggest that complex integration of regulatory circuits during morphogenesis may lead to it being more refractory to divergence of underlying gene regulatory mechanisms - more than that suggested by the conservation of gene expression levels between species during earlier stages. This provides support for a 'developmental hourglass' model of divergence of gene expression in <it>Drosophila </it>resulting in a highly conserved pupal stage.</p
- …