11,602 research outputs found
Automated data integration for developmental biological research
In an era exploding with genome-scale data, a major challenge for developmental biologists is how to extract significant clues from these publicly available data to benefit our studies of individual genes, and how to use them to improve our understanding of development at a systems level. Several studies have successfully demonstrated new approaches to classic developmental questions by computationally integrating various genome-wide data sets. Such computational approaches have shown great potential for facilitating research: instead of testing 20,000 genes, researchers might test 200 to the same effect. We discuss the nature and state of this art as it applies to developmental research.
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences; 2) the high-throughput reconstruction of molecular phylogenies; 3) the representation of biological processes, particularly metabolic pathways; 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments; and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal.
Bridging topological and functional information in protein interaction networks by short loops profiling
Protein-protein interaction networks (PPINs) have been employed to identify potential novel interconnections between proteins as well as crucial cellular functions. In this study we identify fundamental principles of PPIN topologies by analysing network motifs of short loops, which are small cyclic interactions of between 3 and 6 proteins. We compared 30 PPINs with corresponding randomised null models and examined the occurrence of common biological functions in loops extracted from a cross-validated high-confidence dataset of 622 human protein complexes. We demonstrate that loops are an intrinsic feature of PPINs and that specific cell functions are predominantly performed by loops of different lengths. Topologically, we find that loops are strongly related to the accuracy of PPINs and define a core of interactions with high resilience. The identification of this core and the analysis of loop composition are promising tools to assess PPIN quality and to uncover possible biases from experimental detection methods. More than 96% of loops share at least one biological function, with enrichment of cellular functions related to mRNA metabolic processing and the cell cycle. Our analyses suggest that these motifs can be used in the design of targeted experiments for functional phenotype detection. This research was supported by the Biotechnology and Biological Sciences Research Council (BB/H018409/1 to AP, ACCC and FF, and BB/J016284/1 to NSBT) and by the Leukaemia & Lymphoma Research (to NSBT and FF). SSC is funded by a Leukaemia & Lymphoma Research Gordon Piller PhD Studentship.
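As an illustration of what a "short loop" means here, the following is a minimal sketch that enumerates simple cycles of 3 to 6 proteins in an undirected interaction graph. It is not the authors' pipeline: the function name `short_loops`, the edge-list input format, and the toy protein labels are all assumptions made for this example, and no randomised null model is included.

```python
from collections import defaultdict


def short_loops(edges, min_len=3, max_len=6):
    """Enumerate simple cycles with min_len..max_len nodes in an undirected graph.

    `edges` is an iterable of (protein_a, protein_b) pairs; labels must be
    mutually comparable (e.g. all strings). Hypothetical helper, for illustration.
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)

    loops = []

    def extend(path, seen):
        start, last = path[0], path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= min_len:
                # Every cycle is reached in both directions; keep one orientation.
                if path[1] < path[-1]:
                    loops.append(tuple(path))
            elif nxt not in seen and nxt > start and len(path) < max_len:
                # Only visit nodes larger than the start so that each cycle
                # is rooted at its smallest node and reported once.
                extend(path + [nxt], seen | {nxt})

    for node in sorted(adj):
        extend([node], {node})
    return loops


# Toy network: a square A-B-C-D with one diagonal A-C.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]
toy_loops = short_loops(toy_edges)
```

On the toy network this finds two loops of length 3 and one of length 4. Exhaustive enumeration like this grows quickly with network density, so a genome-scale PPIN would likely need a more efficient approach.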
The DAXX co-repressor is directly recruited to active regulatory elements genome-wide to regulate autophagy programs in a model of human prostate cancer.
While carcinoma of the prostate is the second most common cause of cancer death in the US, current methods and markers used to predict prostate cancer (PCa) outcome are inadequate. This study was aimed at understanding the genome-wide binding and regulatory role of the DAXX transcriptional repressor, recently implicated in PCa. ChIP-Seq analysis of genome-wide distribution of DAXX in PC3 cells revealed over 59,000 DAXX binding sites, found at regulatory enhancers and promoters. ChIP-Seq analysis of DNA methyltransferase 1 (DNMT1), which is a key epigenetic partner for DAXX repression, revealed that DNMT1 binding was restricted to a small number of DAXX sites. DNMT1 and DAXX bound close to transcriptional activator motifs. DNMT1 sites were found to be dependent on DAXX for recruitment by analyzing DNMT1 ChIP-Seq following DAXX knockdown (K/D), corroborating previous findings that DAXX recruits DNMT1 to repress its target genes. Massively parallel RNA sequencing (RNA-Seq) was used to compare the transcriptomes of WT and DAXX K/D PC3 cells. Genes induced by DAXX K/D included those involved in autophagy, and DAXX ChIP-Seq peaks were found close to the transcription start sites (TSS) of autophagy genes, implying they are more likely to be regulated by DAXX. In conclusion, DAXX binds active regulatory elements and co-localizes with DNMT1 in the prostate cancer genome. Given DAXX's putative regulatory role in autophagy, future studies may consider DAXX as a candidate marker and therapeutic target for prostate cancer.
Crosstalk and the Dynamical Modularity of Feed-Forward Loops in Transcriptional Regulatory Networks
Network motifs, such as the feed-forward loop (FFL), introduce a range of complex behaviors to transcriptional regulatory networks, yet such properties are typically determined from their isolated study. We characterize the effects of crosstalk on FFL dynamics by modeling the cross regulation between two different FFLs and evaluate the extent to which these patterns occur in vivo. Analytical modeling suggests that crosstalk should overwhelmingly affect individual protein-expression dynamics. Counter to this expectation we find that entire FFLs are more likely than expected to resist the effects of crosstalk (approximately 20% for one crosstalk interaction) and remain dynamically modular. The likelihood that cross-linked FFLs are dynamically correlated increases monotonically with additional crosstalk, but is independent of the specific regulation type or connectivity of the interactions. Just one additional regulatory interaction is sufficient to drive the FFL dynamics to a statistically different state. Despite the potential for modularity between sparsely connected network motifs, Escherichia coli (E. coli) appears to favor crosstalk wherein at least one of the cross-linked FFLs remains modular. A gene ontology analysis reveals that stress response processes are significantly overrepresented in the cross-linked motifs found within E. coli. Although the daunting complexity of biological networks affects the dynamical properties of individual network motifs, some resist and remain modular, seemingly insulated from extrinsic perturbations: an intriguing possibility for nature to consistently and reliably provide certain network functionalities wherever the need arises.
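To make the terms concrete, the sketch below locates feed-forward loops (X regulates Y, Y regulates Z, and X also regulates Z) in a directed regulatory network and lists the crosstalk edges linking two node-disjoint FFLs. This covers only the structural side; it does not model the dynamics studied in the paper. The function names, the edge-list format, and the gene labels are hypothetical choices for this example.

```python
def find_ffls(edges):
    """Return sorted (X, Y, Z) triples where X->Y, Y->Z and X->Z all exist.

    `edges` is an iterable of (regulator, target) pairs. Hypothetical helper.
    """
    targets = {}
    for src, dst in edges:
        if src != dst:  # ignore autoregulation
            targets.setdefault(src, set()).add(dst)

    ffls = []
    for x, x_targets in targets.items():
        for y in x_targets:
            for z in targets.get(y, ()):
                # Z must be regulated both directly by X and indirectly via Y.
                if z != x and z in x_targets:
                    ffls.append((x, y, z))
    return sorted(ffls)


def crosstalk_edges(ffl_a, ffl_b, edges):
    """List edges running between two node-disjoint FFLs, in either direction."""
    a, b = set(ffl_a), set(ffl_b)
    if a & b:
        return []  # FFLs sharing a gene are not treated as cross-linked here
    return sorted(
        (s, d)
        for s, d in set(edges)
        if (s in a and d in b) or (s in b and d in a)
    )


# Toy network: two FFLs joined by a single cross-regulation Z1 -> X2.
toy_net = [
    ("X1", "Y1"), ("Y1", "Z1"), ("X1", "Z1"),
    ("X2", "Y2"), ("Y2", "Z2"), ("X2", "Z2"),
    ("Z1", "X2"),
]
toy_ffls = find_ffls(toy_net)
toy_cross = crosstalk_edges(toy_ffls[0], toy_ffls[1], toy_net)
```

In the toy network the two FFLs are linked by exactly one crosstalk interaction, the simplest case the abstract refers to.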
An orchestrated intron retention program in meiosis controls timely usage of transcripts during germ cell differentiation
Global transcriptome reprogramming during spermatogenesis ensures timely expression of factors in each phase of male germ cell differentiation. Spermatocytes and spermatids require particularly extensive reprogramming of gene expression to switch from mitosis to meiosis and to support gamete morphogenesis. Here, we uncovered an extensive alternative splicing program during this transmeiotic differentiation. Notably, intron retention was largely the most enriched pattern, with spermatocytes showing generally higher levels of retention compared with spermatids. Retained introns are characterized by weak splice sites and are enriched in genes with strong relevance for gamete function. Meiotic intron-retaining transcripts (IRTs) were exclusively localized in the nucleus. However, differently from other developmentally regulated IRTs, they are stable RNAs, showing longer half-life than properly spliced transcripts. Strikingly, fate-mapping experiments revealed that IRTs are recruited onto polyribosomes days after synthesis. These studies reveal an unexpected function for regulated intron retention in modulation of the timely expression of select transcripts during spermatogenesis.
Functionally Annotating Regulatory Elements in the Equine Genome Using Histone Mark ChIP-Seq.
One of the primary aims of the Functional Annotation of ANimal Genomes (FAANG) initiative is to characterize tissue-specific regulation within animal genomes. To this end, we used chromatin immunoprecipitation followed by sequencing (ChIP-Seq) to map four histone modifications (H3K4me1, H3K4me3, H3K27ac, and H3K27me3) in eight prioritized tissues collected as part of the FAANG equine biobank from two thoroughbred mares. Data were generated according to optimized experimental parameters developed during quality control testing. To ensure that we obtained sufficient ChIP and successful peak-calling, data and peak-calls were assessed using six quality metrics, replicate comparisons, and site-specific evaluations. Tissue specificity was explored by identifying binding motifs within unique active regions, and motifs were further characterized by gene ontology (GO) and protein-protein interaction analyses. The histone marks identified in this study represent some of the first resources for tissue-specific regulation within the equine genome. As such, these publicly available annotation data can be used to advance equine studies investigating health, performance, reproduction, and other traits of economic interest in the horse.
Identification of the expressome by machine learning on omics data.
Accurate annotation of plant genomes remains complex due to the presence of many pseudogenes arising from whole-genome duplication-generated redundancy or the capture and movement of gene fragments by transposable elements. Machine learning on genome-wide epigenetic marks, informed by transcriptomic and proteomic training data, could be used to improve annotations through classification of all putative protein-coding genes as either constitutively silent or able to be expressed. Expressed genes were subclassified as able to express both mRNAs and proteins or only RNAs, and CG gene body methylation was associated only with the former subclass. More than 60,000 protein-coding genes have been annotated in the reference genome of maize inbred B73. About two-thirds of these genes are transcribed and are designated the filtered gene set (FGS). Classification of genes by our trained random forest algorithm was accurate and relied only on histone modifications or DNA methylation patterns within the gene body; promoter methylation was unimportant. Other inbred lines are known to transcribe significantly different sets of genes, indicating that the FGS is specific to B73. We accurately classified the sets of transcribed genes in additional inbred lines, arising from inbred-specific DNA methylation patterns. This approach highlights the potential of using chromatin information to improve annotations of functional genes.