13,918 research outputs found

    WormBase 2017: Molting into a new stage

    Transcriptome-wide analysis reveals different categories of response to a standardised immune challenge in a wild rodent

    Individuals vary in their immune response and, as a result, some are more susceptible to infectious disease than others. Little is known about the nature of this individual variation in natural populations, or which components of immune pathways are most responsible, but defining this underlying landscape of variation is an essential first step to understanding the drivers of this variation and, ultimately, predicting the outcome of infection. We describe transcriptome-wide variation in response to a standardised immune challenge in wild field voles. We find that markers can be categorised into a limited number of types. For the majority of markers, the response of an individual is dependent on its baseline expression level, with significant enrichment in this category for conventional immune pathways. Another, moderately sized, category contains markers for which the responses of different individuals are also variable but independent of their baseline expression levels. This category lacks any enrichment for conventional immune pathways. We further identify markers which display particularly high individual variability in response, and could be used as markers of immune response in larger studies. Our work shows how a standardised challenge performed on a natural population can reveal the patterns of natural variation in immune response.

    Semi-automated framework for the analytical use of gene-centric data with biological ontologies

    Motivation: Translational bioinformatics (TBI) has been defined as ‘the development and application of informatics methods that connect molecular entities to clinical entities’ [1]. It has emerged as a systems theory approach to translating the huge wealth of biomedical data into clinical actions, using a combination of innovations and resources across the entire spectrum of biomedical informatics approaches [2]. The challenge for TBI is the availability of both comprehensive gene-based knowledge and the corresponding tools that allow its analysis and exploitation. Traditionally, biological researchers studied one or only a few genes at a time, but in recent years high-throughput technologies such as gene expression microarrays, protein mass spectrometry and next-generation DNA and RNA sequencing have emerged that allow the simultaneous measurement of changes on a genome-wide scale. These technologies usually result in large lists of interesting genes, but meaningful biological interpretation remains a major challenge. Over the last decade, enrichment analysis has become standard practice in the analysis of such gene lists, enabling systematic assessment of the likelihood of differential representation of defined groups of genes compared to suitably annotated background knowledge. The success of such analyses is highly dependent on the availability and quality of the gene annotation data. For many years, genes were annotated by different experts using inconsistent, non-standard terminologies. Large amounts of variation and duplication in these unstructured annotation sets made them unsuitable for principled quantitative analysis. More recently, a lot of effort has been put into the development and use of structured, domain-specific vocabularies to annotate genes. The Gene Ontology is one of the most successful examples of this, where genes are annotated with terms from three main clades: biological process, molecular function and cellular component. However, many other established and emerging ontologies that could aid biological data interpretation are rarely used. For the same reason, many bioinformatic tools only support analysis using the Gene Ontology. The lack of annotation coverage for these ontologies, and of support for them in existing analytical tools, has become a major limitation to their utility and uptake. Thus, automatic approaches are needed to facilitate the transformation of unstructured data to unlock the potential of all ontologies, with corresponding bioinformatics tools to support their interpretation.
    Approaches: In this thesis, firstly, similar to the approach in [3,4], I propose a series of computational approaches implemented in a new tool, OntoSuite-Miner, to address the ontology-based gene association data integration challenge. This approach uses NLP-based methods for ontology-based biomedical text mining. What differentiates my approach from other approaches is that I integrate two of the most widely used NLP modules into the framework, not only increasing the confidence of the text mining results, but also providing an annotation score for each mapping, based on the number of pieces of evidence in the literature and the number of NLP modules that agreed with the mapping. Since heterogeneous data is important in understanding human disease, the approach was designed to be generic, so that the ontology-based annotation generation can be applied to different sources and can be repeated with different ontologies. Secondly, with respect to the second challenge posed by TBI, to increase the statistical power of the annotation enrichment analysis, I propose OntoSuite-Analytics, which integrates a collection of enrichment analysis methods into a unified open-source software package named topOnto, in the statistical programming language R. The package supports enrichment analysis across multiple ontologies with a set of implemented statistical/topological algorithms, allowing the comparison of enrichment results across multiple ontologies and between different algorithms.
    Results: The methodologies described above were implemented and a Human Disease Ontology (HDO) based gene annotation database was generated by mining three publicly available databases: OMIM, GeneRIF and Ensembl variation. With the availability of the HDO annotation and the corresponding ontology enrichment analysis tools in topOnto, I profiled 277 gene classes with human diseases and generated ‘disease environments’ for 1310 human diseases. The exploration of the disease profiles and disease environment provides an overview of known disease knowledge and provides new insights into disease mechanisms. The integration of multiple ontologies into a disease context demonstrates how ‘orthogonal’ ontologies can lead to biological insight that would have been missed by more traditional single-ontology analysis.
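
As an illustration of the enrichment analysis this abstract refers to, the sketch below computes a one-sided hypergeometric (over-representation) p-value for a single ontology term. It is a minimal example of one standard enrichment statistic, not the topOnto implementation; the function name `hypergeom_enrichment_p` and its parameters are assumptions introduced here for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided over-representation p-value, P(X >= hits).

    population: number of genes in the annotated background
    annotated:  background genes annotated with the term
    sample:     size of the gene list of interest
    hits:       genes in the list annotated with the term
    (All names are hypothetical, chosen for this sketch.)
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    lower = max(hits, 0)
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probability mass over the upper tail.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(lower, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers only: 40 of 1000 background genes carry the term,
    # and 8 of a 50-gene list carry it.
    print(hypergeom_enrichment_p(1000, 40, 50, 8))
```

A small p-value suggests the term is over-represented in the gene list relative to the background; in practice the test is repeated for every term and the resulting p-values are corrected for multiple testing.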

    Combined population dynamics and entropy modelling supports patient stratification in chronic myeloid leukemia

    Modelling the parameters of multistep carcinogenesis is key for a better understanding of cancer progression, biomarker identification and the design of individualized therapies. Using chronic myeloid leukemia (CML) as a paradigm for hierarchical disease evolution, we show that combined population dynamic modelling and CML patient biopsy genomic analysis enables patient stratification at unprecedented resolution. Linking CD34+ similarity as a disease progression marker to patient-derived gene expression entropy separated established CML progression stages and uncovered additional heterogeneity within disease stages. Importantly, our patient-data-informed model enables quantitative approximation of individual patients’ disease history within chronic phase (CP) and significantly separates “early” from “late” CP. Our findings provide a novel rationale for personalized and genome-informed disease progression risk assessment that is independent of and complementary to conventional measures of CML disease burden and prognosis.
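
The abstract does not define how gene expression entropy is computed. As a rough sketch under the assumption that it resembles Shannon entropy of a sample's normalised expression profile, the example below shows that calculation; the function name `expression_entropy` is hypothetical and the paper's actual measure may differ.

```python
from math import log2
from typing import Sequence


def expression_entropy(expression: Sequence[float]) -> float:
    """Shannon entropy (in bits) of a non-negative expression vector.

    Assumption for illustration: the vector is normalised to sum to 1 and
    treated as a probability distribution over genes.
    """
    if any(x < 0 for x in expression):
        raise ValueError("expression values must be non-negative")
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression vector must have a positive sum")
    # Genes with zero expression contribute nothing to the sum.
    probs = [x / total for x in expression if x > 0]
    return -sum(p * log2(p) for p in probs)


if __name__ == "__main__":
    print(expression_entropy([1, 1, 1, 1]))   # evenly spread expression
    print(expression_entropy([97, 1, 1, 1]))  # expression dominated by one gene
```

Under this definition, a profile spread evenly across genes has high entropy, while a profile dominated by a few genes has low entropy.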

    Cerebellum Transcriptome of Mice Bred for High Voluntary Activity Offers Insights into Locomotor Control and Reward-Dependent Behaviors.

    The role of the cerebellum in motivation and addictive behaviors is less understood than that in control and coordination of movements. High running can be a self-rewarding behavior exhibiting addictive properties. Changes in the cerebellum transcriptional networks of mice from a line selectively bred for High voluntary running (H) were profiled relative to an unselected Control (C) line. The environmental modulation of these changes was assessed both in activity environments corresponding to 7 days of Free (F) access to running wheel and to Blocked (B) access on day 7. Overall, 457 genes exhibited a significant (FDR-adjusted P-value < 0.05) genotype-by-environment interaction effect, indicating that activity genotype differences in gene expression depend on environmental access to running. Among these genes, network analysis highlighted 6 genes (Nrgn, Drd2, Rxrg, Gda, Adora2a, and Rab40b) connected by their products that displayed opposite expression patterns in the activity genotype contrast within the B and F environments. The comparison of network expression topologies suggests that selection for high voluntary running is linked to a predominant dysregulation of hub genes in the F environment that enables running whereas a dysregulation of ancillary genes is favored in the B environment that blocks running. Genes associated with locomotor regulation, signaling pathways, reward-processing, goal-focused, and reward-dependent behaviors exhibited significant genotype-by-environment interaction (e.g. Pak6, Adora2a, Drd2, and Arhgap8). Neuropeptide genes including Adcyap1, Cck, Sst, Vgf, Npy, Nts, Penk, and Tac2 and related receptor genes also exhibited significant genotype-by-environment interaction. The majority of the 183 differentially expressed genes between activity genotypes (e.g. Drd1) were under-expressed in C relative to H genotypes and were also under-expressed in B relative to F environments. 
Our findings indicate that the high voluntary running mouse line studied is a helpful model for understanding the molecular mechanisms in the cerebellum that influence locomotor control and reward-dependent behaviors.
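
The abstract reports FDR-adjusted P-values without naming the adjustment procedure. As a sketch only, assuming the widely used Benjamini-Hochberg step-up method, the example below shows how raw p-values can be converted to FDR-adjusted values; the function name `benjamini_hochberg` is introduced here for illustration.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # toy values, not data from the study
    adj = benjamini_hochberg(raw)
    significant = [a < 0.05 for a in adj]
    print(adj, significant)
```

Genes whose adjusted value falls below the chosen threshold (0.05 in the abstract) would be called significant under this procedure.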