Transcriptome-wide analysis reveals different categories of response to a standardised immune challenge in a wild rodent
Individuals vary in their immune response and, as a result, some are more susceptible to infectious disease than others. Little is known about the nature of this individual variation in natural populations, or which components of immune pathways are most responsible, but defining this underlying landscape of variation is an essential first step to understanding the drivers of this variation and, ultimately, predicting the outcome of infection. We describe transcriptome-wide variation in response to a standardised immune challenge in wild field voles. We find that markers can be categorised into a limited number of types. For the majority of markers, the response of an individual is dependent on its baseline expression level, with significant enrichment in this category for conventional immune pathways. Another, moderately sized, category contains markers for which the responses of different individuals are also variable but independent of their baseline expression levels. This category lacks any enrichment for conventional immune pathways. We further identify markers which display particularly high individual variability in response, and could be used as markers of immune response in larger studies. Our work shows how a standardised challenge performed on a natural population can reveal the patterns of natural variation in immune response
Semi-automated framework for the analytical use of gene-centric data with biological ontologies
Motivation Translational bioinformatics (TBI) has been defined as "the development
and application of informatics methods that connect molecular entities to clinical entities"
[1], which has emerged as a systems theory approach to bridge the huge wealth of
biomedical data into clinical actions using a combination of innovations and resources
across the entire spectrum of biomedical informatics approaches [2]. The challenge
for TBI is the availability of both comprehensive knowledge based on genes and the
corresponding tools that allow their analysis and exploitation.
Traditionally, biological researchers have studied one or only a few genes at a
time, but in recent years high throughput technologies such as gene expression microarrays,
protein mass-spectrometry and next-generation DNA and RNA sequencing
have emerged that allow the simultaneous measurement of changes on a genome-wide
scale. These technologies usually result in large lists of interesting genes, but meaningful
biological interpretation remains a major challenge. Over the last decade, enrichment
analysis has become standard practice in the analysis of such gene lists, enabling
systematic assessment of the likelihood of differential representation of defined groups
of genes compared to suitably annotated background knowledge. The success of such
analyses is highly dependent on the availability and quality of the gene annotation
data.
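
As an illustration of the enrichment analysis described above, the following is a minimal sketch of one widely used form, the over-representation test based on the hypergeometric distribution. It is not taken from the thesis; the function name and its parameters are hypothetical, and the thesis tools may use different or additional statistics.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the annotated background
    annotated:  background genes carrying the ontology term
    selected:   size of the gene list of interest
    overlap:    genes in the list that carry the term
    (All names are illustrative assumptions, not part of any tool's API.)
    """
    total = comb(population, selected)
    upper = min(annotated, selected)
    # Sum the probability of seeing `overlap` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 40 of 200 listed genes carry a term that
    # annotates 500 of 20000 background genes.
    print(enrichment_p_value(20000, 500, 200, 40))
```

A small p-value suggests the term is over-represented in the gene list relative to the background, which is why the result depends so strongly on annotation coverage and quality.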
For many years, genes were annotated by different experts using inconsistent, non-standard
terminologies. Large amounts of variation and duplication in these unstructured
annotation sets made them unsuitable for principled quantitative analysis. More
recently, a lot of effort has been put into the development and use of structured, domain
specific vocabularies to annotate genes. The Gene Ontology is one of the most successful
examples of this, where genes are annotated with terms from three main clades:
biological process, molecular function and cellular component. However, many other
established and emerging ontologies that could aid biological data interpretation
are rarely used. For the same reason, many bioinformatic tools only support
analysis using the Gene Ontology.
The lack of annotation coverage for these ontologies, and of support for them in existing
analytical tools to aid biological interpretation of data, has become a major limitation to their utility
and uptake. Thus, automatic approaches are needed to facilitate the transformation
of unstructured data to unlock the potential of all ontologies, with corresponding bioinformatics
tools to support their interpretation.
Approaches In this thesis, firstly, similar to the approach in [3,4], I propose a series
of computational approaches implemented in a new tool OntoSuite-Miner to address
the ontology based gene association data integration challenge. This approach uses
NLP based text mining methods for ontology based biomedical text mining. What
differentiates my approach from other approaches is that I integrate two of the most
widely used NLP modules into the framework, not only increasing the confidence of
the text mining results, but also providing an annotation score for each mapping, based
on the number of pieces of evidence in the literature and the number of NLP modules
that agreed with the mapping. Since heterogeneous data is important in understanding
human disease, the approach was designed to be generic, thus the ontology
based annotation generation can be applied to different sources and can be repeated
with different ontologies. Secondly, in respect of the second challenge proposed by
TBI, to increase the statistical power of the annotation enrichment analysis, I propose
OntoSuite-Analytics, which integrates a collection of enrichment analysis methods into
a unified open-source software package named topOnto, in the statistical programming
language R. The package supports enrichment analysis across multiple ontologies with
a set of implemented statistical/topological algorithms, allowing the comparison of enrichment
results across multiple ontologies and between different algorithms.
Results The methodologies described above were implemented and a Human Disease
Ontology (HDO) based gene annotation database was generated by mining three
publicly available databases, OMIM, GeneRIF and Ensembl variation. With the availability
of the HDO annotation and the corresponding ontology enrichment analysis
tools in topOnto, I profiled 277 gene classes with human diseases and generated "disease
environments" for 1310 human diseases. The exploration of the disease profiles
and disease environment provides an overview of known disease knowledge and provides
new insights into disease mechanisms. The integration of multiple ontologies
into a disease context demonstrates how "orthogonal" ontologies can lead to biological
insight that would have been missed by more traditional single ontology analysis
Combined population dynamics and entropy modelling supports patient stratification in chronic myeloid leukemia
Modelling the parameters of multistep carcinogenesis is key for a better understanding of cancer
progression, biomarker identification and the design of individualized therapies. Using chronic
myeloid leukemia (CML) as a paradigm for hierarchical disease evolution we show that combined
population dynamic modelling and CML patient biopsy genomic analysis enables patient stratification
at unprecedented resolution. Linking CD34+ similarity as a disease progression marker to patient-derived
gene expression entropy separated established CML progression stages and uncovered
additional heterogeneity within disease stages. Importantly, our patient data informed model enables
quantitative approximation of individual patients' disease history within chronic phase (CP) and
significantly separates "early" from "late" CP. Our findings provide a novel rationale for personalized
and genome-informed disease progression risk assessment that is independent and complementary to
conventional measures of CML disease burden and prognosis
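
The abstract does not state how gene expression entropy is computed. As a rough sketch under that caveat, one common choice is the Shannon entropy of a sample's expression profile after normalising it to sum to one; the function name and the example values below are hypothetical.

```python
from math import log2


def expression_entropy(expression):
    """Shannon entropy (in bits) of a non-negative expression profile.

    Assumption: the profile is normalised to a probability distribution
    over genes. The source may define entropy differently.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression profile must have a positive sum")
    # Zero-expression genes contribute nothing, so they are skipped.
    probabilities = [value / total for value in expression if value > 0]
    return -sum(p * log2(p) for p in probabilities)


if __name__ == "__main__":
    # Hypothetical profiles: a flat profile has higher entropy than a
    # profile dominated by one gene.
    print(expression_entropy([10, 10, 10, 10]))
    print(expression_entropy([37, 1, 1, 1]))
```

Under this definition, a sample whose expression is spread evenly across genes scores higher than one dominated by a few genes, which is the kind of per-patient summary that could be related to a progression marker.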
A short survey of discourse representation models
With the advancement of technology and the wide adoption of ontologies as knowledge representation formats, in the last decade, a handful of models were proposed for the externalization of the rhetoric and argumentation captured within scientific publications. Conceptually, most of these models share a similar representation form of the scientific publication, i.e. as a series of interconnected elementary knowledge items. The main differences are given by the terminology used, the types of rhetorical and/or argumentation relations connecting the knowledge items and the foundational theories supporting these relations. This paper analyzes the state of the art and provides a concise comparative overview of the five most prominent discourse representation models, with the goal of sketching a unified model for discourse representation
Cerebellum Transcriptome of Mice Bred for High Voluntary Activity Offers Insights into Locomotor Control and Reward-Dependent Behaviors.
The role of the cerebellum in motivation and addictive behaviors is less understood than that in control and coordination of movements. High running can be a self-rewarding behavior exhibiting addictive properties. Changes in the cerebellum transcriptional networks of mice from a line selectively bred for High voluntary running (H) were profiled relative to an unselected Control (C) line. The environmental modulation of these changes was assessed both in activity environments corresponding to 7 days of Free (F) access to running wheel and to Blocked (B) access on day 7. Overall, 457 genes exhibited a significant (FDR-adjusted P-value < 0.05) genotype-by-environment interaction effect, indicating that activity genotype differences in gene expression depend on environmental access to running. Among these genes, network analysis highlighted 6 genes (Nrgn, Drd2, Rxrg, Gda, Adora2a, and Rab40b) connected by their products that displayed opposite expression patterns in the activity genotype contrast within the B and F environments. The comparison of network expression topologies suggests that selection for high voluntary running is linked to a predominant dysregulation of hub genes in the F environment that enables running whereas a dysregulation of ancillary genes is favored in the B environment that blocks running. Genes associated with locomotor regulation, signaling pathways, reward-processing, goal-focused, and reward-dependent behaviors exhibited significant genotype-by-environment interaction (e.g. Pak6, Adora2a, Drd2, and Arhgap8). Neuropeptide genes including Adcyap1, Cck, Sst, Vgf, Npy, Nts, Penk, and Tac2 and related receptor genes also exhibited significant genotype-by-environment interaction. The majority of the 183 differentially expressed genes between activity genotypes (e.g. Drd1) were under-expressed in C relative to H genotypes and were also under-expressed in B relative to F environments. 
Our findings indicate that the high voluntary running mouse line studied is a helpful model for understanding the molecular mechanisms in the cerebellum that influence locomotor control and reward-dependent behaviors
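
The abstract reports FDR-adjusted P-values without naming the procedure. A minimal sketch of the Benjamini-Hochberg adjustment, which is a common choice and is assumed here rather than confirmed by the source, looks like this; the function name and input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this is the FDR procedure; the abstract does not say
    which one was used.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.5]  # hypothetical raw p-values
    adjusted = benjamini_hochberg(raw)
    # Tests with an adjusted value below 0.05 would be called significant.
    print([round(q, 4) for q in adjusted])
```

With a threshold of 0.05 on the adjusted values, the expected proportion of false discoveries among the genes called significant is controlled at 5% under the procedure's assumptions.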