
14 research outputs found

Hidden Markov Models for Gene Sequence Classification: Classifying the VSG genes in the Trypanosoma brucei Genome

Author: Alvarez-Valin Fernando; Basterrech Sebastián; Guerberoff Gustavo; Mesa Andrea
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/08/2015

The article presents an application of Hidden Markov Models (HMMs) for pattern recognition on genome sequences. We apply HMMs to identify genes encoding the Variant Surface Glycoprotein (VSG) in the genomes of Trypanosoma brucei (T. brucei) and other African trypanosomes. These parasitic protozoa are the causative agents of sleeping sickness and several diseases in domestic and wild animals. These parasites have a peculiar strategy to evade the host's immune system that consists in periodically changing their predominant cellular surface protein (VSG). The motivation for using pattern recognition methods to identify these genes, instead of traditional homology-based ones, is that the levels of sequence identity (amino acid and DNA sequence) amongst these genes are often below what is considered reliable for these methods. Among pattern recognition approaches, HMMs are particularly suitable for this problem because they handle the determination of gene edges more naturally. We evaluate the performance of the model using different numbers of states in the Markov model, as well as several performance metrics. The model is applied using public genomic data. Our empirical results show that the VSG genes in T. brucei can be safely identified (high sensitivity and low rate of false positives) using HMMs.
Comment: Accepted article in July, 2015 in Pattern Analysis and Applications, Springer. The article contains 23 pages, 4 figures, 8 tables and 51 references

arXiv.org e-Print Archive

Crossref

DSpace at VSB Technical University of Ostrava

In silico prediction of non-coding RNAs using supervised learning and feature ranking methods

Author: Griesmer Stephen J.
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2010

This thesis presents a novel method, RNAMultifold, for development of a non-coding RNA (ncRNA) classification model based on features derived from folding the consensus sequence of multiple sequence alignments using different folding programs: RNAalifold, CentroidFold, and RSpredict. The method ranks these folding features according to a Class Separation Measure (CSM) that quantifies the ability of the features to differentiate between samples from positive and negative test sets. The set of top-ranked features is then used to construct classification models: Naive Bayes, Fisher Linear Discriminant, and Support Vector Machine (SVM). These models are compared to the performance of the same models with a baseline feature set and with an existing classification tool, RNAz. The Support Vector Machine classification model with a radial basis function kernel, using the top 11 ranked features, is shown to be more sensitive than other models, including another ncRNA prediction program, RNAz, across all specificity values for the RNA families under study. In addition, the target feature set outperforms the baseline feature set of z score and structure conservation index across all classification methods, with the exception of Fisher Linear Discriminant. The RNAMultifold method is then used to search the genome of a Trypanosome species (Trypanosoma brucei) for novel ncRNAs. The results of this search are compared with known ncRNAs and with results from RNAz

Digital Commons @ New Jersey Institute of Technology (NJIT)

Kinetoplastid Phylogenomics and Evolution

Publication venue: 'MDPI AG'
Publication date: 24/02/2022

This Special Issue, Kinetoplastid Phylogenomics and Evolution, unites a series of research and review papers related to kinetoplastid parasites. The diverse topics represented in this collection display a variety of scientific questions and methodological approaches currently used to study these fascinating organisms

Directory of Open Access Books (DOAB)

Roles of R-loops in the Trypanosoma brucei genome and antigenic variation

Author: Briggs Emma Marie
Publication date: 01/01/2018

The genome of the eukaryotic parasite Trypanosoma brucei is both dynamic and unconventional in several aspects. In comparison with other eukaryotic genomes, where the majority of protein-coding genes are associated with their own transcriptional promoters, T. brucei transcribes almost all protein-coding genes polycistronically. Transcription initiates from broad regions that lack defined promoter sequences and RNA Polymerase II then traverses up to hundreds of genes, generating a pre-mRNA that then requires trans-splicing and polyadenylation to generate mature mRNAs. Termination of transcription, via virtually unknown processes, occurs where two multigene transcription units converge or, in some cases, adjacent to a downstream transcription initiation site. RNA Polymerase II transcribes the majority of protein-coding genes in this manner, negating any differential gene expression via transcriptional control. A further unusual aspect of the genome is the dedication of as much as a third of the coding capacity to elements of antigenic variation. When infecting the mammalian host, parasites express a dense protein coat of variant surface glycoprotein (VSG). In order to evade host immune elements, T. brucei switches expression to antigenically distinct VSGs, employing a repertoire of ~2,000 genes. Both transcriptional and recombination-based strategies enable the parasite to either switch transcription between ~15 expression sites, each housing a distinct VSG, or relocate VSG sequence from silent gene arrays into an active VSG expression site. Although multiple factors have been found to regulate these processes, the events which trigger a VSG switch by either pathway are unclear. R-loops are three-stranded structures containing an RNA-DNA hybrid and displaced single-stranded DNA. Although potentially deleterious to genome integrity, R-loops have been linked to transcription initiation and termination, DNA replication and recombination events.
In this study, the potential for R-loop involvement in these fundamental genome functions of T. brucei was investigated. Firstly, Ribonuclease (RNase) H enzymes, which resolve the RNA-DNA hybrid portion of R-loops, were characterised, revealing that T. brucei expresses potentially three distinct catalytic enzymes, two functioning in the nuclear genome and one in the kinetoplast (mitochondrial) genome. Nuclear RNase H activity was depleted by null mutation or RNAi-mediated knockdown of the nuclear RNase H enzymes, showing that while one RNase H, TbRH1, is non-essential, loss of the other, TbRH2, caused several growth and genome integrity defects. Because these defects were hypothesised to result from increased levels of RNA-DNA hybrids in the genome, RNA-DNA hybrids were mapped in wild type parasites and those lacking RNases H using a specific antiserum, S9.6. This mapping identified the conserved formation of R-loops at centromeres, retrotransposon-associated genes, rRNA and tRNA genes. R-loop enrichment was also uncovered at RNA Polymerase II transcription start sites, as documented in mammalian genomes. DNA damage was specifically increased at these sites after TbRH2 depletion, indicating that efficient resolution of these transcription initiation-associated R-loops is critical for genome maintenance. In contrast, R-loops were not associated with DNA replication or transcription termination, suggesting RNA-DNA hybrids are not involved in these processes in T. brucei. The most abundant sites of R-loop enrichment were found at the nucleosome-depleted regions located between the coding regions of polycistronically transcribed genes; these sites are associated with polyadenylation and trans-splicing, highlighting a novel correlation of R-loops with pre-mRNA processing.
Lastly, R-loops were mapped to VSG expression sites where their abundance increased after ablation of RNase H activity, an effect that was associated with both increased DNA damage and VSG switching, uncovering an R-loop-driven mechanism of antigenic variation

Glasgow Theses Service

Bioinformatics

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

This book is divided into different research areas relevant in Bioinformatics such as biological networks, next generation sequencing, high performance computing, molecular modeling, structural bioinformatics and intelligent data analysis. Each book section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research works presented here

Directory of Open Access Books (DOAB)


INTEGRATING CHEMICAL, BIOLOGICAL AND PHYLOGENETIC SPACES OF AFRICAN NATURAL PRODUCTS TO UNDERSTAND THEIR THERAPEUTIC ACTIVITY

Author: Baldo Fatima Magdi Hamza
Publication venue: University of Cambridge
Publication date: 18/02/2019

This research aims to utilise ligand-based target prediction to (i) understand the mechanism of action of African natural products (ANPs), (ii) help identify patterns of phylogenetic use in African traditional medicine and (iii) elucidate the mechanism of action of phenotypically active small molecules and natural products with anti-trypanosomal activity. In Chapter 2 the objective was to utilise ligand-based target prediction to understand the mechanism of action of natural products (NPs) from African medicinal plants used against cancer. The Random Forest classifier used in this work compares the similarity of the input compounds from the natural product dataset with compound-target combinations in the training set. The more similar they are in structure, the more likely they are to modulate the same target. Natural products from plants used against cancer in Africa were predicted to modulate targets and pathways directly associated with the disease, e.g. “flap endonuclease 1” and “Mcl-1”, which helps explain their mechanism of action. The “Keap1-Nrf2 Pathway” and “apoptosis modulation by HSP70”, two pathways previously linked to cancer (which are not currently targeted by marketed drugs, but have been of increasing interest in recent years) were predicted to be modulated by ANPs. In Chapter 3, we aimed to identify phylogenetic patterns in medicinal plant use and the role this plays in predicting medicinal activity. We combined chemical, predicted target and phylogenetic information of the natural products to identify patterns of use for plant families containing plant species used against cancer in African, Malay and Indian (Ayurveda) traditional medicine. Plant families that are close phylogenetically were found to produce similar natural products that act on similar targets regardless of their origin.
Additionally, phylogenetic patterns were identified for African traditional plant families with medicinal species used against cancer, malaria and human African trypanosomiasis (HAT). We identified plant families that have more medicinal species than would statistically be expected by chance and rationalised this by linking their activity to their unique phytochemistry, e.g. the naphthylisoquinoline alkaloids, uniquely produced by Ancistrocladaceae and Dioncophyllaceae, are responsible for anti-malarial and anti-trypanosomal activity. In Chapter 4, information from target prediction and experimentally validated targets was combined with orthologue data to predict targets of phenotypically active small molecules and natural products screened against Trypanosoma brucei. The predicted targets were prioritised based on their essentiality for the survival of the T. brucei parasite. We predicted orthologues of targets that are essential for the survival of the trypanosome, e.g. glycogen synthase kinase 3 (GSK3) and rhodesain. We also identified the biological processes predicted to be perturbed by the compounds, e.g. “glycolysis”, “cell cycle”, “regulation of symbiosis, encompassing mutualism through parasitism” and “modulation of development of symbiont involved in interaction with host”. In conclusion, in silico target prediction can be used to predict protein targets of natural products to understand their molecular mechanism of action. Phylogenetic information and phytochemical information of medicinal plants can be integrated to identify plant families with more medicinal species than would be expected by chance

Apollo (Cambridge)

The impact of the International Livestock Research Institute

Author: Grace Delia; McIntire John M.
Publication venue: ILRI
Publication date: 10/08/2020

Providing the first evidence-based global estimates of the many scientific, economic, policy, and capacity development impacts of livestock research in and for developing countries, this volume is an indispensable guide and reference for veterinarians, animal and forage scientists, and anyone working for the equitable and sustainable development of the world's poorer agricultural economies. Livestock is one of the fastest growing agricultural sectors, with most growth occurring in developing countries. For more than four and a half decades one global centre has been mandated to conduct research on leveraging the benefits and mitigating the costs of livestock production in poor countries. This book focuses on the achievements, failures and impacts of the International Livestock Research Institute (ILRI) and its predecessors, the International Livestock Centre for Africa (ILCA) and the International Laboratory for Research on Animal Diseases (ILRAD). The scientific and economic impacts of tropical livestock research detailed in this work reveal valuable lessons for reducing world hunger, poverty and environmental degradation. Describing the impacts of smallholder livestock systems on the global environment, the book also covers animal genetics, production, health and disease control, and livestock-related land management, public policy and economics, all with useful pointers for future livestock-for-development research

CGSpace

Rapid analysis of pharmacology for infectious diseases.

Author: Carruthers Ian Michael
Publication date: 01/01/2013

University of Dundee Online Publications

A Comparison of Mitochondrial Heat Shock Protein 70 and Hsp70 Escort Protein 1 Orthologues from Trypanosoma brucei and Homo sapiens

Author: Hand Francis Bryan
Publication venue: Faculty of Science, Biotechnology Innovation Centre
Publication date: 29/03/2023

The causative agent of African trypanosomiasis, Trypanosoma brucei (T. brucei), has an expanded retinue of specialized heat shock proteins, which have been identified as crucial to the progression of the disease. These play a central role in disease progression and transmission through their involvement in cell-cycle pathways which bring about cell-cycle arrest and differentiation. Hsp70 proteins are essential for the maintenance of proteostasis in the cell. Mitochondrial Hsp70 (mtHsp70) is a highly conserved molecular chaperone required for both the translocation of nuclear encoded proteins across the two mitochondrial membranes and the subsequent folding of proteins in the matrix. The T. brucei genome encodes three copies of mtHsp70 which are 100% identical. MtHsp70 self-aggregates, a property unique to this isoform, and an Hsp70 escort protein (Hep1) is required to maintain the molecular chaperone in a soluble, functional state. This study aimed to compare the solubilizing interaction of Hep1 from T. brucei and Homo sapiens (H. sapiens). The recently introduced AlphaFold program was used to analyze the structures of mtHsp70 and Hep1 proteins and allowed observations of structures unavailable to other modelling techniques. The GVFEV motif found in the ATPase domain of mtHsp70s interacted with the linker region, resulting in aggregation. The AlphaFold models produced indicated that the replacement of the lysine (K) residue within the KTFEV motif of DnaK (prokaryotic Hsp70) with glycine (G) may abrogate bond formation between the motif and a region between lobe I and II of the ATPase domain. This may facilitate the aggregation reaction of mtHsp70 orthologues and provides a residue of interest for future studies. Both TbHep1 and HsHep1 reduced the thermal aggregation of TbmtHsp70 and mortalin (H. sapiens mtHsp70) respectively; however, TbHep1 was ~15% less effective than HsHep1 at higher concentrations (4 µM).
TbHep1 itself appeared to be aggregation-prone under conditions of thermal stress; AlphaFold models suggest this may be due to an N-terminal α-helical structure not present in HsHep1. These results indicate that TbHep1 is functionally similar to HsHep1; however, the orthologue may operate in a unique manner which requires further investigation.
Thesis (MSc) -- Faculty of Science, Biotechnology Innovation Centre, 202

Rhodes Repository (SEALS)

Genetic diversity in Trypanosoma cruzi: marker development and applications; natural population structures, and genetic exchange mechanisms

Author: Messenger LA

Chagas disease remains the most important parasitic infection in Latin America. The aetiological agent, Trypanosoma cruzi (Kinetoplastida: Trypanosomatidae), is a complex vector-borne zoonosis transmitted in the faeces of hematophagous triatomine bugs (Hemiptera: Reduviidae: Triatominae), and maintained by mammalian reservoir hosts ranging from the southern United States to Argentinean Patagonia. In the absence of chemotherapy, infection is life-long and can lead to a spectrum of pathological sequelae ranging from subclinical to lethal cardiac and/or gastrointestinal complications in up to 30% of patients. T. cruzi displays remarkable genetic diversity, which has long been suspected to contribute to the considerable variation in clinical symptoms observed between endemic regions. Currently, isolates of T. cruzi can be assigned to a minimum of six stable genetic lineages or discrete typing units (DTUs) (TcI-TcVI), which are broadly associated with disparate ecologies, transmission cycles and geographical distributions. The principal mode of reproduction among T. cruzi strains is the subject of an intense, decades-old debate. Despite the existence of two recent natural hybrid lineages (TcV and TcVI), which resemble meiotic F1 progeny, a pervasive view is that recombination has been restrained at an evolutionary scale and is of little epidemiological relevance to contemporary parasite populations.
The aim of this PhD project was to investigate T. cruzi genetic diversity through significant development of phylogenetic markers and their application to the characterization of natural parasite population structures and genetic exchange mechanisms. Multiple, single-copy, chromosomally-independent, nuclear housekeeping genes were assessed initially for their ability to allocate isolates to DTU-level, to facilitate higher resolution intra-lineage analyses and finally for their inclusion alongside additional targets in a standardized T. cruzi multilocus sequence typing (nMLST) scheme. For the immediate future, nuclear MLST, using a panel of four to seven nuclear loci, is a robust, reproducible and highly discriminatory method that has potential to become the new gold standard for T. cruzi DTU assignment. To investigate natural parasite population structures and uncover evidence of genetic exchange, a high resolution mitochondrial MLST (mtMLST) scheme, based on ten gene fragments, was developed and evaluated against current nuclear markers (multilocus microsatellite typing; MLMT) using isolates belonging to the oldest and most widely distributed lineage (TcI). Observations of gross nuclear-mitochondrial phylogenetic incongruence indicate that recombination is ongoing, geographically widespread and continues to influence natural populations, challenging the traditional paradigm of clonality in T. cruzi. Application of this combined nuclear-mitochondrial methodology to intensively sampled, minimally-subdivided TcI populations revealed extensive mitochondrial introgression within a disease focus in North-East Colombia as well as among arboreal transmission cycles in Bolivia. Failure to detect any reciprocal nuclear hybridization among recombinant strains may be indicative of alternate, cryptic mating strategies in T. cruzi, which are challenging to reconcile with both in vitro parasexual mechanisms of genetic exchange described, and patterns of Mendelian allele inheritance among natural hybrid DTUs. High resolution genotyping of TcI populations was also undertaken to explore the interaction between parasite genetic heterogeneity and ecological biodiversity, exposing the significant impact human activity has had on T. cruzi evolution. Reduced genetic diversity, accelerated parasite dissemination between densely populated areas and mitochondrial gene flow between domestic and sylvatic populations suggest humans may have played a crucial role in T. cruzi dispersal across the Bolivian highlands.
Parallel reductions in genetic diversity were observed among isolates from the Brazilian Atlantic Forest, attributable to ongoing anthropogenic habitat fragmentation. By comparison, domestic TcI isolates (TcIDOM) are divergent from their sylvatic counterparts, but also genetically homogeneous, and likely to have originated in North/Central America before distribution southwards. Molecular dating of Colombian TcIDOM clones confirmed that this clade emerged 23,000 ± 12,000 years ago, coinciding with the earliest human migration into South America. Lastly, Illumina amplicon deep sequencing markers were developed to explore the interaction between parasite multiclonality and clinical status of chronic Chagas disease. An unprecedented level of intra-host genetic diversity was detected, highlighting putative diversifying selection affecting antigenic surface proteases, which may facilitate survival in the mammalian host. In lieu of comparative genomics of representative T. cruzi field isolates, not yet a reality, as is the case with other more experimentally-tractable trypanosomatids, presented herein are some of the highest resolution genotyping techniques developed in T. cruzi to date, which have the potential to expand our current understanding of parasite genetic diversity and its relevance to clinical outcome of Chagas disease

LSHTM Research Online