18 research outputs found

    Improved strategy for the curation and classification of kinases, with broad applicability to other eukaryotic protein groups

    Despite the substantial amount of genomic and transcriptomic data available for a wide range of eukaryotic organisms, most genomes are still in a draft state and can have inaccurate gene predictions. To gain a sound understanding of the biology of an organism, it is crucial that inferred protein sequences are accurately identified and annotated. However, this can be challenging to achieve, particularly for organisms such as parasitic worms (helminths), as most gene prediction approaches do not account for substantial phylogenetic divergence from model organisms, such as Caenorhabditis elegans and Drosophila melanogaster, whose genomes are well-curated. In this paper, we describe a bioinformatic strategy for the curation of gene families and subsequent annotation of encoded proteins. This strategy relies on pairwise gene curation between at least two closely related species using genomic and transcriptomic data sets, and is built on recent work on kinase complements of parasitic worms. Here, we discuss salient technical aspects of this strategy and its implications for the curation of protein families more generally

    First record of a tandem-repeat region within the mitochondrial genome of Clonorchis sinensis using a long-read sequencing approach

    BACKGROUND: Mitochondrial genomes provide useful genetic markers for systematic and population genetic studies of parasitic helminths. Although many such genome sequences have been published and deposited in public databases, there is evidence that some of them are incomplete, owing to an inability of conventional techniques to reliably sequence non-coding (repetitive) regions. In the present study, we characterise the complete mitochondrial genome, including the long, non-coding region, of the carcinogenic Chinese liver fluke, Clonorchis sinensis, using long-read sequencing. METHODS: The mitochondrial genome was sequenced from total high molecular-weight genomic DNA isolated from a pool of 100 adult worms of C. sinensis using the MinION sequencing platform (Oxford Nanopore Technologies), and assembled and annotated using an informatic approach. RESULTS: From > 93,500 long-reads, we assembled an 18,304 bp mitochondrial genome for C. sinensis. Within this genome we identified a novel non-coding region of 4,549 bp containing six tandem-repetitive units of 719-809 bp each. Given that genomic DNA from pooled worms was used for sequencing, some variability in length/sequence in this tandem-repetitive region was detectable, reflecting population variation. CONCLUSIONS: For C. sinensis, we report the complete mitochondrial genome, which includes a long (> 4.5 kb) tandem-repetitive region. The discovery of this non-coding region using a nanopore-sequencing/informatic approach now paves the way to investigating the nature and extent of length/sequence variation in this region within and among individual worms, both within and among C. sinensis populations, and to exploring whether this region has a functional role in the regulation of replication and transcription, akin to the mitochondrial control region in mammals. Although applied to C. sinensis, the technological approach established here should be broadly applicable to characterise complex tandem-repetitive or homo-polymeric regions in the mitochondrial genomes of a wide range of taxa

    Expanded complement of Niemann-Pick type C2-like protein genes in Clonorchis sinensis suggests functions beyond sterol binding and transport

    BACKGROUND: The parasitic flatworm Clonorchis sinensis inhabits the biliary tree of humans and other piscivorous mammals. This parasite can survive and thrive in the bile duct, despite exposure to bile constituents and host immune attack. Although the precise biological mechanisms underlying this adaptation are unknown, previous work indicated that Niemann-Pick type C2 (NPC2)-like sterol-binding proteins might be integral in the host-parasite interplay. Expansions of this family in some invertebrates, such as arthropods, have shown functional diversification, including novel forms of chemoreception. Thus, here we curated the NPC2-like protein gene complement in C. sinensis, and predicted their conserved and/or divergent functional roles. METHODS: We used an established comparative genomic-bioinformatic approach to curate NPC2-like proteins encoded in published genomes of Korean and Chinese isolates of C. sinensis. Protein sequence and structural homology, presence of conserved domains and phylogeny were used to group and functionally classify NPC2-like proteins. Furthermore, transcription levels of NPC2-like protein-encoding genes were explored in different developmental stages and tissues. RESULTS: Totals of 35 and 32 C. sinensis NPC2-like proteins were predicted to be encoded in the genomes of the Korean and Chinese isolates, respectively. Overall, these proteins had low sequence homology and high variability of sequence alignment coverage when compared with curated NPC2s. Most C. sinensis proteins were predicted to retain a conserved ML domain and a conserved fold conformation, with a large cavity within the protein. Only one protein sequence retained the conserved amino acid residues required in bovine NPC2 to bind cholesterol. Non-canonical C. sinensis NPC2-like protein-coding domains clustered into four distinct phylogenetic groups with members of a group frequently encoded on the same genome scaffolds. Interestingly, NPC2-like protein-encoding genes were predicted to be variably transcribed in different developmental stages and adult tissues, with most being transcribed in the metacercarial stage. CONCLUSIONS: The results of the present investigation confirm an expansion of NPC2-like proteins in C. sinensis, suggesting a diverse array of functions beyond sterol binding and transport. Functional explorations of this protein family should elucidate the mechanisms enabling the establishment and survival of C. sinensis and related flukes in the biliary systems of mammalian hosts

    CAP protein superfamily members in Toxocara canis

    BACKGROUND: Proteins of the cysteine-rich secretory proteins, antigen 5 and pathogenesis-related 1 (CAP) superfamily are recognized or proposed to play roles in parasite development and reproduction, and in modulating host immune attack and infection processes. However, little is known about these proteins for most parasites. RESULTS: In the present study, we explored CAP proteins of Toxocara canis, a socioeconomically important zoonotic roundworm. To do this, we mined and curated transcriptomic and genomic data, predicted and curated full-length protein sequences (n = 28), conducted analyses of these data and studied the transcription of respective genes in different developmental stages of T. canis. In addition, based on information available for Caenorhabditis elegans, we inferred that selected genes (including lon-1, vap-1, vap-2, scl-1, scl-8 and scl-11 orthologs) of T. canis and their interaction partners likely play central roles in this parasite's development and/or reproduction via TGF-beta and/or insulin-like signaling pathways, or via host interactions. CONCLUSION: In conclusion, this study could provide a foundation to guide future studies of CAP proteins of T. canis and related parasites, and might assist in finding new interventions against diseases caused by these parasites

    Nanopore Sequencing Resolves Elusive Long Tandem-Repeat Regions in Mitochondrial Genomes

    Long non-coding, tandem-repetitive regions in mitochondrial (mt) genomes of many metazoans have been notoriously difficult to characterise accurately using conventional sequencing methods. Here, we show how the use of a third-generation (long-read) sequencing and informatic approach can overcome this problem. We employed Oxford Nanopore technology to sequence genomic DNAs from a pool of adult worms of the carcinogenic parasite, Schistosoma haematobium, and used an informatic workflow to define the complete mt non-coding region(s). Using long-read data of high coverage, we defined six dominant mt genomes of 33.4 kb to 22.6 kb. Although no variation was detected in the order or lengths of the protein-coding genes, there was marked length (18.5 kb to 7.6 kb) and structural variation in the non-coding region, raising questions about the evolution and function of what might be a control region that regulates mt transcription and/or replication. The discovery here of the largest tandem-repetitive, non-coding region (18.5 kb) in a metazoan organism also raises a question about the completeness of some of the mt genomes of animals reported to date, and stimulates further explorations using a Nanopore-informatic workflow

    Major SCP/TAPS protein expansion in Lucilia cuprina is associated with novel tandem array organisation and domain architecture

    BACKGROUND: Larvae of the Australian sheep blowfly, Lucilia cuprina, parasitise sheep by feeding on skin excretions, dermal tissue and blood, causing severe damage known as flystrike or myiasis. Recent advances in -omic technologies and bioinformatic data analyses have led to a greater understanding of blowfly biology and should allow the identification of protein families involved in host-parasite interactions and disease. Current literature suggests that proteins of the SCP (Sperm-Coating Protein)/TAPS (Tpx-1/Ag5/PR-1/Sc7) (SCP/TAPS) superfamily play key roles in immune modulation, cross-talk between parasite and host as well as developmental and reproductive processes in parasites. METHODS: Here, we employed a bioinformatics workflow to curate the SCP/TAPS protein gene family in L. cuprina. Protein sequence, the presence and number of conserved CAP-domains and phylogeny were used to group identified SCP/TAPS proteins; these were compared to those found in Drosophila melanogaster to make functional predictions. In addition, transcription levels of SCP/TAPS protein-encoding genes were explored in different developmental stages. RESULTS: A total of 27 genes were identified as belonging to the SCP/TAPS gene family: encoding 26 single-domain proteins each with a single CAP domain and a solitary double-domain protein containing two conserved cysteine-rich secretory protein/antigen 5/pathogenesis related-1 (CAP) domains. Surprisingly, 16 SCP/TAPS predicted proteins formed an extended tandem array spanning a 53 kb region of one genomic region, which was confirmed by MinION long-read sequencing. RNA-seq data indicated that these 16 genes are highly transcribed in all developmental stages (excluding the embryo). CONCLUSIONS: Future work should assess the potential of selected SCP/TAPS proteins as novel targets for the control of L. cuprina and related parasitic flies of major socioeconomic importance

    The small RNA complement of adult Schistosoma haematobium

    BACKGROUND: Blood flukes of the genus Schistosoma cause schistosomiasis, a neglected tropical disease (NTD) that affects more than 200 million people worldwide. Studies of schistosome genomes have improved our understanding of the molecular biology of flatworms, but most of them have focused largely on protein-coding genes. Small non-coding RNAs (sncRNAs) have been explored in selected schistosome species and are suggested to play essential roles in the post-transcriptional regulation of genes, and in modulating flatworm-host interactions. However, genome-wide small RNA data are currently lacking for key schistosomes including Schistosoma haematobium, the causative agent of urogenital schistosomiasis of humans. METHODOLOGY: MicroRNAs (miRNAs) and other sncRNAs of male and female adults of S. haematobium and small RNA transcription levels were explored by deep sequencing, genome mapping and detailed bioinformatic analyses. PRINCIPAL FINDINGS: In total, 89 transcribed miRNAs were identified in S. haematobium, a similar complement to those reported for the congeners S. mansoni and S. japonicum. Of these miRNAs, 34 were novel, with no homologs in other schistosomes. Most miRNAs (n = 64) exhibited sex-biased transcription, suggestive of roles in sexual differentiation, pairing of adult worms and reproductive processes. Of the sncRNAs that were not miRNAs, some related to the spliceosome (n = 21), biogenesis of other RNAs (n = 3) or ribozyme functions (n = 16), whereas most others (n = 3798) were novel ('orphans') with unknown functions. CONCLUSIONS: This study provides the first genome-wide sncRNA resource for S. haematobium, extending earlier studies of schistosomes. The present work should facilitate the future curation and experimental validation of sncRNA functions in schistosomes to enhance our understanding of post-transcriptional gene regulation and of the roles that sncRNAs play in schistosome reproduction, development and parasite-host cross-talk

    Flatworms have lost the right open reading frame kinase 3 gene during evolution

    All multicellular organisms studied to date have three right open reading frame kinase genes (designated riok-1, riok-2 and riok-3). Current evidence indicates that riok-1 and riok-2 have essential roles in ribosome biosynthesis, and that the riok-3 gene assists this process. In the present study, we conducted a detailed bioinformatic analysis of the riok gene family in 25 parasitic flatworms (platyhelminths) for which extensive genomic and transcriptomic data sets are available. We found that none of the flatworms studied have a riok-3 gene, which is unprecedented for multicellular organisms. We propose that, unlike in other eukaryotes, the loss of RIOK-3 from flatworms does not result in an evolutionary disadvantage due to the unique biology and physiology of this phylum. We show that the loss of RIOK-3 coincides with a loss of particular proteins associated with essential cellular pathways linked to cell growth and apoptosis. These findings indicate multiple, key regulatory functions of RIOK-3 in other metazoan species. Taking advantage of a known partial crystal structure of human RIOK-1, molecular modelling revealed variability in nucleotide binding sites between flatworm and human RIOK proteins

    Screening of the 'Stasis Box' identifies two kinase inhibitors under pharmaceutical development with activity against Haemonchus contortus

    BACKGROUND: In partnership with the Medicines for Malaria Venture (MMV), we screened a collection ('Stasis Box') of 400 compounds (which have been in clinical development but have not been approved for illnesses other than neglected infectious diseases) for inhibitory activity against Haemonchus contortus, in order to attempt to repurpose some of the compounds to parasitic nematodes. METHODS: We assessed the inhibitory effect of compounds on the motility and/or development of exsheathed third-stage (xL3s) and fourth-stage (L4) larvae of H. contortus using a whole-organism screening assay. RESULTS: In the primary screen, we identified compound MMV690767 (also known as SNS-032), which inhibited xL3 motility by ~70% at a concentration of 20 μM after 72 h, as well as compound MMV079840 (also known as AG-1295), which induced a coiled xL3 phenotype, with ~50% inhibition of xL3 motility. Subsequently, we showed that SNS-032 (IC50 = 12.4 μM) and AG-1295 (IC50 = 9.92 ± 1.86 μM) inhibited xL3 motility with similar potency. Although neither SNS-032 nor AG-1295 had a detectable inhibitory activity on L4 motility, both compounds inhibited L4 development (IC50 values = 41.24 μM and 7.75 ± 0.94 μM for SNS-032 and AG-1295, respectively). The assessment of the two compounds for toxic effects on normal human breast epithelial (MCF10A) cells revealed that AG-1295 had limited cytotoxicity (IC50 > 100 μM), whereas SNS-032 was quite toxic to the epithelial cells (IC50 = 1.27 μM). CONCLUSIONS: Although the two kinase inhibitors, SNS-032 and AG-1295, had moderate inhibitory activity on the motility or development of xL3s or L4s of H. contortus in vitro, further work needs to be undertaken to chemically alter these entities to achieve the potency and selectivity required for them to become nematocidal or nematostatic candidates