41 research outputs found

    Correction: Splice site identification using probabilistic parameters and SVM classification

    BACKGROUND: Recent advances and automation in DNA sequencing technology have created a vast amount of DNA sequence data. This growth of sequence data demands better and more efficient analysis methods. Identifying genes in this newly accumulated data is an important issue in bioinformatics, and it requires the prediction of the complete gene structure. Accurate identification of splice sites in DNA sequences plays a central role in gene structure prediction in eukaryotes. Effective detection of splice sites requires knowledge of the characteristics, dependencies, and relationships of nucleotides in the region surrounding the splice site. A higher-order Markov model is generally regarded as a useful technique for modeling higher-order dependencies. However, its implementation requires estimating a large number of parameters, which is computationally expensive. RESULTS: The proposed method for splice site detection consists of two stages: a first-order Markov model (MM1) is used in the first stage and a support vector machine (SVM) with a polynomial kernel is used in the second stage. The MM1 serves as a pre-processing step for the SVM and takes DNA sequences as its input. It models the compositional features and dependencies of nucleotides around splice site regions in terms of probabilistic parameters. The probabilistic parameters are then fed into the SVM, which combines them nonlinearly to predict splice sites. When the proposed MM1-SVM model is compared with other existing standard splice site detection methods, it shows superior performance in all cases. CONCLUSION: We proposed an effective pre-processing scheme for the SVM and applied it to the identification of splice sites. This is a simple yet effective splice site detection method, which shows better classification accuracy and computational speed than some other, more complex methods.
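The first stage described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `fit_mm1` and `mm1_features`, the pseudocount, and the toy donor-site windows are all assumptions made for illustration. The sketch estimates position-specific first-order Markov probabilities from aligned sequences and turns a candidate window into a vector of probabilistic parameters that a classifier could consume.

```python
ALPHABET = "ACGT"


def fit_mm1(seqs, pseudocount=1.0):
    """Estimate position-specific first-order Markov parameters.

    seqs must be aligned, equal-length strings over ACGT.
    Returns (first, trans): first[a] = P(s_0 = a), and
    trans[i - 1][(p, c)] = P(s_i = c | s_{i-1} = p) for i >= 1.
    The pseudocount is an assumption, used to avoid zero probabilities.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    # Distribution of the first nucleotide.
    first_counts = {a: pseudocount for a in ALPHABET}
    for s in seqs:
        first_counts[s[0]] += 1
    total = sum(first_counts.values())
    first = {a: n / total for a, n in first_counts.items()}

    # One conditional table per position i = 1 .. length - 1.
    trans = []
    for i in range(1, length):
        counts = {(p, c): pseudocount for p in ALPHABET for c in ALPHABET}
        for s in seqs:
            counts[(s[i - 1], s[i])] += 1
        probs = {}
        for p in ALPHABET:
            row_total = sum(counts[(p, c)] for c in ALPHABET)
            for c in ALPHABET:
                probs[(p, c)] = counts[(p, c)] / row_total
        trans.append(probs)
    return first, trans


def mm1_features(seq, model):
    """Map one window to its vector of MM1 probabilities (one per position)."""
    first, trans = model
    feats = [first[seq[0]]]
    for i in range(1, len(seq)):
        feats.append(trans[i - 1][(seq[i - 1], seq[i])])
    return feats


# Toy, made-up donor-site windows; real training data would be far larger.
donors = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGA"]
model = fit_mm1(donors)
feats = mm1_features("AGGTAAGT", model)
```

In a two-stage setup of the kind the abstract describes, feature vectors such as `feats`, computed for true and false sites, would then be passed to an SVM with a polynomial kernel (for example, scikit-learn's `SVC(kernel="poly")`); that second stage is left out here to keep the sketch dependency-free.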

    Fast splice site detection using information content and feature reduction

    Background: Accurate identification of splice sites in DNA sequences plays a key role in the prediction of gene structure in eukaryotes. Many computational methods have already been proposed for the detection of splice sites, and some of them have shown high prediction accuracy. However, most of these methods are limited by their long computation time when applied to whole-genome sequence data. Results: In this paper we propose a hybrid algorithm which combines several effective and informative input features with the state-of-the-art support vector machine (SVM). To obtain the input features we employ an information content method based on Shannon's information theory, Shapiro's score scheme, and Markovian probabilities. We also use a feature elimination scheme to remove the less informative features from the input data. Conclusion: In this study we propose a new feature-based splice site detection method that shows improved acceptor and donor splice site detection in DNA sequences when its performance is compared with various state-of-the-art and well-known methods.
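As a rough illustration of the information content idea mentioned in this abstract, the Python sketch below computes the Shannon information content (in bits) at each position of a set of aligned DNA windows, using the common definition of log2(4) minus the positional entropy. The abstract does not specify the exact feature definition, so this formula, the function name `information_content`, and the toy input are assumptions rather than the authors' method.

```python
import math
from collections import Counter


def information_content(seqs, alphabet="ACGT"):
    """Per-position information content, in bits, of aligned sequences.

    For each column: IC = log2(len(alphabet)) - H, where H is the Shannon
    entropy of the observed nucleotide frequencies in that column.
    Symbols outside the alphabet (e.g. N) are ignored.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    max_bits = math.log2(len(alphabet))
    ic = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        n = sum(counts[a] for a in alphabet)
        entropy = 0.0
        for a in alphabet:
            p = counts[a] / n if n else 0.0
            if p > 0.0:
                entropy -= p * math.log2(p)
        # A column with no usable symbols carries no information.
        ic.append(max_bits - entropy if n else 0.0)
    return ic


# Toy, made-up two-column alignment: a conserved G followed by a variable base.
ic = information_content(["GT", "GT", "GA", "GC"])
```

Positions with high information content are strongly conserved, so a score of this kind can help decide which positions around a candidate site are worth keeping as input features.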

    Minimal information for studies of extracellular vesicles 2018 (MISEV2018): a position statement of the International Society for Extracellular Vesicles and update of the MISEV2014 guidelines

    The last decade has seen a sharp increase in the number of scientific publications describing physiological and pathological functions of extracellular vesicles (EVs), a collective term covering various subtypes of cell-released, membranous structures, called exosomes, microvesicles, microparticles, ectosomes, oncosomes, apoptotic bodies, and many other names. However, specific issues arise when working with these entities, whose size and amount often make them difficult to obtain as relatively pure preparations, and to characterize properly. The International Society for Extracellular Vesicles (ISEV) proposed Minimal Information for Studies of Extracellular Vesicles (“MISEV”) guidelines for the field in 2014. We now update these “MISEV2014” guidelines based on evolution of the collective knowledge in the last four years. An important point to consider is that ascribing a specific function to EVs in general, or to subtypes of EVs, requires reporting of specific information beyond mere description of function in a crude, potentially contaminated, and heterogeneous preparation. For example, claims that exosomes are endowed with exquisite and specific activities remain difficult to support experimentally, given our still limited knowledge of their specific molecular machineries of biogenesis and release, as compared with other biophysically similar EVs. The MISEV2018 guidelines include tables and outlines of suggested protocols and steps to follow to document specific EV-associated functional activities. Finally, a checklist is provided with summaries of key points.

    Splice site identification using probabilistic parameters and SVM classification

    BACKGROUND: Recent advances and automation in DNA sequencing technology have created a vast amount of DNA sequence data. This growth of sequence data demands better and more efficient analysis methods. Identifying genes in this newly accumulated data is an important issue in bioinformatics, and it requires the prediction of the complete gene structure. Accurate identification of splice sites in DNA sequences plays a central role in gene structure prediction in eukaryotes. Effective detection of splice sites requires knowledge of the characteristics, dependencies, and relationships of nucleotides in the region surrounding the splice site. A higher-order Markov model is generally regarded as a useful technique for modeling higher-order dependencies. However, its implementation requires estimating a large number of parameters, which is computationally expensive. RESULTS: The proposed method for splice site detection consists of two stages: a first-order Markov model (MM1) is used in the first stage and a support vector machine (SVM) with a polynomial kernel is used in the second stage. The MM1 serves as a pre-processing step for the SVM and takes DNA sequences as its input. It models the compositional features and dependencies of nucleotides around splice site regions in terms of probabilistic parameters. The probabilistic parameters are then fed into the SVM, which combines them nonlinearly to predict splice sites. When the proposed MM1-SVM model is compared with other existing standard splice site detection methods, it shows superior performance in all cases. CONCLUSION: We proposed an effective pre-processing scheme for the SVM and applied it to the identification of splice sites. This is a simple yet effective splice site detection method, which shows better classification accuracy and computational speed than some other, more complex methods.

    Expanded complement of Niemann-Pick type C2-like protein genes in Clonorchis sinensis suggests functions beyond sterol binding and transport

    BACKGROUND: The parasitic flatworm Clonorchis sinensis inhabits the biliary tree of humans and other piscivorous mammals. This parasite can survive and thrive in the bile duct, despite exposure to bile constituents and host immune attack. Although the precise biological mechanisms underlying this adaptation are unknown, previous work indicated that Niemann-Pick type C2 (NPC2)-like sterol-binding proteins might be integral in the host-parasite interplay. Expansions of this family in some invertebrates, such as arthropods, have shown functional diversification, including novel forms of chemoreception. Thus, here we curated the NPC2-like protein gene complement in C. sinensis, and predicted their conserved and/or divergent functional roles. METHODS: We used an established comparative genomic-bioinformatic approach to curate NPC2-like proteins encoded in published genomes of Korean and Chinese isolates of C. sinensis. Protein sequence and structural homology, presence of conserved domains and phylogeny were used to group and functionally classify NPC2-like proteins. Furthermore, transcription levels of NPC2-like protein-encoding genes were explored in different developmental stages and tissues. RESULTS: Totals of 35 and 32 C. sinensis NPC2-like proteins were predicted to be encoded in the genomes of the Korean and Chinese isolates, respectively. Overall, these proteins had low sequence homology and high variability of sequence alignment coverage when compared with curated NPC2s. Most C. sinensis proteins were predicted to retain a conserved ML domain and a conserved fold conformation, with a large cavity within the protein. Only one protein sequence retained the conserved amino acid residues required in bovine NPC2 to bind cholesterol. Non-canonical C. sinensis NPC2-like protein-coding domains clustered into four distinct phylogenetic groups with members of a group frequently encoded on the same genome scaffolds. Interestingly, NPC2-like protein-encoding genes were predicted to be variably transcribed in different developmental stages and adult tissues, with most being transcribed in the metacercarial stage. CONCLUSIONS: The results of the present investigation confirm an expansion of NPC2-like proteins in C. sinensis, suggesting a diverse array of functions beyond sterol binding and transport. Functional explorations of this protein family should elucidate the mechanisms enabling the establishment and survival of C. sinensis and related flukes in the biliary systems of mammalian hosts.

    Nanopore Sequencing Resolves Elusive Long Tandem-Repeat Regions in Mitochondrial Genomes

    Long non-coding, tandem-repetitive regions in mitochondrial (mt) genomes of many metazoans have been notoriously difficult to characterise accurately using conventional sequencing methods. Here, we show how the use of a third-generation (long-read) sequencing and informatic approach can overcome this problem. We employed Oxford Nanopore technology to sequence genomic DNAs from a pool of adult worms of the carcinogenic parasite, Schistosoma haematobium, and used an informatic workflow to define the complete mt non-coding region(s). Using long-read data of high coverage, we defined six dominant mt genomes of 33.4 kb to 22.6 kb. Although no variation was detected in the order or lengths of the protein-coding genes, there was marked length (18.5 kb to 7.6 kb) and structural variation in the non-coding region, raising questions about the evolution and function of what might be a control region that regulates mt transcription and/or replication. The discovery here of the largest tandem-repetitive, non-coding region (18.5 kb) in a metazoan organism also raises a question about the completeness of some of the mt genomes of animals reported to date, and stimulates further explorations using a Nanopore-informatic workflow.

    Screening of the 'Stasis Box' identifies two kinase inhibitors under pharmaceutical development with activity against Haemonchus contortus

    BACKGROUND: In partnership with the Medicines for Malaria Venture (MMV), we screened a collection ('Stasis Box') of 400 compounds (which have been in clinical development but have not been approved for illnesses other than neglected infectious diseases) for inhibitory activity against Haemonchus contortus, with the aim of repurposing some of the compounds against parasitic nematodes. METHODS: We assessed the inhibitory effect of compounds on the motility and/or development of exsheathed third-stage (xL3s) and fourth-stage (L4) larvae of H. contortus using a whole-organism screening assay. RESULTS: In the primary screen, we identified compound MMV690767 (also known as SNS-032), which inhibited xL3 motility by ~70% at a concentration of 20 μM after 72 h, as well as compound MMV079840 (also known as AG-1295), which induced a coiled xL3 phenotype, with ~50% inhibition of xL3 motility. Subsequently, we showed that SNS-032 (IC50 = 12.4 μM) and AG-1295 (IC50 = 9.92 ± 1.86 μM) had similar potency in inhibiting xL3 motility. Although neither SNS-032 nor AG-1295 had detectable inhibitory activity on L4 motility, both compounds inhibited L4 development (IC50 values = 41.24 μM and 7.75 ± 0.94 μM for SNS-032 and AG-1295, respectively). The assessment of the two compounds for toxic effects on normal human breast epithelial (MCF10A) cells revealed that AG-1295 had limited cytotoxicity (IC50 > 100 μM), whereas SNS-032 was quite toxic to the epithelial cells (IC50 = 1.27 μM). CONCLUSIONS: Although the two kinase inhibitors, SNS-032 and AG-1295, had moderate inhibitory activity on the motility or development of xL3s or L4s of H. contortus in vitro, further work needs to be undertaken to chemically alter these entities to achieve the potency and selectivity required for them to become nematocidal or nematostatic candidates.

    Synthetic Kavalactone Analogues with Increased Potency and Selective Anthelmintic Activity against Larvae of Haemonchus contortus In Vitro

    Kava extract, an aqueous rhizome emulsion of the plant Piper methysticum, has been used for centuries by Pacific Islanders as a ceremonial beverage, and has been sold as an anxiolytic agent for some decades. Kavalactones are a major constituent of kava extract. In a previous investigation, we had identified three kavalactones that inhibit larval development of Haemonchus contortus in an in vitro bioassay. In the present study, we synthesized two kavalactones, desmethoxyyangonin and yangonin, as well as 17 analogues thereof, and evaluated their anthelmintic activities using the same bioassay as employed previously. Structure-activity relationship (SAR) studies showed that a 4-substituent on the pendant aryl ring was required for activity. In particular, compounds with 4-trifluoromethoxy, 4-difluoromethoxy, 4-phenoxy, and 4-N-morpholine substitutions had anthelmintic activities (IC50 values in the range of 1.9 to 8.9 µM) that were greater than those of either of the parent natural products, desmethoxyyangonin (IC50 of 37.1 µM) and yangonin (IC50 of 15.0 µM). The synthesized analogues did not exhibit toxicity to HepG2 human hepatoma cells in vitro at concentrations of up to 40 µM. These findings confirm the previously identified kavalactone scaffold as a promising chemotype for new anthelmintics and provide a basis for a detailed SAR investigation focused on developing a novel anthelmintic agent.

    Natural Compounds from the Marine Brown Alga Caulocystis cephalornithos with Potent In Vitro-Activity against the Parasitic Nematode Haemonchus contortus

    Eight secondary metabolites (1 to 8) were isolated from a marine sponge, a marine alga and three terrestrial plants collected in Australia and subsequently chemically characterised. Here, these natural product-derived compounds were screened for in vitro anthelmintic activity against the larvae and adult stages of Haemonchus contortus (barber's pole worm), a highly pathogenic parasitic nematode of ruminants. Using an optimised, whole-organism screening system, compounds were tested on exsheathed third-stage larvae (xL3s) and fourth-stage larvae (L4s). Anthelmintic activity was initially evaluated on these stages based on the inhibition of motility, development and/or changes in morphology (phenotype). We identified two compounds, 6-undecylsalicylic acid (3) and 6-tridecylsalicylic acid (4), isolated from the marine brown alga Caulocystis cephalornithos, with inhibitory effects on xL3 and L4 motility and larval development, and the induction of a "skinny-straight" phenotype. Subsequent testing showed that these two compounds had an acute nematocidal effect (within 1-12 h) on adult males and females of H. contortus. Ultrastructural analysis of adult worms treated with compound 4 revealed significant damage to subcuticular musculature and associated tissues and cellular organelles including mitochondria. In conclusion, the present study has discovered two algal compounds possessing acute anthelmintic effects and with potential for hit-to-lead progression. Future work should focus on undertaking a structure-activity relationship study and on elucidating the mode(s) of action of optimised compounds.