172 research outputs found

    A uniquely African focus

    Bioinformatics has grown exponentially in the past few years, and collaboration is becoming increasingly important as the field continues to gather momentum.

    Molecular evolution of key receptor genes in primates and non-human primates

    African primates remain an unexplored source of the information required to complete our understanding of the origin and evolution of many human pathogens. Current studies have shown the importance of several human receptor genes implicated in host resistance or susceptibility to tuberculosis (TB). The validation of these genes in Mycobacterium tuberculosis infection makes them an excellent model system for investigating the mode of selective pressures that may act on pathogen defense genes. To trace the evolutionary history of these genes, the report describes preliminary results for eight human receptor genes having either a significant or a possible association with TB. Using a combination of maximum likelihood approaches, evidence of positive selection was detected for four genes. The analysis between species, nevertheless, shows a clear pattern of nucleotide variation mostly compatible with purifying selection. South African Research Chair Initiative (DST); National Research Foundation of South Africa.

    MicroRNA profile of Hermetia illucens (black soldier fly) and its implications on mass rearing

    The growing demands on protein producers and the dwindling available resources have made Hermetia illucens (the black soldier fly, BSF) an economically important species. Insights into the genome of this insect will better allow for robust breeding protocols and more efficient production for use as a replacement for animal feed protein. The use of microRNA to understand how gene regulation allows insect species to adapt to changes in their environment has been established in multiple species. The baseline and life stage expression levels established in this study allow for insight into developmental and sex-linked microRNA regulation in BSF. To accomplish this, microRNA was extracted and sequenced from 15 different libraries, with each life stage in triplicate. Of the 192 microRNAs found in total, 168 were orthologous to known arthropod microRNAs and 24 were unique to BSF. Twenty-six of the 168 microRNAs conserved across arthropods showed statistically significant (p < 0.05) differential expression between the egg and larval stages. The development from larva to pupa was characterized by 16 statistically significant differentially expressed microRNAs. Seven and nine microRNAs were detected as statistically significant between pupa and adult female and between pupa and adult male, respectively. All life stages had a nearly equal split between up- and down-regulated microRNAs. Ten of the 24 unique microRNAs were detected exclusively in one life stage. The egg life stage expressed five microRNAs (hil-miR-m, hil-miR-p, hil-miR-r, hil-miR-s, and hil-miR-u) not seen in any other life stage. The female adult and pupa life stages expressed one microRNA each, hil-miR-h and hil-miR-ac, respectively. Both male and female adult life stages expressed hil-miR-a, hil-miR-b, and hil-miR-y. No unique microRNAs were found only in the larva stage. Twenty-two microRNAs with 56 experimentally validated target genes in the closely related Drosophila melanogaster were identified. Thus, the microRNAs found display the unique evolution of BSF, along with the life stages and potential genes to target for robust mass rearing. Understanding microRNA expression in BSF will further its use in the crucial search for alternative and sustainable protein sources.

    Inferring bona fide transfrags in RNA-Seq derived-transcriptome assemblies of non-model organisms

    Background: De novo transcriptome assembly of short transcribed fragments (transfrags) produced from sequencing-by-synthesis technologies often results in redundant datasets with differing levels of unassembled, partially assembled or mis-assembled transcripts. Post-assembly processing intended to reduce redundancy typically involves reassembly or clustering of assembled sequences. However, these approaches are mostly based on common word heuristics and often create clusters of biologically unrelated sequences, resulting in loss of unique transfrag annotations and propagation of mis-assemblies. Results: Here, we propose a structured framework that consists of a few steps in a pipeline architecture for Inferring Functionally Relevant Assembly-derived Transcripts (IFRAT). IFRAT combines 1) removal of identical subsequences, 2) error-tolerant CDS prediction, 3) identification of coding potential, and 4) BLAST complemented with a multiple domain architecture annotation that reduces non-specific domain annotation. We demonstrate that, independent of the assembler, IFRAT selects bona fide transfrags (with CDS and coding potential) from the transcriptome assembly of a model organism without relying on post-assembly clustering or reassembly. The robustness of IFRAT is inferred on RNA-Seq data of Neurospora crassa assembled using de Bruijn graph-based assemblers, in single (Trinity and Oases-25) and multiple (Oases-Merge and additive or pooled) k-mer modes. Single k-mer assemblies contained fewer transfrags than the multiple k-mer assemblies. However, Trinity identified a number of predicted coding sequences and gene loci comparable to the Oases pooled assembly. IFRAT selects bona fide transfrags representing over 94% of the cumulative BLAST-derived functional annotations of the unfiltered assemblies. Between 4-6% are lost when orphan transfrags are excluded, and this represents only a tiny fraction of the annotation derived from functional transference by sequence similarity. The median length of bona fide transfrags ranged from 1.5 kb (Trinity) to 2 kb (Oases), which is consistent with the average coding sequence length in fungi. The fraction of transfrags that could be associated with gene ontology terms ranged from 33-50%, which is also high for domain-based annotation. We showed that unselected transfrags were mostly truncated and represent sequences from intronic, untranslated (5′ and 3′) regions and non-coding gene loci. Conclusions: IFRAT simplifies post-assembly processing, providing a reference transcriptome enriched with functionally relevant assembly-derived transcripts for non-model organisms. Department of Science and Technology; National Research Foundation; South African Research Chair Initiative.

    A glance at quality score: implication for de novo transcriptome reconstruction of Illumina reads

    Downstream analyses of short reads from next-generation sequencing platforms are often preceded by a pre-processing step that removes uncalled and wrongly called bases. Standard approaches rely on the associated base quality scores to retain a read, or a portion of it, when the score is above a predefined threshold. Without a reference, it is difficult to differentiate sequencing error from biological variation using a quality score. The effects of quality score based trimming have not been systematically studied in de novo transcriptome assembly. Using RNA-Seq data produced from Illumina, we teased out the effects of quality score based filtering or trimming on de novo transcriptome reconstruction. We showed that assemblies produced from reads subjected to different quality score thresholds contain truncated and missing transfrags when compared to those from untrimmed reads. Our data support the view that de novo assembly of untrimmed data is challenging for de Bruijn graph assemblers. However, our results indicate that comparing the assemblies from untrimmed and trimmed read subsets can suggest appropriate filtering parameters and enable selection of the optimum de novo transcriptome assembly in non-model organisms. South African Research Chair Initiative; National Research Foundation of South Africa.

    Careful governance of African biobanks

    The Sydney Statement is one of the first framing documents on the principles for guiding global health security. Framing matters because the funding pool for development assistance for health is finite and has plateaued over the past decade [2,3]. Investments in global health security to prevent future catastrophes are subject to competing health priorities, such as scaling up the “most essential interventions” against ongoing epidemics of preventable morbidity and mortality in mothers, infants, and children in the Global South [4]. Development assistance for health that prioritises global health security could overwhelm or detract attention from multiple competing health priorities.

    Resistance related metabolic pathways for drug target identification in Mycobacterium tuberculosis

    Criteria used to filter high-priority M. tuberculosis drug targets. The genes highlighted in bold satisfied all the selection criteria. The hyphen (−) indicates exclusion from further analysis. Abbreviations used: NUI, not under investigation; PDB, Protein Data Bank; TBSGC, TB Structural Genome Consortium. References: 12, Sassetti et al., 2003; 34, Lamichhane et al., 2003. Data can be viewed in Microsoft Excel. (XLS 12 kb)

    Using multiplex amplicon PCR technology to efficiently and timely generate Rift Valley fever virus sequence data for genomic surveillance

    Rift Valley fever (RVF) is a febrile vector-borne disease that is endemic in Africa and continues to spread into new territories. It is a climate-sensitive disease mostly triggered by abnormal rainfall patterns. The disease is associated with high mortality and morbidity in both humans and livestock. RVF is caused by the Rift Valley fever virus (RVFV) of the genus Phlebovirus in the family Phenuiviridae. It is a tripartite RNA virus with three genomic segments: small (S), medium (M) and large (L). Pathogen genomic sequencing is becoming a routine procedure and a powerful tool for understanding the evolutionary dynamics of infectious organisms, including viruses. Inspired by the utility of amplicon-based sequencing demonstrated in severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) and Ebola, Zika and West Nile viruses, we report an RVFV sample preparation based on amplicon multiplex polymerase chain reaction (amPCR) for template enrichment and reduction of background host contamination. The technology can be implemented rapidly to characterize and genotype RVFV during outbreaks in near real time. To achieve this, we designed 74 multiplex primer sets covering the entire RVFV genome to specifically amplify the nucleic acid of RVFV in clinical samples from animal tissue. Using this approach, we demonstrate complete RVFV genome coverage even from samples containing a relatively low viral load. We report the first primer scheme approach for generating multiplex primer sets for a tripartite virus, which can be replicated for other segmented viruses.

    Evolution and structural analysis of Glossina morsitans (Diptera; Glossinidae) Tetraspanins

    Tetraspanins are important conserved integral membrane proteins expressed in many organisms. Although there is limited knowledge about the full repertoire, evolution and structural characteristics of individual members in various organisms, data obtained so far show that tetraspanins play major roles in membrane biology, visual processing, memory, olfactory signal processing, and mechanosensory antennal inputs. Thus, these proteins are potential targets for control of insect pests. Here, we report that the genome of the tsetse fly, Glossina morsitans (Diptera: Glossinidae), encodes at least seventeen tetraspanins (GmTsps), all containing the signature features found in the tetraspanin superfamily members. Whereas six of the GmTsps have been previously reported, eleven could be classified as novel because their amino acid sequences do not map to characterized tetraspanins in the available protein databases. We present a model of the GmTsps using GmTsp42Ed, whose presence and expression have recently been detected by transcriptomic and proteomic analyses of G. morsitans. Phylogenetically, the identified GmTsps segregate into three major clusters. Structurally, the GmTsps are largely similar to vertebrate tetraspanins. Given that organisms exploit tetraspanins for survival, these proteins are potential novel targets: specific antibodies, recombinant large extracellular loop (LEL) domains, small-molecule mimetics and siRNAs could be directed against them to combat African trypanosomiasis by killing the tsetse fly vector.