189 research outputs found

    PoPoolation: A Toolbox for Population Genetic Analysis of Next Generation Sequencing Data from Pooled Individuals

    Recent statistical analyses suggest that sequencing of pooled samples provides a cost-effective approach to determine genome-wide population genetic parameters. Here we introduce PoPoolation, a toolbox specifically designed for the population genetic analysis of sequence data from pooled individuals. PoPoolation calculates estimates of θWatterson, θπ, and Tajima's D that account for the bias introduced by pooling and sequencing errors, as well as divergence between species. Results of genome-wide analyses can be graphically displayed in a sliding window plot. PoPoolation is written in Perl and R and it builds on commonly used data formats. Its source code can be downloaded from http://code.google.com/p/popoolation/. Furthermore, we evaluate the influence of mapping algorithms, sequencing errors, and read coverage on the accuracy of population genetic parameter estimates from pooled data.
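
For readers unfamiliar with the three statistics named in this abstract, the sketch below computes the classical, individual-based versions of Watterson's θ, θπ and Tajima's D from per-site allele counts. It is not PoPoolation's code and it omits the corrections for pooling and sequencing error that the abstract describes; the function names and the input layout (one minor-allele count per segregating site, for a sample of `n` sequences) are assumptions made for illustration.

```python
from math import sqrt


def watterson_theta(num_segregating, n):
    """Watterson's estimator: S divided by the harmonic number a1 = sum_{i=1}^{n-1} 1/i."""
    a1 = sum(1.0 / i for i in range(1, n))
    return num_segregating / a1


def theta_pi(allele_counts, n):
    """Mean number of pairwise differences, summed over segregating sites.

    allele_counts holds, for each segregating site, how many of the n
    sequences carry one of the two alleles (hypothetical input layout).
    """
    pairs = n * (n - 1) / 2.0
    return sum(k * (n - k) for k in allele_counts) / pairs


def tajimas_d(allele_counts, n):
    """Classical Tajima's D (Tajima 1989); returns None when it is undefined."""
    s = len(allele_counts)
    if s == 0 or n < 2:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (theta_pi(allele_counts, n) - s / a1) / sqrt(variance)
```

With four sequences, three singleton sites (`tajimas_d([1, 1, 1], 4)`) give a negative D, while three sites at intermediate frequency (`tajimas_d([2, 2, 2], 4)`) give a positive D, which is the usual direction of the statistic.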

    Predictive models for anti-tubercular molecules using machine learning on high-throughput biological screening datasets

    Background: Tuberculosis is a contagious disease caused by Mycobacterium tuberculosis (Mtb) that affects more than two billion people around the globe and is one of the major causes of morbidity and mortality in the developing world. Recent reports suggest that Mtb has been developing resistance to the widely used anti-tubercular drugs, resulting in the emergence and spread of multidrug-resistant (MDR) and extensively drug-resistant (XDR) strains throughout the world. In view of this global epidemic, there is an urgent need to facilitate fast and efficient lead identification methodologies. Target-based screening of large compound libraries has been widely used as a fast and efficient approach for lead identification, but is restricted by the knowledge about the target structure. Whole-organism screens, on the other hand, are target-agnostic and are now widely employed as an alternative for lead identification, but they are limited by the time and cost involved in running the screens for large compound libraries. This could possibly be circumvented by using computational approaches to prioritize molecules for screening programmes. Results: We utilized physicochemical properties of compounds to train four supervised classifiers (Naïve Bayes, Random Forest, J48 and SMO) on three publicly available bioassay screens of Mtb inhibitors and validated the robustness of the predictive models using various statistical measures. Conclusions: This study is a comprehensive analysis of high-throughput bioassay data for anti-tubercular activity and the application of machine learning approaches to create target-agnostic predictive models for anti-tubercular agents.

    The Formation of the First Massive Black Holes

    Supermassive black holes (SMBHs) are common in local galactic nuclei, and SMBHs as massive as several billion solar masses already exist at redshift z=6. These earliest SMBHs may grow by the combination of radiation-pressure-limited accretion and mergers of stellar-mass seed BHs, left behind by the first generation of metal-free stars, or may be formed by more rapid direct collapse of gas in rare special environments where dense gas can accumulate without first fragmenting into stars. This chapter offers a review of these two competing scenarios, as well as some more exotic alternative ideas. It also briefly discusses how the different models may be distinguished in the future by observations with JWST, (e)LISA and other instruments.
    Comment: 47 pages with 306 references; this review is a chapter in "The First Galaxies - Theoretical Predictions and Observational Clues", Springer Astrophysics and Space Science Library, Eds. T. Wiklind, V. Bromm & B. Mobasher, in press

    Small-molecule inhibition of METTL3 as a strategy against myeloid leukaemia.

    N6-methyladenosine (m6A) is an abundant internal RNA modification [1,2] that is catalysed predominantly by the METTL3-METTL14 methyltransferase complex [3,4]. The m6A methyltransferase METTL3 has been linked to the initiation and maintenance of acute myeloid leukaemia (AML), but the potential of therapeutic applications targeting this enzyme remains unknown [5-7]. Here we present the identification and characterization of STM2457, a highly potent and selective first-in-class catalytic inhibitor of METTL3, and a crystal structure of STM2457 in complex with METTL3-METTL14. Treatment of tumours with STM2457 leads to reduced AML growth and an increase in differentiation and apoptosis. These cellular effects are accompanied by selective reduction of m6A levels on known leukaemogenic mRNAs and a decrease in their expression consistent with a translational defect. We demonstrate that pharmacological inhibition of METTL3 in vivo leads to impaired engraftment and prolonged survival in various mouse models of AML, specifically targeting key stem cell subpopulations of AML. Collectively, these results reveal the inhibition of METTL3 as a potential therapeutic strategy against AML, and provide proof of concept that the targeting of RNA-modifying enzymes represents a promising avenue for anticancer therapy.

    Identification of Novel Pathogenicity Loci in Clostridium perfringens Strains That Cause Avian Necrotic Enteritis

    Type A Clostridium perfringens causes poultry necrotic enteritis (NE), an enteric disease of considerable economic importance, yet can also exist as a member of the normal intestinal microbiota. A recently discovered pore-forming toxin, NetB, is associated with pathogenesis in most, but not all, NE isolates. This finding suggested that NE-causing strains may possess other virulence gene(s) not present in commensal type A isolates. We used high-throughput sequencing (HTS) technologies to generate draft genome sequences of seven unrelated C. perfringens poultry NE isolates and one isolate from a healthy bird, and identified additional novel NE-associated genes by comparison with nine publicly available reference genomes. Thirty-one open reading frames (ORFs) were unique to all NE strains and formed the basis for three highly conserved NE-associated loci that we designated NELoc-1 (42 kb), NELoc-2 (11.2 kb) and NELoc-3 (5.6 kb). The largest locus, NELoc-1, consisted of netB and 36 additional genes, including those predicted to encode two leukocidins, an internalin-like protein and a ricin-domain protein. Pulsed-field gel electrophoresis (PFGE) and Southern blotting revealed that the NE strains each carried 2 to 5 large plasmids, and that NELoc-1 and -3 were localized on distinct plasmids of sizes ∼85 and ∼70 kb, respectively. Sequencing of the regions flanking these loci revealed similarity to previously characterized conjugative plasmids of C. perfringens. These results provide significant insight into the pathogenetic basis of poultry NE and are the first to demonstrate that netB resides in a large, plasmid-encoded locus. Our findings strongly suggest that poultry NE is caused by several novel virulence factors, whose genes are clustered on discrete pathogenicity loci, some of which are plasmid-borne.

    High-throughput 454 resequencing for allele discovery and recombination mapping in Plasmodium falciparum

    Background: Knowledge of the origins, distribution, and inheritance of variation in the malaria parasite (Plasmodium falciparum) genome is crucial for understanding its evolution; however, the 81% (A+T) genome poses challenges to high-throughput sequencing technologies. We explore the viability of the Roche 454 Genome Sequencer FLX (GS FLX) high-throughput sequencing technology for both whole genome sequencing and fine-resolution characterization of genetic exchange in malaria parasites. Results: We present a scheme to survey recombination in the haploid stage genomes of two sibling parasite clones, using whole genome pyrosequencing that includes a sliding window approach to predict recombination breakpoints. Whole genome shotgun (WGS) sequencing generated approximately 2 million reads, with an average read length of approximately 300 bp. De novo assembly using a combination of WGS and 3 kb paired-end libraries resulted in contigs ≤ 34 kb. More than 8,000 of the 24,599 SNP markers identified between parents were genotyped in the progeny, resulting in a marker density of approximately 1 marker/3.3 kb and allowing for the detection of previously unrecognized crossovers (COs) and many non-crossover (NCO) gene conversions throughout the genome. Conclusions: By sequencing the 23 Mb genomes of two haploid progeny clones derived from a genetic cross at more than 30× coverage, we captured high-resolution information on COs, NCOs and genetic variation within the progeny genomes. This study is the first to resequence progeny clones to examine fine structure of COs and NCOs in malaria parasites.
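
The abstract mentions a sliding window approach for predicting recombination breakpoints but gives no detail, so the following is only a generic sketch of that idea and not the authors' scheme. It assumes each SNP marker in a haploid progeny clone has already been assigned to parent "A" or "B", takes a majority call in every window of consecutive markers, and reports the marker interval where the call switches. The function names, the window size and the input layout are all hypothetical.

```python
def window_calls(genotypes, window=5):
    """Majority parental call ("A" or "B") for each full window of markers.

    An odd window size is assumed so that a majority always exists.
    """
    calls = []
    for start in range(len(genotypes) - window + 1):
        chunk = genotypes[start:start + window]
        calls.append("A" if chunk.count("A") * 2 > window else "B")
    return calls


def breakpoints(positions, genotypes, window=5):
    """Return (left, right) marker positions flanking each predicted switch.

    positions are marker coordinates in bp, sorted along one chromosome;
    genotypes are the matching parental assignments.
    """
    calls = window_calls(genotypes, window)
    centre = window // 2  # index of the central marker within a window
    found = []
    for i in range(1, len(calls)):
        if calls[i] != calls[i - 1]:
            # The switch lies between the central markers of the two windows.
            found.append((positions[i - 1 + centre], positions[i + centre]))
    return found


# Hypothetical markers spaced 3.3 kb apart, with one isolated "B" call
# (index 3) that the majority vote ignores.
example_positions = [i * 3300 for i in range(12)]
example_genotypes = list("AAABAAABBBBB")
example_breaks = breakpoints(example_positions, example_genotypes)
```

A majority vote over several markers keeps a single discordant call from being reported as a breakpoint, at the cost of missing exchanges shorter than about half the window, such as short gene conversion tracts.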

    Next-gen sequencing identifies non-coding variation disrupting miRNA-binding sites in neurological disorders

    Understanding the genetic factors underlying neurodevelopmental and neuropsychiatric disorders is a major challenge given their prevalence and potential severity for quality of life. While large-scale genomic screens have made major advances in this area, for many disorders the genetic underpinnings are complex and poorly understood. To date the field has focused predominantly on protein coding variation, but given the importance of tightly controlled gene expression for normal brain development and disorder, variation that affects non-coding regulatory regions of the genome is likely to play an important role in these phenotypes. Herein we show the importance of 3 prime untranslated region (3'UTR) non-coding regulatory variants across neurodevelopmental and neuropsychiatric disorders. We devised a pipeline for identifying and functionally validating putatively pathogenic variants from next generation sequencing (NGS) data. We applied this pipeline to a cohort of children with severe specific language impairment (SLI) and identified a functional, SLI-associated variant affecting gene regulation in cells and post-mortem human brain. This variant and the affected gene (ARHGEF39) represent new putative risk factors for SLI. Furthermore, we identified 3'UTR regulatory variants across autism, schizophrenia and bipolar disorder NGS cohorts demonstrating their impact on neurodevelopmental and neuropsychiatric disorders. Our findings show the importance of investigating non-coding regulatory variants when determining risk factors contributing to neurodevelopmental and neuropsychiatric disorders. In the future, integration of such regulatory variation with protein coding changes will be essential for uncovering the genetic causes of complex neurological disorders and the fundamental mechanisms underlying health and disease.