19 research outputs found

    Genome-guided methods for discovering new natural product from fungi

    For decades, fungi have been an important source of medically relevant natural products (NPs). Recent advances in DNA sequencing have revealed that the biosynthetic potential of fungal genomes is much deeper than previously realized. Difficulties in culturing and genetically engineering many fungi, combined with the fact that many NP biosynthetic gene clusters (BGCs) are not expressed under standard laboratory conditions, have left much of this biosynthetic potential untapped. Here we describe the realization of a pipeline based in S. cerevisiae encompassing bioinformatic tools for BGC curation, genetic parts for BGC refactoring, and improved DNA assembly for BGC building. With this pipeline, we have successfully detected novel NPs from several previously unstudied fungal BGCs, and have structurally characterized a subset of the BGC-associated compounds. We also developed activity-guided methods to discover natural products of new function, and validated the biological activity using higher-order model systems. Our pipeline demonstrates how high-throughput synthetic biology tools can facilitate the rapid discovery of complex chemical scaffolds of potential pharmaceutical relevance and their production in model fungal hosts

    Evolution of chemical diversity by coordinated gene swaps in type II polyketide gene clusters

    Natural product biosynthetic pathways generate molecules of enormous structural complexity and exquisitely tuned biological activities. Studies of natural products have led to the discovery of many pharmaceutical agents, particularly antibiotics. Attempts to harness the catalytic prowess of biosynthetic enzyme systems, for both compound discovery and engineering, have been limited by a poor understanding of the evolution of the underlying gene clusters. We developed an approach to study the evolution of biosynthetic genes on a cluster-wide scale, integrating pairwise gene coevolution information with large-scale phylogenetic analysis. We used this method to infer the evolution of type II polyketide gene clusters, tracing the path of evolution from the single ancestor to those gene clusters surviving today. We identified 10 key gene types in these clusters, most of which were swapped in from existing cellular processes and subsequently specialized. The ancestral type II polyketide gene cluster likely comprised a core set of five genes, a roster that expanded and contracted throughout evolution. A key C24 ancestor diversified into major classes of longer and shorter chain length systems, from which a C20 ancestor gave rise to the majority of characterized type II polyketide antibiotics. Our findings reveal that (i) type II polyketide structure is predictable from its gene roster, (ii) only certain gene combinations are compatible, and (iii) gene swaps were likely a key to evolution of chemical diversity. The lessons learned about how natural selection drives polyketide chemical innovation can be applied to the rational design and guided discovery of chemicals with desired structures and properties

    Systematic analysis of genome-wide fitness data in yeast reveals novel gene function and drug action

    The relationship between co-fitness and co-inhibition of genes in chemicogenomic yeast screens provides insights into gene function and drug target prediction

    Differential gene expression in abdomens of the malaria vector mosquito, Anopheles gambiae, after sugar feeding, blood feeding and Plasmodium berghei infection

    BACKGROUND: Large scale sequencing of cDNA libraries can provide profiles of genes expressed in an organism under defined biological and environmental circumstances. We have analyzed sequences of 4541 Expressed Sequence Tags (ESTs) from 3 different cDNA libraries created from abdomens from Plasmodium infection-susceptible adult female Anopheles gambiae. These libraries were made from sugar fed (S), rat blood fed (RB), and P. berghei-infected (IRB) mosquitoes at 30 hours after the blood meal, when most parasites would be transforming ookinetes or very early oocysts. RESULTS: The S, RB and IRB libraries contained 1727, 1145 and 1669 high quality ESTs, respectively, averaging 455 nucleotides (nt) in length. They assembled into 1975 consensus sequences – 567 contigs and 1408 singletons. Functional annotation was performed to assign probable molecular functions to the gene products and to identify the biological processes in which they function. Genes represented at high frequency in one or more of the libraries were subjected to digital Northern analysis, and the expression results for 5 of them were verified by qRT-PCR. CONCLUSION: 13% of the 1965 ESTs showing identity to the A. gambiae genome sequence represent novel genes. These, together with untranslated regions (UTR) present on many of the ESTs, will inform further genome annotation. We have identified 23 genes encoding products likely to be involved in regulating the cellular oxidative environment and 25 insect immunity genes. We also identified 25 genes as being up- or down-regulated following blood feeding and/or feeding with P. berghei-infected blood relative to their expression levels in sugar fed females

    Update of the Anopheles gambiae PEST genome assembly

    BACKGROUND: The genome of Anopheles gambiae, the major vector of malaria, was sequenced and assembled in 2002. This initial genome assembly and analysis made available to the scientific community was complicated by the presence of assembly issues, such as scaffolds with no chromosomal location, no sequence data for the Y chromosome, haplotype polymorphisms resulting in two different genome assemblies in limited regions, and contaminating bacterial DNA. RESULTS: Polytene chromosome in situ hybridization with cDNA clones was used to place 15 unmapped scaffolds (sizes totaling 5.34 Mbp) in the pericentromeric regions of the chromosomes and to orient a further 9 scaffolds. Additional analysis by in situ hybridization of bacterial artificial chromosome (BAC) clones placed 1.32 Mbp (5 scaffolds) in the physical gaps between scaffolds on euchromatic parts of the chromosomes. The Y chromosome sequence information (0.18 Mbp) remains highly incomplete and fragmented among 55 short scaffolds. Analysis of BAC end sequences showed that 22 inter-scaffold gaps were spanned by BAC clones. Unmapped scaffolds were also aligned to the chromosome assemblies in silico, identifying regions totaling 8.18 Mbp (144 scaffolds) that are probably represented in the genome project by two alternative assemblies. An additional 3.53 Mbp of alternative assembly was identified within mapped scaffolds. Scaffolds comprising 1.97 Mbp (679 small scaffolds) were identified as probably derived from contaminating bacterial DNA. In total, about 33% of previously unmapped sequences were placed on the chromosomes. CONCLUSION: This study has used new approaches to improve the physical map and assembly of the A. gambiae genome

    Comparative Analysis of the Global Transcriptome of Anopheles funestus from Mali, West Africa

    Background: Anopheles funestus is a principal vector of malaria across much of tropical Africa and is considered one of the most efficient of its kind, yet studies of this species have lagged behind those of its broadly sympatric congener, An. gambiae. In aid of future genomic sequencing of An. funestus, we explored the whole body transcriptome, derived from mixed stage progeny of wild-caught females from Mali, West Africa. Principal Findings: Here we report the functional annotation and comparative genomics of 2,005 expressed sequence tags (ESTs) from An. funestus, which were assembled with a previous EST set from adult female salivary glands from the same mosquito. The assembled ESTs provided for a nonredundant catalog of 1,035 transcripts excluding mitochondrial sequences. Conclusions/Significance: Comparison of the An. funestus and An. gambiae transcriptomes using computational and macroarray approaches revealed a high degree of sequence identity despite an estimated 20–80 MY divergence time between lineages. A phylogenetically broader comparative genomic analysis indicated that the most rapidly evolving proteins – those involved in immunity, hematophagy, formation of extracellular structures, and hypothetical conserved proteins – are those that probably play important roles in how mosquitoes adapt to their nutritional and externa

    Identifying relationships between genes and small molecules, from yeast to humans

    Small molecules such as metabolites, signaling molecules, and disease-treating drugs are essential for life; they have even been called the "missing link in the central dogma" of biology. There has been a recent explosion of data measuring whole-genome responses to small molecule perturbations in many organisms, but there has been a lack of research combining bioinformatics and chemical informatics to elucidate gene-drug relationships from these data. This thesis describes computational methods for finding and characterizing both functional (indirect) and physical (binding) interactions between small molecules and genes or proteins. In one project, we analyzed ∼6,000 single-gene deletion strains in yeast, grown in the presence of several hundred small molecule treatments. Interestingly, nearly all gene deletion strains revealed a defective growth phenotype in some condition, suggesting that nearly all genes are required for growth, and genetic redundancy is limited. We also identified a large set of multi-drug resistance genes, which were surprisingly involved primarily in membrane trafficking. In a second project, we refined predictions of functional interactions in this dataset into predictions of physical interactions, utilizing the chemical structures of the compounds and features of the genes in the phenotypic assay. We found that incorporating knowledge of functional interactions improved the predictions of physical interactions. We predicted novel binding interactions and found external evidence for predictions that certain FDA-approved psychoactive compounds may have a secondary target, Cox17. Finally, using publicly-available small molecule screening data from humans and other organisms, we learned small molecule binding sites on proteins, using only 1-dimensional protein sequence and small molecule structure. This allows inclusion of any sequenced protein, the vast majority of which do not have solved 3-dimensional structures, and it identifies the actual sites of interaction. In all three projects, the learned interactions between genes and small molecules reveal gene functions and small molecule targets, and they should improve understanding of both basic biology and drug discovery

    Mechanisms of Haploinsufficiency Revealed by Genome-Wide Profiling in Yeast

    Haploinsufficiency is defined as a dominant phenotype in diploid organisms that are heterozygous for a loss-of-function allele. Despite its relevance to human disease, neither the extent of haploinsufficiency nor its precise molecular mechanisms are well understood. We used the complete set of Saccharomyces cerevisiae heterozygous deletion strains to survey the genome for haploinsufficiency via fitness profiling in rich (YPD) and minimal media to identify all genes that confer a haploinsufficient growth defect. This assay revealed that ∼3% of all ∼5900 genes tested are haploinsufficient for growth in YPD. This class of genes is functionally enriched for metabolic processes carried out by molecular complexes such as the ribosome. Much of the haploinsufficiency in YPD is alleviated by slowing the growth rate of each strain in minimal media, suggesting that certain gene products are rate limiting for growth only in YPD. Overall, our results suggest that the primary mechanism of haploinsufficiency in yeast is due to insufficient protein production. We discuss the relevance of our findings in yeast to human haploinsufficiency disorders