19 research outputs found

    Mutation discovery in mice by whole exome sequencing

    We report the development and optimization of reagents for in-solution, hybridization-based capture of the mouse exome. By validating this approach in multiple inbred strains and in novel mutant strains, we show that whole exome sequencing is a robust approach for discovery of putative mutations, irrespective of strain background. We found strong candidate mutations for the majority of mutant exomes sequenced, including new models of orofacial clefting, urogenital dysmorphology, kyphosis and autoimmune hepatitis.

    Exploiting Nucleotide Composition to Engineer Promoters

    The choice of promoter is a critical step in optimizing the efficiency and stability of recombinant protein production in mammalian cell lines. Artificial promoters that provide stable expression across cell lines and can be designed to the desired strength constitute an alternative to the use of viral promoters. Here, we show how the nucleotide characteristics of highly active human promoters can be modelled via the genome-wide frequency distribution of short motifs: by overlapping motifs that occur infrequently in the genome, we constructed contiguous sequence that is rich in GC and CpGs, both features of known promoters, but lacking homology to real promoters. We show that snippets from this sequence, at 100 base pairs or longer, drive gene expression in vitro in a number of mammalian cells, and are thus candidates for use in protein production. We further show that expression is driven by the general transcription factors TFIIB and TFIID, both being ubiquitously present across cell types, which results in less tissue- and species-specific regulation compared to the viral promoter SV40. We lastly found that the strength of a promoter can be tuned up and down by modulating the counts of GC and CpGs in localized regions. These results constitute a “proof-of-concept” for custom-designing promoters that are suitable for biotechnological and medical applications

    Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis

    Ustilago maydis is a ubiquitous pathogen of maize and a well-established model organism for the study of plant-microbe interactions. This basidiomycete fungus does not use aggressive virulence strategies to kill its host. U. maydis belongs to the group of biotrophic parasites (the smuts) that depend on living tissue for proliferation and development. Here we report the genome sequence for a member of this economically important group of biotrophic fungi. The 20.5-million-base U. maydis genome assembly contains 6,902 predicted protein-encoding genes and lacks pathogenicity signatures found in the genomes of aggressive pathogenic fungi, for example a battery of cell-wall-degrading enzymes. However, we detected unexpected genomic features responsible for the pathogenicity of this organism. Specifically, we found 12 clusters of genes encoding small secreted proteins with unknown function. A significant fraction of these genes exists in small gene families. Expression analysis showed that most of the genes contained in these clusters are regulated together and induced in infected tissue. Deletion of individual clusters altered the virulence of U. maydis in five cases, ranging from a complete lack of symptoms to hypervirulence. Despite years of research into the mechanism of pathogenicity in U. maydis, no 'true' virulence factors had been previously identified. Thus, the discovery of the secreted protein gene clusters and the functional demonstration of their decisive role in the infection process illuminate previously unknown mechanisms of pathogenicity operating in biotrophic fungi. Genomic analysis is, similarly, likely to open up new avenues for the discovery of virulence determinants in other pathogens. ©2006 Nature Publishing Group.
J.K., M.B. and R.K. thank G. Sawers and U. Kämper for critical reading of the manuscript.
The genome sequencing of Ustilago maydis strain 521 is part of the fungal genome initiative and was funded by the National Human Genome Research Institute (USA) and BayerCropScience AG (Germany). F.B. was supported by a grant from the National Institutes of Health (USA). J.K. and R.K. thank the German Ministry of Education and Science (BMBF) for financing the DNA array setup and the Max Planck Society for their support of the manual genome annotation. B.J.S. was supported by the Natural Sciences and Engineering Research Council of Canada and the Canada Foundation for Innovation, J.W.K. received funding from the Natural Sciences and Engineering Research Council of Canada, J.R.-H. received funding from CONACYT, México, A.M.-M. was supported by a fellowship from the Humboldt Foundation, and L.M. was supported by an EU grant.
Author Contributions: All authors were involved in planning and executing the genome sequencing project. B.W.B., J.G., L.-J.M., E.W.M., D.D., C.M.W., J.B., S.Y., D.B.J., S.C., C.N., E.K., G.F., P.H.S., I.H.-H., M. Vaupel, H.V., T.S., J.M., D.P., C.S., A.G., F.C. and V. Vysotskaia contributed to the three independent sequencing projects; M.M., G.M., U.G., D.H., M.O. and H.-W.M. were responsible for gene model refinement, database design and database maintenance; G.M., J. Kämper, R.K., G.S., M. Feldbrügge, J.S., C.W.B., U.F., M.B., B.S., B.J.S., M.J.C., E.C.H.H., S.M., F.B., J.W.K., K.J.B., J. Klose, S.E.G., S.J.K., M.H.P., H.A.B.W., R.deV., H.J.D., J.R.-H., C.G.R.-P., L.O.-C., M.McC., K.S., J.P.-M., J.I.I., W.H., P.G., P.S.-A., M. Farman, J.E.S., R.S., J.M.G.-P., J.C.K., W.L. and D.H. were involved in functional annotation and interpretation; T.B., O.M., L.M., A.M.-M., D.G., K.M., N.R., V. Vincon, M. Vraneš, M.S. and O.L. performed experiments. J. Kämper, R.K. and M.B. wrote and edited the paper with input from L.-J.M., J.G., F.B., J.W.K., B.J.S. and S.E.G.
Individual contributions of authors can be found as Supplementary Notes.

    A High-Resolution Map of Human Evolutionary Constraint Using 29 Mammals

    The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ~4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ~60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.
    National Human Genome Research Institute (U.S.); National Institute of General Medical Sciences (U.S.) (Grant number GM82901); National Science Foundation (U.S.). Postdoctoral Fellowship (Award 0905968); National Science Foundation (U.S.). Career (0644282); National Institutes of Health (U.S.) (R01-HG004037); Alfred P. Sloan Foundation; Austrian Science Fund. Erwin Schrodinger Fellowshi

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage

    Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome [1], mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome [2,3]. It is also enriched in segmental duplications, ranking third in density among the autosomes [4]. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution [5,6], the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.