64 research outputs found

    Genetic Variation in an Individual Human Exome

    There is much interest in characterizing the variation in a human individual, because this may elucidate what contributes significantly to a person's phenotype, thereby enabling personalized genomics. We focus here on the variants in a person's ‘exome,’ which is the set of exons in a genome, because the exome is believed to harbor much of the functional variation. We provide an analysis of the ∼12,500 variants that affect the protein coding portion of an individual's genome. We identified ∼10,400 nonsynonymous single nucleotide polymorphisms (nsSNPs) in this individual, of which ∼15–20% are rare in the human population. We predict ∼1,500 nsSNPs affect protein function and these tend to be heterozygous, rare, or novel. Of the ∼700 coding indels, approximately half tend to have lengths that are a multiple of three, which causes insertions/deletions of amino acids in the corresponding protein, rather than introducing frameshifts. Coding indels also occur frequently at the termini of genes, so even if an indel causes a frameshift, an alternative start or stop site in the gene can still be used to make a functional protein. In summary, we reduced the set of ∼12,500 nonsilent coding variants by ∼8-fold to a set of variants that are most likely to have major effects on their proteins' functions. This is our first glimpse of an individual's exome and a snapshot of the current state of personalized genomics. The majority of coding variants in this individual are common and appear to be functionally neutral. Our results also indicate that some variants can be used to improve the current NCBI human reference genome. As more genomes are sequenced, many rare variants and non-SNP variants will be discovered. We present an approach to analyze the coding variation in humans by proposing multiple bioinformatic methods to home in on possible functional variation.
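
    The point about indel lengths in the abstract above can be illustrated with a minimal sketch. It assumes only that codons are three bases long, so an indel whose length is a multiple of three adds or removes whole amino acids, while any other length shifts the reading frame. The function name and the example lengths are hypothetical and are not taken from the study.

```python
def classify_coding_indel(length_bp: int) -> str:
    """Label a coding indel by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("indel length must be a positive number of bases")
    # Codons are 3 bases long, so a length divisible by 3 keeps the frame.
    return "in-frame" if length_bp % 3 == 0 else "frameshift"


# Hypothetical indel lengths, for illustration only (not data from the study).
example_lengths = [1, 3, 4, 6, 9]
for n in example_lengths:
    print(n, classify_coding_indel(n))
```

    This check ignores position within the gene; as the abstract notes, a frameshifting indel near a gene terminus may still leave a functional protein.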

    Nanoliter Reactors Improve Multiple Displacement Amplification of Genomes from Single Cells

    Since only a small fraction of environmental bacteria are amenable to laboratory culture, there is great interest in genomic sequencing directly from single cells. Sufficient DNA for sequencing can be obtained from one cell by the Multiple Displacement Amplification (MDA) method, thereby eliminating the need to develop culture methods. Here we used a microfluidic device to isolate individual Escherichia coli and amplify genomic DNA by MDA in 60-nl reactions. Our results confirm a report that reduced MDA reaction volume lowers nonspecific synthesis that can result from contaminant DNA templates and unfavourable interaction between primers. The quality of the genome amplification was assessed by qPCR and compared favourably to single-cell amplifications performed in standard 50-μl volumes. Amplification bias was greatly reduced in nanoliter volumes, thereby providing a more even representation of all sequences. Single-cell amplicons from both microliter and nanoliter volumes provided high-quality sequence data by high-throughput pyrosequencing, thereby demonstrating a straightforward route to sequencing genomes from single cells

    Genomic insights into the Ixodes scapularis tick vector of Lyme disease

    Citation: Gulia-Nuss, M., Nuss, A. B., Meyer, J. M., Sonenshine, D. E., Roe, R. M., Waterhouse, R. M., . . . Hill, C. A. (2016). Genomic insights into the Ixodes scapularis tick vector of Lyme disease. Nature Communications, 7, 13. doi:10.1038/ncomms10507
    Additional Authors: Koren, S.; Hostetler, J. B.; Thiagarajan, M.; Joardar, V. S.; Hannick, L. I.; Bidwell, S.; Hammond, M. P.; Young, S.; Zeng, Q. D.; Abrudan, J. L.; Almeida, F. C.; Ayllon, N.; Bhide, K.; Bissinger, B. W.; Bonzon-Kulichenko, E.; Buckingham, S. D.; Caffrey, D. R.; Caimano, M. J.; Croset, V.; Driscoll, T.; Gilbert, D.; Gillespie, J. J.; Giraldo-Calderon, G. I.; Grabowski, J. M.; Jiang, D.; Khalil, S. M. S.; Kim, D.; Kocan, K. M.; Koci, J.; Kuhn, R. J.; Kurtti, T. J.; Lees, K.; Lang, E. G.; Kennedy, R. C.; Kwon, H.; Perera, R.; Qi, Y. M.; Radolf, J. D.; Sakamoto, J. M.; Sanchez-Gracia, A.; Severo, M. S.; Silverman, N.; Simo, L.; Tojo, M.; Tornador, C.; Van Zee, J. P.; Vazquez, J.; Vieira, F. G.; Villar, M.; Wespiser, A. R.; Yang, Y. L.; Zhu, J. W.; Arensburger, P.; Pietrantonio, P. V.; Barker, S. C.; Shao, R. F.; Zdobnov, E. M.; Hauser, F.; Grimmelikhuijzen, C. J. P.; Park, Y.; Rozas, J.; Benton, R.; Pedra, J. H. F.; Nelson, D. R.; Unger, M. F.; Tubio, J. M. C.; Tu, Z. J.; Robertson, H. M.; Shumway, M.; Sutton, G.; Wortman, J. R.; Lawson, D.; Wikel, S. K.; Nene, V. M.; Fraser, C. M.; Collins, F. H.; Birren, B.; Nelson, K. E.; Caler, E.; Hill, C. A.
    Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick-host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.

    Hybrid assembly with long and short reads improves discovery of gene family expansions

    BACKGROUND: Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. METHODS: We developed a hybrid assembly pipeline called "Alpaca" that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. RESULTS: Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies. CONCLUSION: Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations

    The Diploid Genome Sequence of an Individual Human

    Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these variants involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
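
    The 22% figure in the abstract above can be checked against the variant counts it lists. The sketch below is only an arithmetic check under one assumption: segmental duplications and copy number variation regions are left out because the abstract gives no counts for them. The variable names are illustrative.

```python
# Variant counts as stated in the abstract.
counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

total_events = sum(counts.values())
# Everything that is not a SNP counts as non-SNP variation here.
non_snp_events = total_events - counts["SNPs"]
non_snp_share = non_snp_events / total_events

print(f"total events: {total_events:,}")
print(f"non-SNP events: {non_snp_events:,}")
print(f"non-SNP share: {non_snp_share:.1%}")
```

    Under that assumption the listed counts sum to just over 4.1 million events, of which roughly 22% are non-SNP, which agrees with the totals quoted in the abstract.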

    Genomic innovations, transcriptional plasticity and gene loss underlying the evolution and divergence of two highly polyphagous and invasive Helicoverpa pest species

    BACKGROUND: Helicoverpa armigera and Helicoverpa zea are major caterpillar pests of Old and New World agriculture, respectively. Both, particularly H. armigera, are extremely polyphagous, and H. armigera has developed resistance to many insecticides. Here we use comparative genomics, transcriptomics and resequencing to elucidate the genetic basis for their properties as pests. RESULTS: We find that, prior to their divergence about 1.5 Mya, the H. armigera/H. zea lineage had accumulated up to more than 100 more members of specific detoxification and digestion gene families and more than 100 extra gustatory receptor genes, compared to other lepidopterans with narrower host ranges. The two genomes remain very similar in gene content and order, but H. armigera is more polymorphic overall, and H. zea has lost several detoxification genes, as well as about 50 gustatory receptor genes. It also lacks certain genes and alleles conferring insecticide resistance found in H. armigera. Non-synonymous sites in the expanded gene families above are rapidly diverging, both between paralogues and between orthologues in the two species. Whole genome transcriptomic analyses of H. armigera larvae show widely divergent responses to different host plants, including responses among many of the duplicated detoxification and digestion genes. CONCLUSIONS: The extreme polyphagy of the two heliothines is associated with extensive amplification and neofunctionalisation of genes involved in host finding and use, coupled with versatile transcriptional responses on different hosts. H. armigera's invasion of the Americas in recent years means that hybridisation could generate populations that are both locally adapted and insecticide resistant

    Genomic Insights Into The Ixodes scapularis Tick Vector Of Lyme Disease

    Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retrotransposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick–host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host ‘questing’, prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.
