20 research outputs found

    Effort required to finish shotgun-generated genome sequences differs significantly among vertebrates

    Background: The approaches for shotgun-based sequencing of vertebrate genomes are now well-established, and have resulted in the generation of numerous draft whole-genome sequence assemblies. In contrast, the process of refining those assemblies to improve contiguity and increase accuracy (known as 'sequence finishing') remains tedious, labor-intensive, and expensive. As a result, the vast majority of vertebrate genome sequences generated to date remain at a draft stage.
    Results: To date, our genome sequencing efforts have focused on comparative studies of targeted genomic regions, requiring sequence finishing of large blocks of orthologous sequence (average size 0.5-2 Mb) from various subsets of 75 vertebrates. This experience has provided a unique opportunity to compare the relative effort required to finish shotgun-generated genome sequence assemblies from different species, which we report here. Importantly, we found that the sequence assemblies generated for the same orthologous regions from various vertebrates show substantial variation with respect to misassemblies and, in particular, the frequency and characteristics of sequence gaps. As a consequence, the work required to finish different species' sequences varied greatly. Application of the same standardized methods for finishing provided a novel opportunity to "assay" characteristics of genome sequences among many vertebrate species. It is important to note that many of the problems we have encountered during sequence finishing reflect unique architectural features of a particular vertebrate's genome, which in some cases may have important functional and/or evolutionary implications. Finally, based on our analyses, we have been able to improve our procedures to overcome some of these problems and to increase the overall efficiency of the sequence-finishing process, although significant challenges still remain.
    Conclusion: Our findings have important implications for the eventual finishing of the draft whole-genome sequences that have now been generated for a large number of vertebrates.

    Co-evolution of a broadly neutralizing HIV-1 antibody and founder virus

    Current HIV-1 vaccines elicit strain-specific neutralizing antibodies. However, cross-reactive neutralizing antibodies arise in ~20% of HIV-1-infected individuals, and details of their generation could provide a roadmap for effective vaccination. Here we report the isolation, evolution and structure of a broadly neutralizing antibody from an African donor followed from time of infection. The mature antibody, CH103, neutralized ~55% of HIV-1 isolates, and its co-crystal structure with gp120 revealed a novel loop-based mechanism of CD4-binding site recognition. Virus and antibody gene sequencing revealed concomitant virus evolution and antibody maturation. Notably, the CH103-lineage unmutated common ancestor avidly bound the transmitted/founder HIV-1 envelope glycoprotein, and evolution of antibody neutralization breadth was preceded by extensive viral diversification in and near the CH103 epitope. These data elucidate the viral and antibody evolution leading to induction of a lineage of HIV-1 broadly neutralizing antibodies and provide insights into strategies to elicit similar antibodies via vaccination.

    Gene-Based Sequencing Identifies Lipid-Influencing Variants with Ethnicity-Specific Effects in African Americans

    Although a considerable proportion of serum lipid loci identified in European ancestry individuals (EA) replicate in African Americans (AA), interethnic differences in the distribution of serum lipids suggest that some genetic determinants differ by ethnicity. We conducted a comprehensive evaluation of five lipid candidate genes to identify variants with ethnicity-specific effects. We sequenced ABCA1, LCAT, LPL, PON1, and SERPINE1 in 48 AA individuals with extreme serum lipid concentrations (high HDLC/low TG or low HDLC/high TG). Identified variants were genotyped in the full population-based sample of AA (n = 1694) and tested for an association with serum lipids. rs328 (LPL) and correlated variants were associated with higher HDLC and lower TG. Interestingly, a stronger effect was observed on a “European” vs. “African” genetic background at this locus. To investigate this effect, we evaluated the region among West Africans (WA). For TG, the effect size among WA was the same as in AA with only African local ancestry (2–3% lower TG), while the larger association among AA with local European ancestry matched previous reports in EA (10%). For HDLC, there was no association with rs328 in AA with only African local ancestry or in WA, while the association among AA with European local ancestry was much greater than what has been observed for EA (15 vs. ∼5 mg/dl), suggesting an interaction with an environmental or genetic factor that differs by ethnicity. Beyond this ancestry effect, the importance of African ancestry-focused, sequence-based work was also highlighted by serum lipid associations of variants that were in higher frequency (or present only) among those of African ancestry. By beginning our study with the sequence variation present in AA individuals, investigating local ancestry effects, and seeking replication in WA, we were able to comprehensively evaluate the role of a set of candidate genes in serum lipids in AA.

    Targeted resequencing implicates the familial Mediterranean fever gene MEFV and the toll-like receptor 4 gene TLR4 in Behcet disease

    Genome-wide association studies (GWAS) are a powerful means of identifying genes with disease-associated common variants, but they are not well-suited to detecting genes with disease-associated rare and low-frequency variants. In the current study of Behcet disease (BD), nonsynonymous variants (NSVs) identified by deep exonic resequencing of 10 genes found by GWAS (IL10, IL23R, CCR1, STAT4, KLRK1, KLRC1, KLRC2, KLRC3, KLRC4, and ERAP1) and 11 genes selected for their role in innate immunity (IL1B, IL1R1, IL1RN, NLRP3, MEFV, TNFRSF1A, PSTPIP1, CASP1, PYCARD, NOD2, and TLR4) were evaluated for BD association. A differential distribution of the rare and low-frequency NSVs of a gene in 2,461 BD cases compared with 2,458 controls indicated their collective association with disease. By stringent criteria requiring at least a single burden test with study-wide significance and a corroborating test with at least nominal significance, rare and low-frequency NSVs in one GWAS-identified gene, IL23R (P = 6.9 × 10^-5), and one gene involved in innate immunity, TLR4 (P = 8.0 × 10^-4), were associated with BD. In addition, damaging or rare damaging NOD2 variants were nominally significant across all three burden tests applied (P = 0.0063-0.045). Furthermore, carriage of the familial Mediterranean fever gene (MEFV) mutation Met694Val, which is known to cause recessively inherited familial Mediterranean fever, conferred BD risk in the Turkish population (OR, 2.65; P = 1.8 × 10^-12). The disease-associated NSVs in MEFV and TLR4 implicate innate immune and bacterial sensing mechanisms in BD pathogenesis.