22 research outputs found

Highly Sensitive and Specific Detection of Rare Variants in Mixed Viral Populations from Massively Parallel Sequence Data

Author: Alexander R. Macalalad
Bruce W. Birren
C Hedskog
C Hoffmann
C Quince
C Wang
Christian L. Boutwell
Christine M. Malboeuf
D Altshuler
Doug E. Brackney
DR Bentley
Elizabeth M. Ryan
G Rozera
Gregory D. Ebel
H Li
HF Gunthard
I Astrovskaya
J Archer
JF Salazar-Gonzalez
Joshua Z. Levin
Karen A. Power
Kendra N. Pesko
LQ Zhang
M Margulies
MA DePristo
Matthew R. Henn
MCF Prosperi
Michael C. Zody
MR Henn
N Eriksson
N Malhis
Niall J. Lennon
O Zagordi
Patrick Charlebois
R Goya
R Li
Ruchi M. Newman
S Palmer
Sergei L. Kosakovsky Pond
T Zhu
Todd M. Allen
VF Boltz
W Brockman
Publication venue: Public Library of Science
Publication date: 15/03/2012

Viruses diversify over time within hosts, often undercutting the effectiveness of host defenses and therapeutic interventions. To design successful vaccines and therapeutics, it is critical to better understand viral diversification, including comprehensively characterizing the genetic variants in viral intra-host populations and modeling changes from transmission through the course of infection. Massively parallel sequencing technologies can overcome the cost constraints of older sequencing methods and obtain the high sequence coverage needed to detect rare genetic variants (<1%) within an infected host, and to assay variants without prior knowledge. Critical to interpreting deep sequence data sets is the ability to distinguish biological variants from process errors with high sensitivity and specificity. To address this challenge, we describe V-Phaser, an algorithm able to recognize rare biological variants in mixed populations. V-Phaser uses covariation (i.e., phasing) between observed variants to increase sensitivity and an expectation maximization algorithm that iteratively recalibrates base quality scores to increase specificity. Overall, V-Phaser achieved >97% sensitivity and >97% specificity on control read sets. On data derived from a patient after four years of HIV-1 infection, V-Phaser detected 2,015 variants across the ∼10 kb genome, including 603 rare variants (<1% frequency) detected only using phase information. V-Phaser identified variants at frequencies down to 0.2%, comparable to the detection threshold of allele-specific PCR, a method that requires prior knowledge of the variants. The high sensitivity and specificity of V-Phaser enable identifying and tracking changes in low frequency variants in mixed populations such as RNA viruses.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Whole Genome Deep Sequencing of HIV-1 Reveals the Impact of Early Minor Variants Upon Immune Recognition During Acute Infection

Author: Allen Todd M.
Altfeld Marcus
Axten Karen L.
Battis Laura
Bazner Suzane
Berical Andrew
Berlin Aaron M.
Birren Bruce W.
Bloom Allyson K.
Boutwell Christian L.
Brander Christian
Brumme Chanson J.
Brumme Zabrina L.
Casali Monica
Charlebois Patrick
Dudek Tim
Erlich Rachel L.
Gasser Olivier
Gladden Adrianne D.
Gnerre Sante
Green Lisa M.
Gujja Sharvari
Günthard Huldrych F.
Henn Matthew R.
Hess Christoph
Jessen Heiko
Kemper Michael
Lennon Niall J.
Levin Joshua Z.
Macalalad Alexander R.
Malboeuf Christine M.
Mayer Ken H.
Newman Ruchi
Pereyra Florencia
Power Karen A.
Rosenberg Eric
Ryan Elizabeth M.
Rychert Jenna
Shea Terrance P.
Streeck Hendrik
Tinsley Jake P.
Tully Damien
Walker Bruce D.
Wang Yaoyu
Young Sarah K.
Zedlack Carmen
Zeng Qiandong
Zody Michael C.
Publication venue: Public Library of Science
Publication date: 01/01/2012

Deep sequencing technologies have the potential to transform the study of highly variable viral pathogens by providing a rapid and cost-effective approach to sensitively characterize rapidly evolving viral quasispecies. Here, we report on a high-throughput whole HIV-1 genome deep sequencing platform that combines 454 pyrosequencing with novel assembly and variant detection algorithms. In one subject we combined these genetic data with detailed immunological analyses to comprehensively evaluate viral evolution and immune escape during the acute phase of HIV-1 infection. The majority of early, low frequency mutations represented viral adaptation to host CD8+ T cell responses, evidence of strong immune selection pressure occurring during the early decline from peak viremia. CD8+ T cell responses capable of recognizing these low frequency escape variants coincided with the selection and evolution of more effective secondary HLA-anchor escape mutations. Frequent, and in some cases rapid, reversion of transmitted mutations was also observed across the viral genome. When located within restricted CD8 epitopes these low frequency reverting mutations were sufficient to prime de novo responses to these epitopes, again illustrating the capacity of the immune response to recognize and respond to low frequency variants. More importantly, rapid viral escape from the most immunodominant CD8+ T cell responses coincided with plateauing of the initial viral load decline in this subject, suggestive of a potential link between maintenance of effective, dominant CD8 responses and the degree of early viremia reduction. We conclude that the early control of HIV-1 replication by immunodominant CD8+ T cell responses may be substantially influenced by rapid, low frequency viral adaptations not detected by conventional sequencing approaches, which warrants further investigation. 
These data support the critical need for vaccine-induced CD8+ T cell responses to target more highly constrained regions of the virus in order to ensure the maintenance of immunodominant CD8 responses and the sustained decline of early viremia.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

edoc

Directory of Open Access Journals

PubMed Central

ZORA

FigShare

Phase information increased sensitivity, and base quality scores increased specificity.

Author: Alexander R. Macalalad (177162)
Bruce W. Birren (147656)
Christian L. Boutwell (177170)
Christine M. Malboeuf (177167)
Doug E. Brackney (177174)
Elizabeth M. Ryan (177168)
Gregory D. Ebel (177185)
Joshua Z. Levin (177182)
Karen A. Power (177172)
Kendra N. Pesko (177178)
Matthew R. Henn (103220)
Michael C. Zody (155402)
Niall J. Lennon (177164)
Patrick Charlebois (177163)
Ruchi M. Newman (177165)
Todd M. Allen (177189)

We compared V-Phaser to alternate versions of V-Phaser with specific components disabled. In the No Phase version, V-Phaser called variants without phase information. In the Uniform Errors version, V-Phaser estimated uniform error rates within homopolymer and nonhomopolymer regions without regard to assigned base qualities. In the No Filtering version, V-Phaser did not filter out low quality bases. (A) Phase information increased sensitivity. The version without phase information attained a sensitivity of 90%, but all other versions of V-Phaser used phase information and attained a sensitivity of 97% or more. We calculated sensitivity as the percentage of known variants correctly identified. Data are from the WNV mixed population control dataset. (B) Individual base quality scores increased specificity. Among loci with mismatches, the Uniform Errors version had only 91% specificity, but all other versions incorporated base quality scores in their probability model and attained 97% specificity or more. We calculated specificity as the percentage of loci in the control sample correctly identified as having no variants among loci that had at least one candidate variant. Data are from the infectious clone (HIV NL4-3) control dataset.
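The two definitions in this caption can be written out as a short sketch. The function names, the representation of variants as (position, base) tuples, and the toy inputs are assumptions made for illustration; they are not taken from V-Phaser itself.

```python
def sensitivity(known_variants, called_variants):
    """Percentage of known variants that the caller identified."""
    known = set(known_variants)
    if not known:
        raise ValueError("no known variants supplied")
    return 100.0 * len(known & set(called_variants)) / len(known)


def specificity(candidate_loci, called_loci):
    """Percentage of candidate loci correctly reported as invariant.

    candidate_loci: loci in a variant-free control that show at least one
    mismatch (that is, at least one candidate variant).
    called_loci: loci where the caller reported a variant. Because the
    control is assumed to contain no true variants, each is a false positive.
    """
    candidates = set(candidate_loci)
    if not candidates:
        raise ValueError("no candidate loci supplied")
    false_positives = candidates & set(called_loci)
    return 100.0 * (len(candidates) - len(false_positives)) / len(candidates)


# Hypothetical toy data: four known variants, three of them recovered.
known = {(101, "A"), (250, "G"), (377, "T"), (410, "C")}
called = {(101, "A"), (250, "G"), (377, "T"), (900, "A")}
print(sensitivity(known, called))

# Hypothetical control: 100 loci with mismatches, 3 wrongly called as variant.
print(specificity(range(1, 101), {5, 17, 42}))
```

Note that the denominator for specificity is restricted to loci with at least one candidate variant, as the caption states, rather than every locus in the genome.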

FigShare

Error rates were not uniformly distributed.

Author: Alexander R. Macalalad (177162)
Bruce W. Birren (147656)
Christian L. Boutwell (177170)
Christine M. Malboeuf (177167)
Doug E. Brackney (177174)
Elizabeth M. Ryan (177168)
Gregory D. Ebel (177185)
Joshua Z. Levin (177182)
Karen A. Power (177172)
Kendra N. Pesko (177178)
Matthew R. Henn (103220)
Michael C. Zody (155402)
Niall J. Lennon (177164)
Patrick Charlebois (177163)
Ruchi M. Newman (177165)
Todd M. Allen (177189)

Error rates varied by (A) read position, (B) base transition, and (C) base quality score. We counted as errors any mismatches to the consensus assembly for each of the two runs in the control read set under the assumption that the NL4-3 infectious clone had no diversity. We defined the read position relative to the beginning or end of the read, whichever was closer. We defined a base transition as a dinucleotide representing the transition from the preceding base to the current base, and we scored a transition as an error if the current base was a mismatch. Base quality scores came from the sequencing process.
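A minimal sketch of the two tallies defined in this caption follows. It assumes a read and its consensus are supplied as equal-length, gap-free strings and that the "preceding base" is taken from the read; both are simplifying assumptions, and the function names are hypothetical.

```python
from collections import Counter


def end_relative_position(index, read_length):
    """0-based distance from whichever end of the read is closer."""
    return min(index, read_length - 1 - index)


def tally_errors(read, consensus):
    """Count mismatches by end-relative position and by dinucleotide transition.

    Assumes `read` and `consensus` are aligned without gaps (real
    alignments contain insertions and deletions, which this ignores).
    """
    if len(read) != len(consensus):
        raise ValueError("read and consensus must have equal length")
    by_position = Counter()
    by_transition = Counter()
    for i, (base, ref) in enumerate(zip(read, consensus)):
        if base == ref:
            continue  # not an error
        by_position[end_relative_position(i, len(read))] += 1
        if i > 0:
            # Transition = preceding read base followed by the mismatched base.
            by_transition[read[i - 1] + base] += 1
    return by_position, by_transition


# Hypothetical read with mismatches at indices 3 and 7.
positions, transitions = tally_errors("ACGTTGCA", "ACGATGCT")
print(dict(positions))
print(dict(transitions))
```

Dividing each tally by the number of bases observed at that position or after that preceding base would turn the counts into error rates.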

FigShare

NQS filtering improves fit of probability model to data.

Author: Alexander R. Macalalad (177162)
Bruce W. Birren (147656)
Christian L. Boutwell (177170)
Christine M. Malboeuf (177167)
Doug E. Brackney (177174)
Elizabeth M. Ryan (177168)
Gregory D. Ebel (177185)
Joshua Z. Levin (177182)
Karen A. Power (177172)
Kendra N. Pesko (177178)
Matthew R. Henn (103220)
Michael C. Zody (155402)
Niall J. Lennon (177164)
Patrick Charlebois (177163)
Ruchi M. Newman (177165)
Todd M. Allen (177189)

(A) Quantile-quantile (q-q) plots under NQS filtering show good fit of the probability model to the observed distribution of errors. Since the probability model is discrete, p values are projected onto a uniform distribution, and the distribution of projected p values is compared with the expected null distribution. See the Materials and Methods section (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002417#s4) for details. (B) In contrast, q-q plots under no filtering show that omitting the filter skews the calibration of the probability model used by V-Phaser. Q-q plots of models based on subsets of the reads demonstrate that this effect becomes more pronounced with increasing coverage (see Figure S1, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002417#pcbi.1002417.s001). Q-q plots are scaled to fit the curve, so the y = x line is not at a 45 degree angle.
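One common way to project the p values of a discrete model onto a uniform distribution is the randomized p value: draw uniformly between P(X > k) and P(X >= k). The sketch below uses that technique with a binomial error model. Whether this matches the projection used in the paper is an assumption, and all function names and parameters here are hypothetical.

```python
import math
import random


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def projected_p_value(k, n, p, rng):
    """Randomized upper-tail p value for k errors among n reads.

    The value is drawn uniformly between P(X > k) and P(X >= k); under
    the null model this is uniform on [0, 1] despite X being discrete.
    """
    upper = min(1.0, sum(binom_pmf(j, n, p) for j in range(k, n + 1)))
    lower = max(0.0, upper - binom_pmf(k, n, p))
    return lower + rng.random() * (upper - lower)


def qq_points(p_values):
    """Pair each sorted observed p value with its expected uniform quantile."""
    ordered = sorted(p_values)
    m = len(ordered)
    return [((i + 0.5) / m, observed) for i, observed in enumerate(ordered)]


# Simulate error counts under the null (assumed 1% error rate, 100 reads
# per locus, 200 loci) and build the q-q points.
rng = random.Random(0)
n, p = 100, 0.01
counts = [sum(rng.random() < p for _ in range(n)) for _ in range(200)]
points = qq_points([projected_p_value(k, n, p, rng) for k in counts])
print(points[:3])
```

If the model is well calibrated, the points fall close to the y = x line; systematic departure from that line indicates miscalibration of the kind panel (B) describes.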

FigShare

Phase information increased sensitivity to detect minor variants.

Author: Alexander R. Macalalad (177162)
Bruce W. Birren (147656)
Christian L. Boutwell (177170)
Christine M. Malboeuf (177167)
Doug E. Brackney (177174)
Elizabeth M. Ryan (177168)
Gregory D. Ebel (177185)
Joshua Z. Levin (177182)
Karen A. Power (177172)
Kendra N. Pesko (177178)
Matthew R. Henn (103220)
Michael C. Zody (155402)
Niall J. Lennon (177164)
Patrick Charlebois (177163)
Ruchi M. Newman (177165)
Todd M. Allen (177189)

Phase information increased sensitivity to detect low frequency variants, as shown by these histograms of variants under 2.5%. All versions of V-Phaser detected 100% of the variants above 2.5% frequency, so these variants are not shown here. All versions of V-Phaser with phase information (A), (C), and (D) detected most variants below 1% in frequency, but the No Phase version (B) missed many variants below 1% and some variants as high as 2.5%. Data are from the control WNV mixed population.

FigShare

Comparison of V-Phaser to other viral variant callers.

Author: Alexander R. Macalalad (177162)
Bruce W. Birren (147656)
Christian L. Boutwell (177170)
Christine M. Malboeuf (177167)
Doug E. Brackney (177174)
Elizabeth M. Ryan (177168)
Gregory D. Ebel (177185)
Joshua Z. Levin (177182)
Karen A. Power (177172)
Kendra N. Pesko (177178)
Matthew R. Henn (103220)
Michael C. Zody (155402)
Niall J. Lennon (177164)
Patrick Charlebois (177163)
Ruchi M. Newman (177165)
Todd M. Allen (177189)

Sensitivities and specificities are reported across residues interrogated by all programs. Sensitivity is measured as the fraction of the known variants found by each program in the WNV mixed population control data set. Specificity is the fraction of sites not containing known variants that were called as invariant in the HIV NL4-3 control data set; values reported in parentheses include inserted and deleted bases (see Materials and Methods, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002417#s4).

FigShare

Phase increased sensitivity to detect variants.

Author: Alexander R. Macalalad (177162)
Bruce W. Birren (147656)
Christian L. Boutwell (177170)
Christine M. Malboeuf (177167)
Doug E. Brackney (177174)
Elizabeth M. Ryan (177168)
Gregory D. Ebel (177185)
Joshua Z. Levin (177182)
Karen A. Power (177172)
Kendra N. Pesko (177178)
Matthew R. Henn (103220)
Michael C. Zody (155402)
Niall J. Lennon (177164)
Patrick Charlebois (177163)
Ruchi M. Newman (177165)
Todd M. Allen (177189)

Phase increased sensitivity to detect variants, as seen over a range of error rates at coverages of 100-fold, 250-fold, and 500-fold. The phased variant detection threshold frequency (VDTF) is the lowest frequency of reads with variants at two specific loci that V-Phaser can distinguish from error among reads that span both loci. The unphased VDTF is the lowest frequency of one variant that V-Phaser can distinguish from error among reads that cover that locus. 100-fold phased sequence coverage achieves detection thresholds comparable to those of 500-fold unphased coverage. We use Equation 7 to calculate the phased and unphased VDTFs. (See the Materials and Methods section, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002417#s4, for Equation 7 and its derivation.)
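The intuition behind phased and unphased thresholds can be illustrated with a simple binomial sketch. This is not Equation 7, which is not reproduced in this listing. It assumes that errors are binomially distributed, that a variant is detectable once its read count is unlikely under the error model at an arbitrary significance level `alpha`, and that errors at two loci on the same read are independent, so the phased error rate is the square of the per-base error rate. All names are hypothetical.

```python
def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), via the pmf recurrence."""
    pmf = (1 - p) ** n  # P(X = 0)
    below = 0.0
    for j in range(k):
        below += pmf
        pmf *= (n - j) / (j + 1) * p / (1 - p)  # P(X = j + 1)
    return max(0.0, 1.0 - below)


def detection_threshold(coverage, error_rate, alpha=0.05):
    """Lowest variant frequency distinguishable from error (illustrative only).

    Returns k / coverage for the smallest read count k whose probability
    of arising from errors alone is below alpha, or None if no count
    qualifies.
    """
    for k in range(1, coverage + 1):
        if upper_tail(k, coverage, error_rate) < alpha:
            return k / coverage
    return None


per_base_error = 0.01  # assumed error rate, for illustration
for coverage in (100, 250, 500):
    unphased = detection_threshold(coverage, per_base_error)
    # Assumption: independent errors at both loci, so the rates multiply.
    phased = detection_threshold(coverage, per_base_error**2)
    print(coverage, unphased, phased)
```

Because two coincident errors on one read are far rarer than a single error, the phased threshold under these assumptions drops well below the unphased threshold at the same coverage, which is the qualitative effect the caption describes.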

FigShare

Complete viral RNA genome sequencing of ultra-low copy samples by sequence-independent amplification

Author: Aaron M. Berlin
Adachi
Adey
Archer
Beane
Bimber
Cheval
Christian L. Boutwell
Christine M. Malboeuf
Christopherson
Cordey
Deeks
Devincenzo
Djikeng
Dong
Eckerle
Edgar
Fischer
Gregory D. Ebel
Gu
Harrison
Henn
Hofacker
Hoffmann
Huson
James Qu
John P. DeVincenzo
Joshua Z. Levin
Kauffman
Kendra N. Pesko
Kurn
Lauck
Levin
Li
Marine
Matthew R. Henn
Michael C. Zody
Miura
Monica Casali
Moore
Morin
Ninomiya
Palmer
Parameswaran
Patrick Charlebois
Perkins
Pesko
Salazar-Gonzalez
Shi
Simen
Tariq
ten Bosch
Todd M. Allen
Tricou
Tsibris
Victoria
Wang
Watts
Willerth
Wright
Xiao Yang
Zhang
Publication venue: Oxford University Press (OUP)

Crossref