148 research outputs found

    QQ-SNV: single nucleotide variant detection at low frequency by comparing the quality quantiles

    Background: Next-generation sequencing enables the study of heterogeneous populations of viral infections. When the sequencing is done at high coverage depth ("deep sequencing"), low-frequency variants can be detected. Here we present QQ-SNV (http://sourceforge.net/projects/qqsnv), a logistic regression classifier model developed for the Illumina sequencing platforms that uses the quantiles of the quality scores to distinguish true single nucleotide variants from sequencing errors based on the estimated SNV probability. To train the model, we created a dataset of an in silico mixture of five HIV-1 plasmids. Testing of our method in comparison to the existing methods LoFreq, ShoRAH, and V-Phaser 2 was performed on two HIV and four HCV plasmid mixture datasets and one influenza H1N1 clinical dataset. Results: For default application of QQ-SNV, variants were called using a SNV probability cutoff of 0.5 (QQ-SNVD). To improve the sensitivity we used a SNV probability cutoff of 0.0001 (QQ-SNVHS). To also increase specificity, called SNVs were overruled when their frequency was below the 80th percentile calculated on the distribution of error frequencies (QQ-SNVHS-P80). When comparing QQ-SNV with the other methods on the plasmid mixture test sets, QQ-SNVD performed similarly to the existing approaches. QQ-SNVHS was more sensitive on all test sets but produced more false positives. QQ-SNVHS-P80 was found to be the most accurate method over all test sets by balancing sensitivity and specificity. When applied to a paired-end HCV sequencing study, with a lowest spiked-in true frequency of 0.5%, QQ-SNVHS-P80 achieved a sensitivity of 100% (vs. 40-60% for the existing methods) and a specificity of 100% (vs. 98.0-99.7% for the existing methods). In addition, QQ-SNV required the least overall computation time to process the test sets. Finally, when testing on a clinical sample, four putative true variants with frequency below 0.5% were consistently detected by QQ-SNVHS-P80 from different generations of Illumina sequencers. Conclusions: We developed and successfully evaluated a novel method, called QQ-SNV, for highly efficient single nucleotide variant calling on Illumina deep sequencing virology data.
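The three calling modes described in this abstract (default cutoff 0.5, high-sensitivity cutoff 0.0001, and the additional 80th-percentile frequency filter) can be illustrated with a minimal Python sketch. This is not the QQ-SNV implementation: the `Candidate` record, the `call_variants` function, the inclusive comparisons, and the way the error-frequency distribution is supplied are all assumptions made for illustration, and the SNV probabilities are taken as given rather than computed by a classifier.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Candidate:
    """Hypothetical record for one candidate SNV (not a QQ-SNV data type)."""
    position: int
    frequency: float        # observed variant frequency, between 0 and 1
    snv_probability: float  # classifier output, assumed to be precomputed


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence is undefined")
    rank = (len(ordered) - 1) * pct / 100
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def call_variants(
    candidates: Sequence[Candidate],
    cutoff: float,
    error_frequencies: Optional[Sequence[float]] = None,
) -> List[Candidate]:
    """Keep candidates whose SNV probability reaches the cutoff.

    If error_frequencies is given, calls whose frequency lies below the
    80th percentile of that distribution are overruled (assumed reading
    of the P80 rule in the abstract).
    """
    called = [c for c in candidates if c.snv_probability >= cutoff]
    if error_frequencies is not None:
        threshold = percentile(error_frequencies, 80)
        called = [c for c in called if c.frequency >= threshold]
    return called


# Made-up example data, for illustration only.
candidates = [
    Candidate(position=1, frequency=0.02, snv_probability=0.9),
    Candidate(position=2, frequency=0.004, snv_probability=0.01),
    Candidate(position=3, frequency=0.0005, snv_probability=0.001),
]
errors = [0.0001, 0.0002, 0.0003, 0.0004, 0.001]

default_calls = call_variants(candidates, cutoff=0.5)
high_sens_calls = call_variants(candidates, cutoff=0.0001)
p80_calls = call_variants(candidates, cutoff=0.0001, error_frequencies=errors)
```

Lowering the cutoff admits more candidates, and the percentile filter then removes those whose frequency is indistinguishable from the bulk of the error distribution, which mirrors the sensitivity and specificity trade-off the abstract reports.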

    ViVaMBC: estimating viral sequence variation in complex populations from illumina deep-sequencing data using model-based clustering

    Background: Deep-sequencing allows for an in-depth characterization of sequence variation in complex populations. However, technology-associated errors may impede a powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores which are derived from a quadruplet of intensities, one channel for each nucleotide type for Illumina sequencing. The highest intensity of the four channels determines the base that is called. Mismatch bases can often be corrected by the second best base, i.e. the base with the second highest intensity in the quadruplet. A virus variant model-based clustering method, ViVaMBC, is presented that explores quality scores and second best base calls for identifying and quantifying viral variants. ViVaMBC is optimized to call variants at the codon level (nucleotide triplets), which enables immediate biological interpretation of the variants with respect to their antiviral drug responses. Results: Using mixtures of HCV plasmids we show that our method accurately estimates frequencies down to 0.5%. The estimates are unbiased when average coverages of 25,000 are reached. A comparison with the SNP-callers V-Phaser2, ShoRAH, and LoFreq shows that ViVaMBC has a superb sensitivity and specificity for variants with frequencies above 0.4%. Unlike the competitors, ViVaMBC reports a higher number of false-positive findings with frequencies below 0.4%, which might partially originate from picking up artificial variants introduced by errors in the sample and library preparation step. Conclusions: ViVaMBC is the first method to call viral variants directly at the codon level. The strength of the approach lies in modeling the error probabilities based on the quality scores. Although the use of second best base calls appeared very promising in our data exploration phase, their utility was limited. They provided a slight increase in sensitivity, which however does not warrant the additional computational cost of running the offline base caller. Apparently a lot of information is already contained in the quality scores, enabling the model-based clustering procedure to adjust for the majority of the sequencing errors. Overall the sensitivity of ViVaMBC is such that technical constraints like PCR errors start to form the bottleneck for low-frequency variant detection.
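As background to the statement that error probabilities are modeled from quality scores, the standard Phred relation between a quality score Q and the base-call error probability is p = 10^(-Q/10). The Python sketch below shows only that conversion, not ViVaMBC's clustering model; the function names are hypothetical, and the Phred+33 ASCII offset is an assumption that holds for current Illumina FASTQ output but not for some older pipelines.

```python
from typing import List


def phred_to_error_prob(q: int) -> float:
    """Convert a Phred quality score to a base-call error probability."""
    if q < 0:
        raise ValueError("Phred scores are non-negative")
    return 10 ** (-q / 10)


def decode_quality_string(qual: str, offset: int = 33) -> List[int]:
    """Decode an ASCII quality string into Phred scores.

    The default offset of 33 (Phred+33) is an assumption about the input.
    """
    return [ord(ch) - offset for ch in qual]


# 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
scores = decode_quality_string("I5")
error_probs = [phred_to_error_prob(q) for q in scores]
```

A score of Q30 thus corresponds to an expected error rate of 1 in 1,000 base calls, the same order of magnitude as the 0.4-0.5% variant frequencies discussed in the abstract, which is why quality-aware modeling matters at these frequencies.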

    HIV-1 V3 envelope deep sequencing for clinical plasma specimens failing in phenotypic tropism assays

    Background: HIV-1 infected patients for whom standard gp160 phenotypic tropism testing failed are currently excluded from co-receptor antagonist treatment. To provide patients with maximal treatment options, massively parallel sequencing of the envelope V3 domain, in combination with tropism prediction tools, was evaluated as an alternative tropism determination strategy. Plasma samples from twelve HIV-1 infected individuals with failing phenotyping results were available. The samples were submitted to massively parallel sequencing and to confirmatory recombinant phenotyping using a fraction of the gp120 domain. Results: A cut-off for sequence read interpretation of 5 to 10 times the sequencing error rate (0.2%) was implemented. On average, each sample contained 7 different V3 haplotypes. V3 haplotypes were submitted to tropism prediction algorithms, and 4/14 samples returned with presence of a dual/mixed (D/M) tropic virus, at 3%, 10%, 11%, and 95% of the viral quasispecies, respectively. V3 tropism prediction was confirmed by gp120 phenotyping, except for two of the four D/M predicted viruses (at 3% and 95%), which were phenotypically R5-tropic. In the first case, the result was discordant due to the limit of detection of the phenotyping technology, while in the latter case the prediction algorithms were not computing the viral tropism correctly. Conclusions: Although only demonstrated on a limited set of samples, the potential of the combined use of "deep sequencing + prediction algorithms" in cases where routine gp160 phenotype testing cannot be employed was illustrated. While good concordance was observed between gp120 phenotyping and prediction of R5-tropic virus, the results suggest that accurate prediction of X4-tropic virus would require further algorithm development.

    Single-cell immune profiling reveals markers of emergency myelopoiesis that distinguish severe from mild respiratory syncytial virus disease in infants

    Whereas most infants infected with respiratory syncytial virus (RSV) show no or only mild symptoms, an estimated 3 million children under five are hospitalized annually due to RSV disease. This study aimed to investigate biological mechanisms and associated biomarkers underlying RSV disease heterogeneity in young infants, enabling the potential to objectively categorize RSV-infected infants according to their medical needs. Immunophenotypic and functional profiling demonstrated the emergence of immature and progenitor-like neutrophils, proliferative monocytes (HLA-DRLow, Ki67+), impaired antigen-presenting function, downregulation of T cell response and low abundance of HLA-DRLow B cells in severe RSV disease. HLA-DRLow monocytes were found as a hallmark of RSV-infected infants requiring hospitalization. Complementary transcriptomics identified genes associated with disease severity and pointed to the emergency myelopoiesis response. These results shed new light on mechanisms underlying the pathogenesis and development of severe RSV disease and identified potential new candidate biomarkers for patient stratification

    Physical Activity Characteristics across GOLD Quadrants Depend on the Questionnaire Used

    BACKGROUND: The GOLD multidimensional classification of COPD severity combines the exacerbation risk with the symptom experience, for which 3 different questionnaires are permitted. This study investigated differences in physical activity (PA) across the GOLD quadrants and the patient distribution in relation to the questionnaire used. METHODS: 136 COPD patients (58±21% FEV1 predicted, 34F/102M) completed the COPD assessment test (CAT), the clinical COPD questionnaire (CCQ) and the modified Medical Research Council (mMRC) questionnaire. Exacerbation history, spirometry and 6MWD were collected. PA was objectively measured for 2 periods of 1 week, 6 months apart, in 5 European centres; to minimise seasonal and clinical variation, the average of these two periods was used for analysis. RESULTS: GOLD quadrants C+D had reduced PA compared with A+B (3824 [2976] vs. 5508 [4671] steps/day, p<0.0001). The choice of questionnaire yielded different patient distributions (agreement mMRC-CAT κ = 0.57; CCQ-mMRC κ = 0.71; CCQ-CAT κ = 0.72) with different clinical characteristics. PA was notably lower in patients with an mMRC score ≥2 (3430 [2537] vs. 5443 [3776] steps/day, p<0.001) in both the low and high risk quadrants. CONCLUSIONS: Using different questionnaires changes the patient distribution and results in different clinical characteristics. Therefore, standardization of the questionnaire used for classification is critical to allow comparison of different studies using this as an entry criterion. CLINICAL TRIAL REGISTRATION: ClinicalTrials.gov NCT01388218
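The agreement values quoted in this abstract are kappa statistics. As an illustration of how such a value arises, the Python sketch below computes an unweighted Cohen's kappa, κ = (p_o - p_e) / (1 - p_e), for two classifications of the same patients. Whether the study used the unweighted or a weighted form is not stated in the abstract, so the unweighted form is an assumption; the function name and the example quadrant labels are invented for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two labelings of the same subjects."""
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(first)
    # Observed agreement: fraction of subjects given the same label twice.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal label proportions.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(
        counts_first[label] * counts_second[label] for label in counts_first
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use a single identical label
    return (observed - expected) / (1 - expected)


# Made-up quadrant assignments from two hypothetical questionnaires.
by_questionnaire_1 = ["A", "A", "B", "B"]
by_questionnaire_2 = ["A", "B", "B", "B"]
kappa = cohen_kappa(by_questionnaire_1, by_questionnaire_2)
```

In this toy case three of four patients are classified identically (p_o = 0.75) while chance agreement is 0.5, giving κ = 0.5. Kappa corrects raw agreement for the agreement expected from the marginal distributions alone.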

    Pre-hospital management protocols and perceived difficulty in diagnosing acute heart failure

    Aim: To illustrate the pre-hospital management arsenals and protocols in different EMS units, and to estimate the perceived difficulty of diagnosing suspected acute heart failure (AHF) compared with other common pre-hospital conditions. Methods and results: A multinational survey included 104 emergency medical service (EMS) regions from 18 countries. Diagnostic and therapeutic arsenals related to AHF management were reported for each type of EMS unit. The prevalence and contents of management protocols for common medical conditions treated pre-hospitally were collected. The perceived difficulty of diagnosing AHF and other medical conditions was surveyed among emergency medical dispatchers and EMS personnel. Ultrasound devices and point-of-care testing were available in advanced life support and helicopter EMS units in fewer than 25% of EMS regions. AHF protocols were present in 80.8% of regions. Protocols for ST-elevation myocardial infarction, chest pain, and dyspnoea were present in 95.2, 80.8, and 76.0% of EMS regions, respectively. Protocolized diagnostic actions for AHF management included 12-lead electrocardiogram (92.1% of regions), ultrasound examination (16.0%), and point-of-care tests for troponin and BNP (6.0 and 3.5%). Therapeutic actions included supplementary oxygen (93.2%), non-invasive ventilation (80.7%), intravenous furosemide, opiates, and nitroglycerine (69.0, 68.6, and 57.0%), and intubation (71.5%). Diagnosing suspected AHF was considered easy to moderate by EMS personnel and moderate to difficult by emergency medical dispatchers (without significant differences between de novo and decompensated heart failure). In both settings, diagnosis of suspected AHF was considered easier than pulmonary embolism and more difficult than ST-elevation myocardial infarction, asthma, and stroke. Conclusions: The prevalence of AHF protocols is rather high but the contents seem to vary. Difficulty of diagnosing suspected AHF seems to be moderate compared with other pre-hospital conditions.

    Mechanical Impedance and Its Relations to Motor Control, Limb Dynamics, and Motion Biomechanics


    Quasispecies Analysis of JC Virus DNA Present in Urine of Healthy Subjects

    JC virus is a human polyomavirus that infects the majority of people without apparent symptoms in healthy subjects and it is the causative agent of progressive multifocal leucoencephalopathy (PML), a disorder following lytic infection of oligodendrocytes that mainly manifests itself under immunosuppressive conditions. A hallmark of JC virus isolated from PML-brain is the presence of rearrangements in the non-coding control region (NCCR) interspersed between the early and late genes on the viral genome. Such rearrangements are believed to originate from the archetype JC virus which is shed in urine by healthy subjects and PML patients. We applied next generation sequencing to explore the non-coding control region variability in urine of healthy subjects in search of JC virus quasispecies and rearrangements reminiscent of PML. For 61 viral shedders (out of a total of 254 healthy subjects) non-coding control region DNA and VP1 (major capsid protein) coding sequences were initially obtained by Sanger sequencing. Deletions between 1 and 28 nucleotides long appeared in ∼24.5% of the NCCR sequences while insertions were only detected in ∼3.3% of the samples. 454 pyrosequencing was applied on a subset of 54 urine samples, demonstrating the existence of JC virus quasispecies in four subjects (∼7.4%). Hence, our results indicate that JC virus DNA in urine is not always restricted to one unique virus variant, but can be a mixture of naturally occurring variants (quasispecies) reflecting the susceptibility of the non-coding control region to genomic rearrangements in healthy individuals. Our findings pave the way to explore the presence of viral quasispecies and the altered viral tropism that might go along with it as a potential risk factor for opportunistic secondary infections such as PML.
