140 research outputs found
Can Linear Regression Modeling Help Clinicians in the Interpretation of Genotypic Resistance Data? An Application to Derive a Lopinavir-Score
The question of whether a score for a specific antiretroviral (e.g. lopinavir/r in this analysis) that improves prediction of viral load response given by existing expert-based interpretation systems (IS) could be derived from analyzing the correlation between genotypic data and virological response using statistical methods remains largely unanswered. We used the data of the patients from the UK Collaborative HIV Cohort (UK CHIC) Study for whom genotypic data were stored in the UK HIV Drug Resistance Database (UK HDRD) to construct a training/validation dataset of treatment change episodes (TCE). We used the average square error (ASE) on a 10-fold cross-validation and on a test dataset (the EuroSIDA TCE database) to compare the performance of a newly derived lopinavir/r score with that of the 3 most widely used expert-based interpretation rules (ANRS, HIVDB and Rega). Our analysis identified mutations V82A, I54V, K20I and I62V, which were associated with reduced viral response, and mutations I15V and V91S, which determined lopinavir/r hypersensitivity. All models performed equally well (ASE on test ranging between 1.1 and 1.3, p = 0.34). We fully explored the potential of linear regression to construct a simple predictive model for lopinavir/r-based TCE. Although the performance of our proposed score was similar to that of already existing IS, previously unrecognized lopinavir/r-associated mutations were identified. The analysis illustrates an approach to validating expert-based IS that could be used in the future for other antiretrovirals and in other settings outside HIV research.
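The cross-validated average square error used above to compare scores can be illustrated with a minimal sketch. It assumes a single numeric predictor (for example, a count of resistance mutations) rather than the multi-mutation model of the study; the function names `fit_line` and `cross_validated_ase` and the interleaved fold assignment are illustrative choices, not the authors' implementation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope


def cross_validated_ase(xs, ys, k=10):
    """Average squared error where each point is predicted by a model
    fitted on the other folds (assumed interleaved fold assignment)."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        intercept, slope = fit_line(train_x, train_y)
        for i in sorted(test_idx):
            squared_errors.append((ys[i] - (intercept + slope * xs[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)
```

A lower ASE on held-out folds indicates better prediction of the viral load response; two scoring systems can be compared by running the same folds on each.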
ShoRAH: estimating the genetic diversity of a mixed sample from next-generation sequencing data
Background: With next-generation sequencing technologies, experiments that were considered prohibitive only a few years ago are now possible. However, while these technologies have the ability to produce enormous volumes of data, the sequence reads are prone to error. This poses fundamental hurdles when genetic diversity is investigated. Results: We developed ShoRAH, a computational method for quantifying genetic diversity in a mixed sample and for identifying the individual clones in the population, while accounting for sequencing errors. The software was run on simulated data and on real data obtained in wet lab experiments to assess its reliability. Conclusions: ShoRAH is implemented in C++, Python, and Perl and has been tested under Linux and Mac OS X. Source code is available under the GNU General Public License at http://www.cbg.ethz.ch/software/shorah.
Antiretroviral Therapy Optimisation without Genotype Resistance Testing: A Perspective on Treatment History Based Models
BACKGROUND: Although genotypic resistance testing (GRT) is recommended to guide combination antiretroviral therapy (cART), funding and/or facilities to perform GRT may not be available in low to middle income countries. Since treatment history (TH) impacts response to subsequent therapy, we investigated a set of statistical learning models to optimise cART in the absence of GRT information.
METHODS AND FINDINGS: The EuResist database was used to extract 8-week and 24-week treatment change episodes (TCE) with GRT and additional clinical, demographic and TH information. Random Forest (RF) classification was used to predict 8- and 24-week success, defined as undetectable HIV-1 RNA, comparing nested models including (i) GRT+TH and (ii) TH without GRT, using multiple cross-validation and area under the receiver operating characteristic curve (AUC). Virological success was achieved in 68.2% and 68.0% of TCE at 8- and 24-weeks (n = 2,831 and 2,579), respectively. RF (i) and (ii) showed comparable performances, with an average (st.dev.) AUC 0.77 (0.031) vs. 0.757 (0.035) at 8-weeks, 0.834 (0.027) vs. 0.821 (0.025) at 24-weeks. Sensitivity analyses, carried out on a data subset that included antiretroviral regimens commonly used in low to middle income countries, confirmed our findings. Training on subtype B and validation on non-B isolates resulted in a decline of performance for models (i) and (ii).
CONCLUSIONS: Treatment history-based RF prediction models are comparable to GRT-based for classification of virological outcome. These results may be relevant for therapy optimisation in areas where availability of GRT is limited. Further investigations are required in order to account for different demographics, subtypes and different therapy switching strategies
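The AUC used to compare the two nested models can be computed directly from predicted scores and observed outcomes. The sketch below uses the rank-based (Mann-Whitney) definition and does not reproduce the Random Forest itself; the function name `auc` and the example values are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    success (label 1) receives a higher score than a randomly chosen
    failure (label 0); ties count as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted success probabilities from two models on the same TCEs.
outcomes = [1, 1, 0, 0]
with_grt = [0.9, 0.8, 0.3, 0.1]
history_only = [0.9, 0.2, 0.8, 0.1]
print(auc(with_grt, outcomes), auc(history_only, outcomes))
```

Repeating this over many cross-validation splits and summarising the mean and standard deviation gives figures of the kind reported in the abstract.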
Apc Mutation Enhances PyMT-Induced Mammary Tumorigenesis
The Adenomatous Polyposis Coli (APC) tumor suppressor gene is silenced by hypermethylation or mutated in up to 70% of human breast cancers. In mouse models, Apc mutation disrupts normal mammary development and predisposes to mammary tumor formation; however, the cooperation between APC and other mutations in breast tumorigenesis has not been studied. To test the hypothesis that loss of one copy of APC promotes oncogene-mediated mammary tumorigenesis, ApcMin/+ mice were crossed with the mouse mammary tumor virus (MMTV)-Polyoma virus middle T antigen (PyMT) or MMTV-c-Neu transgenic mice. In the PyMT tumor model, the ApcMin/+ mutation significantly decreased survival and tumor latency, promoted a squamous adenocarcinoma phenotype, and enhanced tumor cell proliferation. In tumor-derived cell lines, the proliferative advantage was a result of increased FAK, Src and JNK signaling. These effects were specific to the PyMT model, as no changes were observed in MMTV-c-Neu mice carrying the ApcMin/+ mutation. Our data indicate that heterozygosity of Apc enhances tumor development in an oncogene-specific manner, providing evidence that APC-dependent pathways may be valuable therapeutic targets in breast cancer. Moreover, these preclinical model systems offer a platform for dissection of the molecular mechanisms by which APC mutation enhances breast carcinogenesis, such as altered FAK/Src/JNK signaling
Comparison of HIV-1 Genotypic Resistance Test Interpretation Systems in Predicting Virological Outcomes Over Time
Background: Several decision support systems have been developed to interpret HIV-1 drug resistance genotyping results. This study compares the ability of the most commonly used systems (ANRS, Rega, and Stanford's HIVdb) to predict virological outcome at 12, 24, and 48 weeks. Methodology/Principal Findings: Included were 3763 treatment-change episodes (TCEs) for which an HIV-1 genotype was available at the time of changing treatment with at least one follow-up viral load measurement. Genotypic susceptibility scores for the active regimens were calculated using scores defined by each interpretation system. Using logistic regression, we determined the association between the genotypic susceptibility score and proportion of TCEs having an undetectable viral load (<50 copies/ml) at 12 (8-16) weeks (2152 TCEs), 24 (16-32) weeks (2570 TCEs), and 48 (44-52) weeks (1083 TCEs). The area under the ROC curve was calculated using a 10-fold cross-validation to compare the different interpretation systems regarding the sensitivity and specificity for predicting undetectable viral load. The mean genotypic susceptibility score of the systems was slightly smaller for HIVdb, with 1.92±1.17, compared to Rega and ANRS, with 2.22±1.09 and 2.23±1.05, respectively. However, similar odds ratios were found for the association between each one-unit increase in genotypic susceptibility score and undetectable viral load at week 12; 1.6 [95% confidence interval 1.5-1.7] for HIVdb, 1.7 [1.5-1.8] for ANRS, and 1.7 [1.9-1.6] for Rega. Odds ratios increased over time, but remained comparable (odds ratios ranging between 1.9-2.1 at 24 weeks and 1.9-2.
Comparative determination of HIV-1 co-receptor tropism by Enhanced Sensitivity Trofile, gp120 V3-loop RNA and DNA genotyping
BACKGROUND: Trofile is the prospectively validated HIV-1 tropism assay. Its use is limited by high costs, long turn-around time, and inability to test patients with very low or undetectable viremia. We aimed to assess the efficiency of population genotypic assays based on gp120 V3-loop sequencing for the determination of tropism in plasma viral RNA and in whole-blood viral DNA. Contemporary and follow-up plasma and whole-blood samples from patients undergoing tropism testing via the enhanced sensitivity Trofile (ESTA) were collected. Clinical and clonal geno2pheno[coreceptor] (G2P) models at 10% and at optimised 5.7% false positive rate cutoff were evaluated using viral DNA and RNA samples, compared against each other and ESTA, using Cohen's kappa, phylogenetic analysis, and area under the receiver operating characteristic curve (AUROC).
RESULTS: Both clinical and clonal G2P (with different false positive rates) showed good performances in predicting the ESTA outcome (for V3 RNA-based clinical G2P at 10% false positive rate AUROC = 0.83, sensitivity = 90%, specificity = 75%). The rate of agreement between DNA- and RNA-based clinical G2P was fair (kappa = 0.74, p < 0.0001), and DNA-based clinical G2P accurately predicted the plasma ESTA (AUROC = 0.86). Significant differences in the viral populations were detected when comparing inter/intra patient diversity of viral DNA with RNA sequences.
CONCLUSIONS: Plasma HIV RNA or whole-blood HIV DNA V3-loop sequencing interpreted with clinical G2P is cheap and can be a good surrogate for ESTA. Although there may be differences among viral RNA and DNA populations in the same host, DNA-based G2P may be used as an indication of viral tropism in patients with undetectable plasma viremia
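Cohen's kappa, used above to quantify agreement between DNA- and RNA-based tropism calls, corrects the observed agreement for the agreement expected by chance. A minimal sketch follows; the function name `cohens_kappa` and the example calls are hypothetical and not taken from the study data.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Chance-corrected agreement between two paired lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    counts_a, counts_b = Counter(calls_a), Counter(calls_b)
    categories = set(calls_a) | set(calls_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical tropism calls for four paired samples.
rna_calls = ["R5", "R5", "X4", "X4"]
dna_calls = ["R5", "R5", "X4", "R5"]
print(cohens_kappa(rna_calls, dna_calls))
```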
Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, as well as either the CCR5 (R5) or CXCR4 (X4) co-receptors, which interact primarily with the third hypervariable loop (V3 loop) of gp120. Determination of HIV-1 affinity for either the R5 or X4 co-receptor on host cells facilitates the inclusion of co-receptor antagonists as a part of patient treatment strategies. A dataset of 1193 distinct gp120 V3 loop peptide sequences (989 R5-utilizing, 204 X4-capable) is utilized to train predictive classifiers based on implementations of random forest, support vector machine, boosted decision tree, and neural network machine learning algorithms. An in silico mutagenesis procedure employing multibody statistical potentials, computational geometry, and threading of variant V3 sequences onto an experimental structure, is used to generate a feature vector representation for each variant whose components measure environmental perturbations at corresponding structural positions. Results: Classifier performance is evaluated based on stratified 10-fold cross-validation, stratified dataset splits (2/3 training, 1/3 validation), and leave-one-out cross-validation. Best reported values of sensitivity (85%), specificity (100%), and precision (98%) for predicting X4-capable HIV-1 virus, overall accuracy (97%), Matthews correlation coefficient (89%), balanced error rate (0.08), and ROC area (0.97) all reach critical thresholds, suggesting that the models outperform six other state-of-the-art methods and come closer to competing with phenotype assays. Conclusions: The trained classifiers provide instantaneous and reliable predictions regarding HIV-1 co-receptor usage, requiring only translated V3 loop genotypes as input. Furthermore, the novelty of these computational mutagenesis based predictor attributes distinguishes the models as orthogonal and complementary to previous methods that utilize sequence, structure, and/or evolutionary information. The classifiers are available online at http://proteins.gmu.edu/automute.
Inferring viral quasispecies spectra from 454 pyrosequencing reads
Background: RNA viruses infecting a host usually exist as a set of closely related sequences, referred to as quasispecies. The genomic diversity of viral quasispecies is a subject of great interest, particularly for chronic infections, since it can lead to resistance to existing therapies. High-throughput sequencing is a promising approach to characterizing viral diversity, but unfortunately standard assembly software was originally designed for single genome assembly and cannot be used to simultaneously assemble and estimate the abundance of multiple closely related quasispecies sequences. Results: In this paper, we introduce a new Viral Spectrum Assembler (ViSpA) method for quasispecies spectrum reconstruction and compare it with the state-of-the-art ShoRAH tool on both simulated and real 454 pyrosequencing shotgun reads from HCV and HIV quasispecies. Experimental results show that ViSpA outperforms ShoRAH on simulated error-free reads, correctly assembling 10 out of 10 quasispecies and 29 sequences out of 40 quasispecies. While ShoRAH has a significant advantage over ViSpA on reads simulated with sequencing errors due to its advanced error correction algorithm, ViSpA is better at assembling the simulated reads after they have been corrected by ShoRAH. ViSpA also outperforms ShoRAH on real 454 reads. Indeed, 7 most frequent sequences reconstructed by ViSpA from a real HCV dataset are viable (do not contain internal stop codons), and the most frequent sequence was within 1% of the actual open reading frame obtained by cloning and Sanger sequencing. In contrast, only one of the sequences reconstructed by ShoRAH is viable. On a real HIV dataset, ShoRAH correctly inferred only 2 quasispecies sequences with at most 4 mismatches whereas ViSpA correctly reconstructed 5 quasispecies with at most 2 mismatches, and 2 out of 5 sequences were inferred without any mismatches. ViSpA source code is available at http://alla.cs.gsu.edu/~software/VISPA/vispa.html. Conclusions: ViSpA enables accurate viral quasispecies spectrum reconstruction from 454 pyrosequencing reads. We are currently exploring extensions applicable to the analysis of high-throughput sequencing data from bacterial metagenomic samples and ecological samples of eukaryote populations.
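The viability criterion mentioned above (no internal stop codons) can be checked with a few lines of code. This is a minimal sketch assuming the standard genetic code and a known reading frame; the function name `has_internal_stop` is hypothetical and is not part of ViSpA or ShoRAH.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def has_internal_stop(sequence, frame=0):
    """Return True if any codon other than the last complete one is a stop codon."""
    seq = sequence.upper().replace("U", "T")  # accept RNA or DNA alphabets
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


def is_viable(sequence, frame=0):
    """A reconstructed sequence is treated as viable when it has no internal stop codon."""
    return not has_internal_stop(sequence, frame)
```

A terminal stop codon is allowed by this check, since only codons before the last complete one are inspected.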