3,314 research outputs found

    Antiretroviral Therapy Optimisation without Genotype Resistance Testing: A Perspective on Treatment History Based Models

    BACKGROUND: Although genotypic resistance testing (GRT) is recommended to guide combination antiretroviral therapy (cART), funding and/or facilities to perform GRT may not be available in low to middle income countries. Since treatment history (TH) impacts response to subsequent therapy, we investigated a set of statistical learning models to optimise cART in the absence of GRT information. METHODS AND FINDINGS: The EuResist database was used to extract 8-week and 24-week treatment change episodes (TCE) with GRT and additional clinical, demographic and TH information. Random Forest (RF) classification was used to predict 8- and 24-week success, defined as undetectable HIV-1 RNA, comparing nested models including (i) GRT+TH and (ii) TH without GRT, using multiple cross-validation and area under the receiver operating characteristic curve (AUC). Virological success was achieved in 68.2% and 68.0% of TCE at 8 and 24 weeks (n = 2,831 and 2,579), respectively. RF (i) and (ii) showed comparable performance, with an average (st.dev.) AUC of 0.77 (0.031) vs. 0.757 (0.035) at 8 weeks and 0.834 (0.027) vs. 0.821 (0.025) at 24 weeks. Sensitivity analyses, carried out on a data subset that included antiretroviral regimens commonly used in low to middle income countries, confirmed our findings. Training on subtype B and validation on non-B isolates resulted in a decline of performance for models (i) and (ii). CONCLUSIONS: Treatment history-based RF prediction models are comparable to GRT-based models for classification of virological outcome. These results may be relevant for therapy optimisation in areas where availability of GRT is limited. Further investigations are required in order to account for different demographics, subtypes and different therapy switching strategies.
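The AUC figures quoted in this abstract are averages, with standard deviations, over repeated cross-validation runs. As a minimal sketch of how such a summary can be computed, assuming binary success labels and model scores are available per run, the following uses the pairwise (Mann-Whitney) form of the AUC. The function names `roc_auc` and `mean_and_sd` are illustrative and are not taken from the paper, which does not describe its implementation.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (success, failure) pairs ranked correctly.

    labels: 1 for virological success, 0 for failure (hypothetical encoding).
    scores: predicted probability or score of success from any classifier.
    Tied scores count as half a correctly ranked pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one success and one failure")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_and_sd(values):
    """Mean and sample standard deviation of per-run AUC values."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, variance ** 0.5
```

Computing `roc_auc` on each held-out fold and passing the resulting list to `mean_and_sd` gives a summary in the same "average (st.dev.)" form, so two nested models can be compared on identical folds.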

    Generating Synthetic Clinical Data that Capture Class Imbalanced Distributions with Generative Adversarial Networks: Example using Antiretroviral Therapy for HIV

    Clinical data usually cannot be freely distributed due to their highly confidential nature, and this hampers the development of machine learning in the healthcare domain. One way to mitigate this problem is by generating realistic synthetic datasets using generative adversarial networks (GANs). However, GANs are known to suffer from mode collapse, thus creating outputs of low diversity. This lowers the quality of the synthetic healthcare data, and may cause it to omit patients of minority demographics or neglect less common clinical practices. In this paper, we extend the classic GAN setup with an additional variational autoencoder (VAE) and include an external memory to replay latent features observed from the real samples to the GAN generator. Using antiretroviral therapy for human immunodeficiency virus (ART for HIV) as a case study, we show that our extended setup overcomes mode collapse and generates a synthetic dataset that accurately describes severely imbalanced class distributions commonly found in real-world clinical variables. In addition, we demonstrate that our synthetic dataset is associated with a very low patient disclosure risk, and that it retains a high level of utility from the ground truth dataset to support the development of downstream machine learning algorithms. Comment: In the near future, we will make our codes and synthetic datasets publicly available to facilitate future research. Follow us on https://healthgym.ai

    A Prognostic Model for Estimating the Time to Virologic Failure in HIV-1 Infected Patients Undergoing a New Combination Antiretroviral Therapy Regimen

    BACKGROUND: HIV-1 genotypic susceptibility scores (GSSs) were proven to be significant prognostic factors of fixed time-point virologic outcomes after combination antiretroviral therapy (cART) switch/initiation. However, their relative hazard for the time to virologic failure has not been thoroughly investigated, and an expert system that is able to predict how long a new cART regimen will remain effective has never been designed. METHODS: We analyzed patients of the Italian ARCA cohort starting a new cART from 1999 onwards either after virologic failure or as treatment-naïve. The time to virologic failure was the endpoint, from the 90th day after treatment start, defined as the first HIV-1 RNA > 400 copies/ml, censoring at last available HIV-1 RNA before treatment discontinuation. We assessed the relative hazard/importance of GSSs according to distinct interpretation systems (Rega, ANRS and HIVdb) and other covariates by means of Cox regression and random survival forests (RSF). Prediction models were validated via the bootstrap and c-index measure. RESULTS: The dataset included 2337 regimens from 2182 patients, of which 733 were previously treatment-naïve. We observed 1067 virologic failures over 2820 person-years. Multivariable analysis revealed that low GSSs of cART were independently associated with the hazard of a virologic failure, along with several other covariates. Evaluation of predictive performance yielded a modest ability of the Cox regression to predict the virologic endpoint (c-index≈0.70), while RSF showed a better performance (c-index≈0.73, p < 0.0001 vs. Cox regression). Variable importance according to RSF was concordant with the Cox hazards. CONCLUSIONS: GSSs of cART and several other covariates were investigated using linear and non-linear survival analysis. RSF models are a promising approach for the development of a reliable system that predicts time to virologic failure better than Cox regression. Such models might represent a significant improvement over the current methods for monitoring and optimization of cART.
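The c-index reported in this abstract measures how often a model ranks two regimens in the right order of failure. The following is a minimal sketch of Harrell's concordance index for right-censored data, assuming each regimen has a follow-up time, an event flag (1 for observed virologic failure, 0 for censored) and a predicted risk score. The function name and argument layout are illustrative and are not taken from the paper.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored survival data.

    A pair (i, j) is comparable when regimen i has an observed failure
    and fails strictly earlier than the follow-up time of regimen j.
    The pair is concordant when the model gave i the higher risk.
    Tied risk scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored regimen cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of about 0.70 (Cox) and 0.73 (RSF) should be read.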

    Support vector machine prediction of HIV-1 drug resistance using The Viral Nucleotide patterns

    Student Number: 0213068F - MSc Dissertation - School of Computer Science - Faculty of Science. Drug resistance of the HI virus due to its fast replication and error-prone mutation is a key factor in the failure to combat the HIV epidemic. For this reason, performing pre-therapy drug resistance testing and administering appropriate drugs or combinations of drugs accordingly is very useful. There are two approaches to HIV drug resistance testing: phenotypic (clinical) and genotypic (based on the particular virus's DNA). Genotyping tests HIV drug resistance by detecting specific mutations known to confer drug resistance. It is cheaper and can be computerised. However, it requires being able to know or learn which mutations confer drug resistance. Previous research using pattern recognition techniques has been promising, but the performance needs to be improved. It is also important to have techniques that can quickly learn new rules when faced with new mutations or drugs. A relatively recent addition to these techniques is the Support Vector Machine (SVM). SVMs have proved very successful in many benchmark applications such as face recognition and text recognition, and have also performed well in many computational biology problems where the number of features targeted is large compared to the number of available samples. This paper explores the use of SVMs in predicting the drug resistance of an HIV strain extracted from a patient based on the genetic sequence of those parts of the viral DNA encoding the two enzymes, Reverse Transcriptase and Protease, which are critical for the replication of HIV. In particular, the aim of this research is to design the model without incorporating the biological knowledge at hand, so that the resulting classifier can accommodate new drugs and mutations. To evaluate the performance of SVMs we used cross-validation to obtain an unbiased estimate on 2045 data points. The accuracy of classification and the area under the receiver operating characteristic curve (AUC) were used as performance measures. Furthermore, to compare the performance of our SVM model we also developed other prediction models based on popular classification algorithms, namely neural networks, decision trees and logistic regression. The results show that SVMs are a highly successful classifier and outperform other techniques, with performance ranging from 94.13% to 96.33% accuracy and from 81.26% to 97.49% AUC. Decision trees were rated second and logistic regression performed the worst.

    An investigation of multi-label classification techniques for predicting HIV drug resistance in resource-limited settings.

    M. Sc. University of KwaZulu-Natal, Durban 2014. South Africa has one of the highest HIV infection rates in the world, with more than 5.6 million infected people, and consequently has the largest antiretroviral treatment program, with more than 1.5 million people on treatment. The development of drug resistance is a major factor impeding the efficacy of antiretroviral treatment. While genotype resistance testing (GRT) is the standard method to determine resistance, access to these tests is limited in resource-limited settings. This research investigates the efficacy of multi-label machine learning techniques at predicting HIV drug resistance from routine treatment and laboratory data. Six techniques, namely binary relevance, HOMER, MLkNN, predictive clustering trees (PCT), RAkEL and ensemble of classifier chains (ECC), have been tested and evaluated on data from medical records of patients enrolled in an HIV treatment failure clinic in rural KwaZulu-Natal in South Africa. The performance is measured using five scalar evaluation measures and receiver operating characteristic (ROC) curves. The techniques were found to provide useful predictive information in most cases. The PCT and ECC techniques perform best and have true positive prediction rates of 97% and 98% respectively for specific drugs. The ECC method also achieved an AUC value of 0.83, which is comparable to the current state of the art. All models have been validated using 10-fold cross-validation and show increased performance when additional data are added. In order to make use of these techniques in the field, a tool is presented that may, with small modifications, be integrated into public HIV treatment programs in South Africa and could assist clinicians to identify patients with a high probability of drug resistance.

    The Structural Basis for the Interdependence of Drug Resistance in the HIV-1 Protease

    The human immunodeficiency virus type 1 (HIV-1) protease (PR) is a critical drug target as it is responsible for virion maturation. Mutations within the active site (1°) of the PR directly interfere with inhibitor binding, while mutations distal to the active site (2°) restore enzymatic fitness. Increasing mutation number is not directly proportional to the severity of resistance, suggesting that resistance is not simply additive but interdependent. The interdependency of primary and secondary mutations in driving protease inhibitor (PI) resistance is grossly understudied. To structurally and dynamically characterize the direct role of secondary mutations in drug resistance, I selected a panel of single-site mutant protease crystal structures complexed with the PI darunavir (DRV). From these studies, I developed a network hypothesis that explains how mutations outside the active site are able to propagate changes to the active site of the protease to disrupt inhibitor binding. I then expanded the panel to include highly mutated multi-drug resistant variants. To elucidate the interdependency between primary and secondary mutations, I used statistical and machine-learning techniques to determine which specific mutations underlie the perturbations of key inter-molecular interactions. From these studies, I have determined that mutations distal to the active site are able to perturb the global PR hydrogen bonding patterns, while primary and secondary mutations cooperatively perturb hydrophobic contacts between the PR and DRV. Discerning and exploiting the mechanisms that underlie drug resistance in viral targets could proactively improve both current treatment and inhibitor design for HIV-1 targets.

    Can Linear Regression Modeling Help Clinicians in the Interpretation of Genotypic Resistance Data? An Application to Derive a Lopinavir-Score

    It remains largely unanswered whether statistical analysis of the correlation between genotypic data and virological response can yield a score for a specific antiretroviral (lopinavir/r in this analysis) that predicts viral load response better than existing expert-based interpretation systems (IS). We used the data of the patients from the UK Collaborative HIV Cohort (UK CHIC) Study for whom genotypic data were stored in the UK HIV Drug Resistance Database (UK HDRD) to construct a training/validation dataset of treatment change episodes (TCE). We used the average square error (ASE) on a 10-fold cross-validation and on a test dataset (the EuroSIDA TCE database) to compare the performance of a newly derived lopinavir/r score with that of the 3 most widely used expert-based interpretation rules (ANRS, HIVDB and Rega). Our analysis identified mutations V82A, I54V, K20I and I62V, which were associated with reduced viral response, and mutations I15V and V91S, which determined lopinavir/r hypersensitivity. All models performed equally well (ASE on test ranging between 1.1 and 1.3, p = 0.34). We fully explored the potential of linear regression to construct a simple predictive model for lopinavir/r-based TCE. Although the performance of our proposed score was similar to that of already existing IS, previously unrecognized lopinavir/r-associated mutations were identified. The analysis illustrates an approach to the validation of expert-based IS that could be used in the future for other antiretrovirals and in other settings outside HIV research.
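The comparison in this abstract rests on the average square error (ASE) of held-out predictions under 10-fold cross-validation. The sketch below shows that evaluation loop for a one-predictor least-squares line, a deliberately simplified stand-in for the multi-mutation linear model used in the study. The function names and the interleaved fold assignment are assumptions made for illustration.

```python
def average_square_error(observed, predicted):
    """Mean of squared differences between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx if sxx else 0.0  # constant predictor: fall back to the mean
    return slope, mean_y - slope * mean_x


def cross_validated_ase(x, y, k=10):
    """ASE over all held-out points, refitting the line on each training split.

    Fold f holds out every k-th observation starting at index f.
    """
    if k < 2 or k > len(x):
        raise ValueError("k must be between 2 and the number of observations")
    observed, predicted = [], []
    for fold in range(k):
        train = [i for i in range(len(x)) if i % k != fold]
        test = [i for i in range(len(x)) if i % k == fold]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        observed.extend(y[i] for i in test)
        predicted.extend(intercept + slope * x[i] for i in test)
    return average_square_error(observed, predicted)
```

Because every observation is predicted by a model that never saw it, the resulting ASE can be compared directly with the ASE of a fixed rule set evaluated on the same observations.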

    The use of machine learning to improve the effectiveness of ANRS in predicting HIV drug resistance.

    Master of TeleHealth in Medical Informatics. University of KwaZulu-Natal, Durban, 2016. BACKGROUND: HIV has placed a large burden of disease on developing countries. HIV drug resistance is inevitable due to selective pressure. Computer algorithms have been proven to help in determining optimal treatment for patients with HIV drug resistance. One such algorithm is the ANRS gold standard interpretation algorithm developed by the French National Agency for AIDS Research AC11 Resistance group. OBJECTIVES: The aim of this study is to investigate the possibility of improving the accuracy of the ANRS gold standard in predicting HIV drug resistance. METHODS: Data consisting of genome sequences and an HIV drug resistance measure were obtained from the Stanford HIV database. Machine learning factor analysis was performed to determine sequence positions where mutations lead to drug resistance. Sequence positions not found in ANRS were added to the ANRS rules and accuracy was recalculated. RESULTS: The machine learning algorithm found sequence positions that are not associated with ANRS but that the model suggests are important in the prediction of HIV drug resistance. Preliminary results show that 10 sequence positions not associated with ANRS rules were found for IDV, 4 for LPV, and 8 for NFV. For NFV, ANRS misclassified 74 resistant profiles as being susceptible to the ARV. Sixty-eight of the 74 sequences (92%) were classified as resistant with the inclusion of the eight new sequence positions. No change was found for LPV and a 78% improvement was associated with IDV. CONCLUSION: The study shows that there is a possibility of improving ANRS accuracy.

    Molecular methods for the detection of infectious diseases: bringing diagnostics to the point-of-care

    Human infectious diseases represent a leading cause of morbidity and mortality globally, caused by human-infective pathogens such as bacteria, viruses, parasites or fungi. Point-of-care (POC) diagnostics allow accessible, simple, and rapid identification of the organism causing the infection, which is crucial for successful prognostic outcomes, clinical management, surveillance and isolation. The research conducted in this thesis aims to investigate novel methods for molecular-based diagnostics. This multidisciplinary project is divided into three main sections: (i) molecular methods for enhanced nucleic acid amplification, (ii) POC technologies, and (iii) sample preparation. The application, design and optimisation of loop-mediated isothermal amplification (LAMP) is investigated from a molecular perspective for the diagnostics of emerging infectious pathogens and antimicrobial resistance. LAMP assays were designed to target pathogens responsible for parasitic (malaria), bacterial and viral (COVID-19) infections, as well as antimicrobial resistance. A novel LAMP-based method for the detection of single nucleotide polymorphisms was developed and applied for diagnostics of antimicrobial resistance, emerging variants and genetic disorders. The method was validated for the detection of artemisinin-resistant malaria. Furthermore, this thesis reports the optimisation of LAMP from a biochemical perspective through the evaluation of its core reagents and the incorporation of enhancing agents to improve its specificity and sensitivity. In order to remove cold-chain storage from the diagnostic workflow, the optimised LAMP protocol was designed to be compatible with lyophilisation. Translation of LAMP to the POC demands the development of detection technologies that are compatible with the advantages offered by isothermal amplification. The use of simple, accessible and portable technologies is investigated in this thesis through the development of: (i) a novel colorimetric LAMP detection method for end-point and low-cost detection, and (ii) the combination of LAMP with an electrochemical biosensing platform based on ion-sensitive field effect transistors (ISFETs) fabricated in unmodified complementary metal-oxide semiconductor (CMOS) technology for real-time detection. Lastly, current nucleic acid extraction methods cannot readily be used outside the laboratory. Research into novel methods for low-cost and electricity-free sample preparation was carried out using cellulose matrices. A novel, rapid (under 10 min) and efficient nucleic acid extraction method from dried blood spots was developed. A sample-to-result POC test requires the implementation and integration of molecular biology, cutting-edge technology and data-driven approaches. The work presented in this thesis aims to set new benchmarks for the detection of infectious diseases at the POC by leveraging developments in molecular biology and digital technologies.

    Identifying compounds with a suitable property profile from databases of chemical structures (Sobiva omaduste profiiliga ühendite tuvastamine keemiliste struktuuride andmekogudest)

    The adoption of digital chemical compound libraries creates a need to find, by computational means, the compounds in them that fit a desired property profile. The problem is of particular interest in drug design, where replacing time- and resource-intensive experiments with computational methods would yield significant savings in time and cost. Although the limitations of current computational methods make it impossible in the foreseeable future to move the entire drug development process onto computers, the situation is different for large molecular databases. An in silico method working within a known error margin, even if it discards some suitable compounds and mistakenly counts others as active, can still significantly enrich the data set in attractive compounds. This allows computational methods to be used in the less stringent steps of drug development, such as finding lead compounds or drug candidates. This approach is known as virtual screening, and today it is a vast and rapidly developing research area comprising several paradigms and numerous individual methods. The present thesis takes a closer look at some of them and evaluates their performance in the course of several projects. The results of the thesis include computational models to estimate the HIV protease inhibition activity and cytotoxicity of certain types of compounds; a new virtual screening method; a few prospective ligands for HIV protease and reverse transcriptase; and a data set of compounds pre-filtered with pharmacokinetic filters, a convenient starting point for subsequent projects.