1,246 research outputs found

    The use of machine learning to improve the effectiveness of ANRS in predicting HIV drug resistance.

    Master of TeleHealth in Medical Informatics. University of KwaZulu-Natal, Durban, 2016. BACKGROUND HIV has placed a large burden of disease on developing countries. HIV drug resistance is inevitable due to selective pressure. Computer algorithms have been proven to help in determining optimal treatment for patients with HIV drug resistance. One such algorithm is the ANRS gold standard interpretation algorithm developed by the French National Agency for AIDS Research AC11 Resistance group. OBJECTIVES The aim of this study is to investigate the possibility of improving the accuracy of the ANRS gold standard in predicting HIV drug resistance. METHODS Data consisting of genome sequences and an HIV drug resistance measure were obtained from the Stanford HIV database. Machine learning factor analysis was performed to determine sequence positions where mutations lead to drug resistance. Sequence positions not found in ANRS were added to the ANRS rules and accuracy was recalculated. RESULTS The machine learning algorithm found sequence positions that are not associated with ANRS rules but that the model suggests are important in the prediction of HIV drug resistance. Preliminary results show that 10 sequence positions not associated with ANRS rules were found for IDV, 4 for LPV, and 8 for NFV. For NFV, ANRS misclassified 74 resistant profiles as being susceptible to the ARV. Sixty-eight of the 74 sequences (92%) were classified as resistant with the inclusion of the eight new sequence positions. No change was found for LPV and a 78% improvement was associated with IDV. CONCLUSION The study shows that there is a possibility of improving ANRS accuracy.

    Learning Monotonic Genotype-Phenotype Maps

    Evolutionary escape of pathogens from the selective pressure of immune responses and from medical interventions is driven by the accumulation of mutations. We introduce a statistical model for jointly estimating the dynamics and dependencies among genetic alterations and the associated phenotypic changes. The model integrates conjunctive Bayesian networks, which define a partial order on the occurrences of genetic events, with isotonic regression. The resulting genotype-phenotype map is non-decreasing in the lattice of genotypes. It describes evolutionary escape as a directed process following a phenotypic gradient, such as a monotonic fitness landscape. We present efficient algorithms for parameter estimation and model selection. The model is validated using simulated data and applied to HIV drug resistance data. We find that the effect of many resistance mutations is non-linear and depends on the genetic background in which they occur.
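The isotonic-regression component named in the abstract above can be illustrated with a minimal sketch. It assumes the simplest case only: genotypes arranged in a single totally ordered chain, fitted with the pool-adjacent-violators algorithm. The abstract describes a partial order on a lattice of genotypes, which this sketch does not cover, and the function name `pava` and the example values are hypothetical.

```python
def pava(values, weights=None):
    """Least-squares non-decreasing fit to `values` (pool adjacent violators).

    Assumption: the observations are already sorted along one chain of
    genotypes; a lattice-ordered fit needs a more general algorithm.
    """
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [weighted mean, total weight, item count]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, weight_b, count_b = blocks.pop()
            mean_a, weight_a, count_a = blocks.pop()
            total = weight_a + weight_b
            pooled = (mean_a * weight_a + mean_b * weight_b) / total
            blocks.append([pooled, total, count_a + count_b])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical phenotype measurements along a chain of accumulating mutations.
fit = pava([1.0, 3.0, 2.0, 4.0])
```

The two middle observations violate the ordering, so they are pooled to their mean, which is how a monotone fit keeps the fitted phenotype from decreasing as mutations accumulate.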

    Current Perspectives on Viral Disease Outbreaks

    The COVID-19 pandemic has reminded the world that infectious diseases are still important. The last 40 years have seen the emergence of new or resurging viral diseases such as AIDS, Ebola, MERS, SARS, Zika, and others. These diseases display diverse epidemiologies ranging from sexual transmission to vector-borne transmission (or both, in the case of Zika). This book provides an overview of recent developments in the detection, monitoring, treatment, and control of several viral diseases that have caused recent epidemics or pandemics.

    Application of machine learning, molecular modelling and structural data mining against antiretroviral drug resistance in HIV-1

    Millions are affected by the Human Immunodeficiency Virus (HIV) worldwide, even though the death toll is on the decline. Antiretrovirals (ARVs), more specifically protease inhibitors, have shown tremendous success since their introduction into therapy in the mid-1990s by slowing down progression to the Acquired Immune Deficiency Syndrome (AIDS). However, Drug Resistance Mutations (DRMs) are constantly selected for due to viral adaptation, making drugs less effective over time. The current challenge is to manage the infection optimally with a limited set of drugs, with differing associated levels of toxicities, in the face of a virus that (1) exists as a quasispecies, (2) may transmit acquired DRMs to drug-naive individuals and (3) can manifest class-wide resistance due to similarities in design. The presence of latent reservoirs, unawareness of infection status, education and various socio-economic factors make the problem even more complex. Adequate timing and choice of drug prescription together with treatment adherence are very important, as drug toxicities, drug failure and sub-optimal treatment regimens leave room for further development of drug resistance. While CD4 cell count and the determination of viral load from patients in resource-limited settings are very helpful to track how well a patient’s immune system is able to keep the virus in check, they can be lengthy in determining whether an ARV is effective. Phenosense assay kits answer this problem using viruses engineered to contain the patient sequences and evaluating their growth in the presence of different ARVs, but this can be expensive and too involved for routine checks. As a cheaper and faster alternative, genotypic assays provide similar information from HIV pol sequences obtained from blood samples, inferring ARV efficacy on the basis of drug resistance mutation patterns.
However, these are inherently complex and the various methods of in silico prediction, such as Geno2pheno, REGA and Stanford HIVdb, do not always agree, even though this gap decreases as the list of resistance mutations is updated. A major gap in HIV treatment is that the information used for predicting drug resistance is mainly computed from data containing an overwhelming majority of B subtype HIV, although this subtype accounts for only about 12% of worldwide HIV infections. In addition to growing evidence that drug resistance is subtype-related, it is intuitive to hypothesize that, as subtyping is a phylogenetic classification, the more divergent a subtype is from the strains used in training prediction models, the less their resistance profiles would correlate. For the aforementioned reasons, we used a multi-faceted approach to attack the virus in multiple ways. This research aimed to (1) improve resistance prediction methods by focusing solely on the available subtype, (2) mine structural information pertaining to resistance in order to find any exploitable weak points and increase knowledge of the mechanistic processes of drug resistance in HIV protease, and (3) screen for protease inhibitors amongst a database of natural compounds [the South African natural compound database (SANCDB)] to find molecules or molecular properties usable to come up with improved inhibition against the drug target. In this work, structural information was mined using the Anisotropic Network Model, Dynamics Cross-Correlation, Perturbation Response Scanning, residue contact network analysis and the radius of gyration. These methods failed to give any resistance-associated patterns in terms of natural movement, internal correlated motions, residue perturbation response, relational behaviour and global compaction respectively.
Applications of drug docking, homology modelling and energy minimization for generating features suitable for machine learning were not very promising, and rather suggest that the binding energies from Vina may not, by themselves, be very reliable quantitatively. All these failures led to a refinement that resulted in a highly sensitive, statistically guided network construction and analysis, which led to key findings on the early dynamics associated with resistance across all PI drugs. The latter experiment unravelled a conserved lateral expansion motion occurring at the flap elbows, and an associated contraction that drives the base of the dimerization domain towards the catalytic site’s floor in the case of drug resistance. Interestingly, we found that despite the conserved movement, bond angles were degenerate. In parallel, 16 Artificial Neural Network models were optimised for HIV protease and reverse transcriptase inhibitors, with performances on par with Stanford HIVdb. Finally, we prioritised 9 compounds with potential protease inhibitory activity using virtual screening and molecular dynamics (MD), and additionally suggest a promising modification to one of the compounds. This yielded another molecule that inhibits both open and closed receptor target conformations equally well, whereby each of the compounds had been selected against an array of multi-drug-resistant receptor variants. While a main hurdle was a lack of non-B subtype data, our findings, especially from the statistically guided network analysis, may extrapolate to a certain extent to other subtypes, as the level of conservation was very high within subtype B despite all the present variations. This network construction method lays down a sensitive approach for analysing a pair of alternate phenotypes for which complex patterns prevail, given a sufficient number of experimental units.
During the course of research, a weighted contact mapping tool was developed to compare renin-angiotensinogen variants and packaged as part of the MD-TASK tool suite. Finally, the functionality, compatibility and performance of the MODE-TASK tool were evaluated and confirmed for both Python 2.7.x and Python 3.x, for the analysis of normal modes from single protein structures and essential modes from MD trajectories. These techniques and tools collectively add to the conventional means of MD analysis.

    The epidemiology and impact of pretreatment HIV drug resistance in adults in South Africa.

    Doctoral Degrees. University of KwaZulu-Natal, Durban. HIV drug resistance (HIVDR) present prior to initiating or re-initiating antiretroviral therapy (ART) is known as pretreatment drug resistance (PDR). Conventionally, PDR is detected by Sanger sequencing. Drug resistant minority variants (DRMVs) that are not reliably detected by Sanger sequencing can be detected by next generation sequencing (NGS). The aims of this research were to assess levels of PDR in HIV hyper-endemic areas (with high HIV incidence and prevalence) in KwaZulu-Natal (KZN) province, trends of PDR in South Africa, and the impact of DRMVs on ART. To assess PDR in adults from KZN hyper-endemic areas, 1845 sequences were analyzed from two population-based HIV surveillance studies: a longitudinal HIV surveillance programme in northern KZN (2013-2014), and the HIV Incidence Provincial Surveillance System (HIPSS) in central KZN (2014-2015). Overall, 182/1845 (10.0%) had NNRTI-PDR mutations, and when analyzed by study year, NNRTI-PDR was 10.2% (CI:7.5-12.9) for the HIPSS study in 2014. To assess PDR trends in South Africa, 6880 HIV-1 sequences were collated from 38 datasets of ART-naïve adults (2000-2016). Increasing levels of PDR were observed, most marked from 2010. Crude pooled prevalence of NNRTI-PDR reached 10% in 2014, with a 1.18-fold (CI:1.13-1.23) annual increase (p<0.001), consistent with findings from the HIPSS data. This provided the first evidence of high-level NNRTI-PDR in KZN and South Africa, supporting the transition to dolutegravir in standard first-line ART, as recommended by the World Health Organization when NNRTI-PDR reaches ≥10%. A case-control (2:1) study in HIV/TB co-infected adult patients was done to assess the impact of DRMVs at different thresholds. Cases were patients who initiated ART and had viral loads ≥1000 copies/mL after ≥6 months on ART, and controls were those who initiated ART and achieved virologic suppression through 24 months.
Pre-ART NNRTI-resistance was associated with ART failure. NGS improved detection of HIVDR at lower thresholds, but reduced the specificity of identifying patients at risk of virologic failure, with specificity falling from 97% (CI:92-99) at a 20% threshold to 79% (CI:71-86) at a 2% threshold. In all, the findings presented in this thesis provide a broad message about the need to improve quality in HIV prevention and treatment services.

    Measuring confidence of missing data estimation for HIV classification

    Computational intelligence methods have been applied to classify pregnant women’s HIV status using demographic data from the South African Antenatal Seroprevalence database obtained from the South African Department of Health. Classification accuracies using a multitude of computational intelligence techniques ranged between 60% and 70%. The purpose of this research is to determine the certainty of predicting the HIV status of a patient. Ensemble neural networks were used for the investigation to obtain a set of possible solutions. The predictive certainty of each patient’s predicted HIV status was computed as the percentage of the most dominant output among the set of possible solutions. Ensembles of neural networks were obtained using boosting, bagging and the Bayesian approach. It was found that the ensemble trained using the Bayesian approach is most suitable for the proposed predictive certainty measure. Furthermore, a sensitivity analysis was done to investigate how each of the demographic variables influenced the certainty of predicting the HIV status of a patient.
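The certainty measure described in the abstract above, the share of ensemble members that agree with the most common output, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each ensemble member emits a discrete class label, and the function name `predictive_certainty` and the example votes are hypothetical.

```python
from collections import Counter


def predictive_certainty(member_outputs):
    """Return (dominant label, percentage of members that output it).

    Assumption: `member_outputs` is a non-empty list with one discrete
    class label per ensemble member for a single patient.
    """
    if not member_outputs:
        raise ValueError("at least one ensemble output is required")
    counts = Counter(member_outputs)
    label, votes = counts.most_common(1)[0]
    return label, 100.0 * votes / len(member_outputs)


# Hypothetical outputs of five networks for one patient (1 = positive).
label, certainty = predictive_certainty([1, 1, 0, 1, 1])
```

With four of five hypothetical networks agreeing, the dominant label is 1 and the certainty is 80%. A 50/50 split would give the lowest certainty a two-class problem allows.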

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research, in the area aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.

    Generating Synthetic Clinical Data that Capture Class Imbalanced Distributions with Generative Adversarial Networks: Example using Antiretroviral Therapy for HIV

    Clinical data usually cannot be freely distributed due to their highly confidential nature, and this hampers the development of machine learning in the healthcare domain. One way to mitigate this problem is by generating realistic synthetic datasets using generative adversarial networks (GANs). However, GANs are known to suffer from mode collapse, thus creating outputs of low diversity. This lowers the quality of the synthetic healthcare data, and may cause it to omit patients of minority demographics or neglect less common clinical practices. In this paper, we extend the classic GAN setup with an additional variational autoencoder (VAE) and include an external memory to replay latent features observed from the real samples to the GAN generator. Using antiretroviral therapy for human immunodeficiency virus (ART for HIV) as a case study, we show that our extended setup overcomes mode collapse and generates a synthetic dataset that accurately describes severely imbalanced class distributions commonly found in real-world clinical variables. In addition, we demonstrate that our synthetic dataset is associated with a very low patient disclosure risk, and that it retains a high level of utility from the ground truth dataset to support the development of downstream machine learning algorithms. Comment: In the near future, we will make our code and synthetic datasets publicly available to facilitate future research. Follow us on https://healthgym.ai