704 research outputs found

    Framework for Prediction of Response to HCV Therapy Using Different Data Mining Techniques

    Hepatitis C, a disease widespread throughout the world, is a fatal liver disease caused by the Hepatitis C Virus (HCV). The only approved therapy is interferon plus ribavirin. The number of responders to this treatment is low, while its cost is high and its side effects are undesirable. Predicting treatment response will help reduce the number of patients who suffer the side effects and high costs without achieving recovery. The aim of this research is to develop a framework that can select the best model for predicting HCV patients' response to treatment from clinical information. The framework contains three phases: a preprocessing phase that prepares the data for Data Mining (DM) techniques, a DM phase that applies different DM techniques, and an evaluation phase that evaluates and compares the performance of the built models and selects the best model as the recommended one. To evaluate the framework, three DM techniques were applied: associative classification, artificial neural network, and decision tree. The experimental results showed the effectiveness of the framework in selecting the best model, which is the model built by associative classification using histology activity index, fibrosis stage, and alanine aminotransferase.
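The three-phase framework described in this abstract (preprocess, apply several techniques, evaluate and pick the best) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the rule of dropping records with missing values, the ordered train/test split, and the two toy trainers are all assumptions standing in for the associative classification, neural network, and decision tree models named above.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Model = Callable[[Sequence[float]], int]
Trainer = Callable[[List[Sequence[float]], List[int]], Model]


def preprocess(rows: Sequence[Sequence[Optional[float]]], labels: Sequence[int]):
    """Phase 1 (assumed rule): drop records that contain a missing value."""
    kept = [(r, y) for r, y in zip(rows, labels) if all(v is not None for v in r)]
    return [r for r, _ in kept], [y for _, y in kept]


def accuracy(model: Model, rows: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of records whose predicted label matches the true label."""
    hits = sum(1 for r, y in zip(rows, labels) if model(r) == y)
    return hits / len(labels)


def run_framework(trainers: Dict[str, Trainer], rows, labels,
                  train_fraction: float = 0.5) -> Tuple[str, Dict[str, float]]:
    """Run all three phases and return the best model name plus all scores."""
    rows, labels = preprocess(rows, labels)
    cut = int(len(rows) * train_fraction)  # simple ordered split (assumption)
    train_x, train_y = rows[:cut], labels[:cut]
    test_x, test_y = rows[cut:], labels[cut:]
    scores: Dict[str, float] = {}
    for name, train in trainers.items():
        model = train(train_x, train_y)                 # Phase 2: build a model
        scores[name] = accuracy(model, test_x, test_y)  # Phase 3: evaluate it
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical stand-in trainers, used only to exercise the pipeline.
def majority_trainer(rows, labels) -> Model:
    majority = max(set(labels), key=labels.count)
    return lambda row: majority


def threshold_trainer(rows, labels) -> Model:
    threshold = sum(r[0] for r in rows) / len(rows)  # mean of the first feature
    return lambda row: int(row[0] > threshold)
```

Any callable that takes training rows and labels and returns a predictor can be registered in `trainers`, so real classifiers could replace the toy ones without changing `run_framework`. Accuracy is used here only for simplicity; the abstract does not say which performance measure the framework compares.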

    Novel hybridized computational paradigms integrated with five stand-alone algorithms for clinical prediction of HCV status among patients: A data-driven technique

    The emergence of health informatics opens new opportunities for the diagnosis of different diseases. The current work proposes the implementation of five different stand-alone techniques coupled with four different novel hybridized paradigms for the clinical prediction of hepatitis C status among patients, using both sociodemographic and clinical input variables. Both the visualized and quantitative performances of the stand-alone algorithms demonstrate the advantage of the Gaussian process regression (GPR), Generalized neural network (GRNN), and Interactive linear regression (ILR) models over the Support Vector Regression (SVR) and Adaptive neuro-fuzzy inference system (ANFIS) models. Because of the lower performance of the stand-alone algorithms at a certain point, four different novel hybrid data-intelligent algorithms were proposed to boost the prediction accuracy of the stand-alone techniques in the clinical prediction of hepatitis C among patients: interactive linear regression-Gaussian process regression (ILR-GPR), interactive linear regression-generalized neural network (ILR-GRNN), interactive linear regression-Support Vector Regression (ILR-SVR), and interactive linear regression-adaptive neuro-fuzzy inference system (ILR-ANFIS). Based on the quantitative prediction skills presented by the novel hybridized paradigms, the proposed techniques were able to enhance the performance efficiency of the single paradigms by up to 44% and 45% in the calibration and validation phases, respectively.
    Operational Research Centre in Healthcare, Near East University, North Cyprus, Mersin-10, Turkiy

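    Agrupamiento, predicción y clasificación ordinal para series temporales utilizando técnicas de machine learning: aplicaciones [Clustering, prediction and ordinal classification for time series using machine learning techniques: applications]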

    In recent years, there has been an increase in the number of fields improving their standard processes by using machine learning (ML) techniques. The main reason for this is that the vast amount of data generated by these processes is difficult for humans to process. Therefore, the development of automatic methods to process and extract relevant information from these data is of great necessity, given that these approaches could lead to an increase in the economic benefit of enterprises or to a reduction in the workload of some existing jobs. Concretely, in this Thesis, ML approaches are applied to problems concerning time series data. A time series is a special kind of data in which data points are collected chronologically. Time series are present in a wide variety of fields, such as atmospheric events or engineering applications. Besides, according to the main objective to be satisfied, there are different tasks in the literature applied to time series. Some of them are those on which this Thesis is mainly focused: clustering, classification, prediction and, in general, analysis. Generally, the amount of data to be processed is huge, raising the need for methods able to reduce the dimensionality of time series without decreasing the amount of information. In this sense, the application of time series segmentation procedures dividing the time series into different subsequences is a good option, given that each segment defines a specific behaviour. Once the different segments are obtained, the use of statistical features to characterise them is an excellent way to maximise the information of the time series while considerably reducing their dimensionality. In the case of time series clustering, the objective is to find groups of similar time series with the idea of discovering interesting patterns in time series datasets. In this Thesis, we have developed a novel time series clustering technique. 
The aim of this proposal is twofold: to reduce the dimensionality as much as possible and to develop a time series clustering approach able to outperform current state-of-the-art techniques. In this sense, for the first objective, the time series are segmented in order to divide them, identifying different behaviours. Then, these segments are projected into a vector of statistical features aiming to reduce the dimensionality of the time series. Once this preprocessing step is done, the clustering of the time series is carried out, with a significantly lower computational load. This novel approach has been tested on all the time series datasets available in the University of East Anglia and University of California Riverside (UEA/UCR) time series classification (TSC) repository. Regarding time series classification, two main paths could be differentiated: firstly, nominal TSC, which is a well-known field involving a wide variety of proposals and transformations applied to time series. Concretely, one of the most popular transformations is the shapelet transform (ST), which has been widely used in this field. The original method extracts shapelets from the original time series and uses them for classification purposes. Nevertheless, the full enumeration of all possible shapelets is very time consuming. Therefore, in this Thesis, we have developed a hybrid method that starts with the best shapelets extracted by using the original approach with a time constraint and then tunes these shapelets by using a convolutional neural network (CNN) model. Secondly, time series ordinal classification (TSOC) is an unexplored field beginning with this Thesis. In this way, we have adapted the original ST to the ordinal classification (OC) paradigm by proposing several shapelet quality measures taking advantage of the ordinal information of the time series. This methodology leads to better results than the state-of-the-art TSC techniques for those ordinal time series datasets. 
All these proposals have been tested on all the time series datasets available in the UEA/UCR TSC repository. With respect to time series prediction, it is based on estimating the next value or values of the time series by considering the previous ones. In this Thesis, several different approaches have been considered depending on the problem to be solved. Firstly, the prediction of low-visibility events produced by fog conditions is carried out by means of hybrid autoregressive models (ARs) combining fixed-size and dynamic windows, adapting to the dynamics of the time series. Secondly, the prediction of convective cloud formation (which is a highly imbalanced problem given that the number of convective cloud events is much lower than that of non-convective situations) is performed in two completely different ways: 1) tackling the problem as a multi-objective classification task by the use of multi-objective evolutionary artificial neural networks (MOEANNs), in which the two conflicting objectives are the accuracy of the minority class and the global accuracy, and 2) tackling the problem from the OC point of view, in which, in order to reduce the imbalance degree, an oversampling approach is proposed along with the use of OC techniques. Thirdly, the prediction of solar radiation is carried out by means of evolutionary artificial neural networks (EANNs) with different combinations of basis functions in the hidden and output layers. Finally, the last challenging problem is the prediction of energy flux from waves and tides. For this, a multitask EANN has been proposed aiming to predict the energy flux at several prediction time horizons (from 6h to 48h). All these proposals and techniques have been corroborated and discussed according to physical and atmospheric models. The work developed in this Thesis is supported by 11 JCR-indexed papers in international journals (7 Q1, 3 Q2, 1 Q3), 11 papers in international conferences, and 4 papers in national conferences.
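The preprocessing idea in this abstract, segmenting each series and then describing every segment by a few statistical features, can be illustrated with a minimal sketch. The abstract does not say how segments are chosen or which statistics are used, so the fixed-length windows and the choice of mean and population variance below are assumptions made purely for illustration.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def segment(series: Sequence[float], length: int) -> List[Sequence[float]]:
    """Split a series into consecutive windows of at most `length` points.

    Fixed-length windows are an assumption; the thesis may use a
    data-driven segmentation instead.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [series[i:i + length] for i in range(0, len(series), length)]


def segment_features(series: Sequence[float], length: int) -> List[float]:
    """Project each segment onto (mean, population variance) and concatenate."""
    features: List[float] = []
    for seg in segment(series, length):
        features.extend([fmean(seg), pvariance(seg)])
    return features
```

With two features per segment, the representation is shorter than the raw series whenever segments contain more than two points; the resulting vectors could then be passed to any standard clustering routine.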

    Development of a Hepatitis C Virus knowledgebase with computational prediction of functional hypothesis of therapeutic relevance

    Philosophiae Doctor - PhD. Ameliorating Hepatitis C Virus (HCV) therapeutic and diagnostic challenges requires robust intervention strategies, including approaches that leverage the plethora of rich data published in biomedical literature to gain greater understanding of HCV pathobiological mechanisms. The multitudes of metadata originating from HCV clinical trials as well as low and high-throughput experiments embedded in text corpora can be mined as data sources for the implementation of HCV-specific resources. HCV-customized resources may support the generation of worthy and testable hypotheses and reveal potential research clues to augment the pursuit of efficient diagnostic biomarkers and therapeutic targets. This research thesis reports the development of two freely available HCV-specific web-based resources: (i) Dragon Exploratory System on Hepatitis C Virus (DESHCV) accessible via http://apps.sanbi.ac.za/DESHCV/ or http://cbrc.kaust.edu.sa/deshcv/ and (ii) Hepatitis C Virus Protein Interaction Database (HCVpro) accessible via http://apps.sanbi.ac.za/hcvpro/ or http://cbrc.kaust.edu.sa/hcvpro/. DESHCV is a text mining system implemented using named concept recognition and co-occurrence-based approaches to computationally analyze about 32,000 HCV-related abstracts obtained from PubMed. As part of DESHCV development, the pre-constructed dictionaries of the Dragon Exploratory System (DES) were enriched with HCV biomedical concepts, including HCV proteins, name variants and symbols to enable HCV knowledge-specific exploration. The DESHCV query inputs consist of user-defined keywords, phrases and concepts. DESHCV is therefore an information extraction tool that enables users to computationally generate associations between concepts and support the prediction of potential hypotheses with diagnostic and therapeutic relevance. 
Additionally, users can retrieve a list of abstracts containing tagged concepts that can be used to overcome the herculean task of manual biocuration. DESHCV has been used to simulate a previously reported thalidomide-chronic hepatitis C hypothesis and also to model a potentially novel thalidomide-amantadine hypothesis. HCVpro is a relational knowledgebase dedicated to housing experimentally detected HCV-HCV and HCV-human protein interaction information obtained from other databases and curated from biomedical journal articles. Additionally, the database contains consolidated biological information consisting of hepatocellular carcinoma (HCC) related genes, comprehensive reviews on HCV biology and drug development, functional genomics and molecular biology data, and cross-referenced links to canonical pathways and other essential biomedical databases. Users can retrieve enriched information including interaction metadata from HCVpro by using protein identifiers, gene chromosomal locations, experiment types used in detecting the interactions, PubMed IDs of journal articles reporting the interactions, annotated protein interaction IDs from external databases, and via “string searches”. The utility of HCVpro has been demonstrated by harnessing integrated data to suggest putative baseline clues that seem to support current diagnostic exploratory efforts directed towards vimentin. Furthermore, eight genes comprising ACLY, AZGP1, DDX3X, FGG, H19, SIAH1, SERPING1 and THBS1 have been recommended for possible investigation to evaluate their diagnostic potential. The data archived in HCVpro can be utilized to support protein-protein interaction network-based candidate HCC gene prioritization for possible validation by experimental biologists.
    South Afric

    The Intersection Of Chronic Hepatitis C Infection And Cardiovascular Disease

    Hepatitis C virus (HCV) infection is highly prevalent in the US. Though its primary sequelae are liver-related, extrahepatic manifestations contribute to the overall morbidity and mortality of infection. Disorders of lipid metabolism, chronic inflammation and immune dysregulation resulting from chronic infection provide a milieu for extrahepatic manifestations. We sought to examine the role of modification of lipid metabolism on HCV viral load, and to determine the relative contribution of chronic HCV infection to cardiovascular disease. We first examined the effect of 3-hydroxy-3-methylglutaryl-CoA reductase inhibitors (statins) on HCV viral load. We found that, on average, treatment with at least 30 days of a statin was associated with a lower HCV viral load than that observed in those unexposed to statins. Additionally, while long-term follow-up was not available, statin therapy was not associated with an increased incidence of liver injury. In the second study, we analyzed data from The Health Improvement Network (THIN) to determine if chronic HCV infection was independently associated with incident myocardial infarction (MI). We found no association between chronic HCV infection and incident MI after adjustment for demographics, comorbidities, medication exposures, body mass index (BMI), tobacco use and family history of MI. Additionally, use of a composite cardiovascular endpoint, characterization of medication exposures as time-varying, and accounting for receipt of HCV therapy did not change our findings. In conducting the second study, missing data for BMI were imputed using multiple imputation models. For the third study, we examined whether different variable selection approaches for specification of multiple imputation models result in more or less accurate prediction of investigator-simulated missingness for BMI in THIN. 
Variable selection procedures utilized to predict missingness included insertion of investigator-chosen variables, use of a high-dimensional approach including all administrative data if a statistical threshold was met, and feature selection driven by machine learning algorithms. We found that the high-dimensional and machine learning approaches, while able to incorporate all data elements, resulted in small improvements in bias but were computationally onerous. The small gains in accuracy achieved with the new methods need to be weighed against the costs of implementation.

    Hepatitis C virus molecular evolution: Transmission, disease progression and antiviral therapy

    Hepatitis C virus (HCV) infection represents an important public health problem worldwide. Reduction of HCV morbidity and mortality is a current challenge owing to several viral and host factors. Virus molecular evolution plays an important role in HCV transmission, disease progression and therapy outcome. The high degree of genetic heterogeneity characteristic of HCV is a key element for the rapid adaptation of the intrahost viral population to different selection pressures (e.g., host immune responses and antiviral therapy). HCV molecular evolution is shaped by different mechanisms including a high mutation rate, genetic bottlenecks, genetic drift, recombination, temporal variations and compartmentalization. These evolutionary processes constantly rearrange the composition of the HCV intrahost population in a staging manner. Remarkable advances in the understanding of the molecular mechanism controlling HCV replication have facilitated the development of a plethora of direct-acting antiviral agents against HCV. As a result, superior sustained viral responses have been attained. The rapidly evolving field of anti-HCV therapy is expected to broaden its landscape even further with newer, more potent antivirals, bringing us one step closer to the interferon-free era.
    Fil: Preciado, María Victoria. Gobierno de la Ciudad de Buenos Aires. Hospital General de Niños; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Fil: Valva, Pamela. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Gobierno de la Ciudad de Buenos Aires. Hospital General de Niños; Argentina
    Fil: Escobar Gutierrez, Alejandro. Instituto de Diagnóstico y Referencia Epidemiológicos; México
    Fil: Rahal, Paula. Universidade Estadual Paulista Julio de Mesquita Filho; Brasil
    Fil: Ruiz Tovar, Karina. Instituto de Diagnóstico y Referencia Epidemiológicos; México
    Fil: Yamasaki, Lilian. Universidade Estadual Paulista Julio de Mesquita Filho; Brasil
    Fil: Vazquez Chacon, Carlos. Instituto de Diagnóstico y Referencia Epidemiológicos; México
    Fil: Martinez Guarneros, Armando. Instituto de Diagnóstico y Referencia Epidemiológicos; México
    Fil: Carpio Pedroza, Juan Carlos. Instituto de Diagnóstico y Referencia Epidemiológicos; México
    Fil: Fonseca Coronado, Salvador. Universidad Nacional Autónoma de México; México
    Fil: Cruz Rivera, Mayra. Universidad Nacional Autónoma de México; Méxic

    Systematic evaluation of immune regulation and modulation

    Cancer immunotherapies are showing promising clinical results in a variety of malignancies. Monitoring the immune as well as the tumor response following these therapies has led to significant advancements in the field. Moreover, the identification and assessment of both predictive and prognostic biomarkers has become a key component to advancing these therapies. Thus, it is critical to develop systematic approaches to monitor the immune response and to interpret the data obtained from these assays. In order to address these issues and make recommendations to the field, the Society for Immunotherapy of Cancer reconvened the Immune Biomarkers Task Force. As a part of this Task Force, Working Group 3 (WG3) consisting of multidisciplinary experts from industry, academia, and government focused on the systematic assessment of immune regulation and modulation. In this review, the tumor microenvironment, microbiome, bone marrow, and adoptively transferred T cells will be used as examples to discuss the type and timing of sample collection. In addition, potential types of measurements, assays, and analyses will be discussed for each sample. Specifically, these recommendations will focus on the unique collection and assay requirements for the analysis of various samples as well as the high-throughput assays to evaluate potential biomarkers.

    Computational approaches for improving treatment and prevention of viral infections

    The treatment of infections with HIV or HCV is challenging. Thus, novel drugs and new computational approaches that support the selection of therapies are required. This work presents methods that support therapy selection as well as methods that advance novel antiviral treatments. geno2pheno[ngs-freq] identifies drug resistance from HIV-1 or HCV samples that were subjected to next-generation sequencing by interpreting their sequences either via support vector machines or a rules-based approach. geno2pheno[coreceptor-hiv2] determines the coreceptor that is used for viral cell entry by analyzing a segment of the HIV-2 surface protein with a support vector machine. openPrimeR is capable of finding optimal combinations of primers for multiplex polymerase chain reaction by solving a set cover problem and accessing a new logistic regression model for determining amplification events arising from polymerase chain reaction. geno2pheno[ngs-freq] and geno2pheno[coreceptor-hiv2] enable the personalization of antiviral treatments and support clinical decision making. The application of openPrimeR on human immunoglobulin sequences has resulted in novel primer sets that improve the isolation of broadly neutralizing antibodies against HIV-1. The methods that were developed in this work thus constitute important contributions towards improving the prevention and treatment of viral infectious diseases.
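The set cover formulation mentioned for primer selection can be illustrated with a small sketch: each candidate primer "covers" the templates it is predicted to amplify, and the goal is a small set of primers that together cover every template. The code below uses the classic greedy approximation, which is an assumption chosen for brevity and is not claimed to be the algorithm openPrimeR uses; the primer and template names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def greedy_set_cover(universe: Iterable[str],
                     candidates: Dict[str, Set[str]]) -> List[str]:
    """Pick candidates greedily until every element of `universe` is covered.

    At each step the candidate covering the most still-uncovered elements is
    chosen. The result is a valid cover but not necessarily a minimum one.
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        best = max(candidates,
                   key=lambda name: len(candidates[name] & uncovered),
                   default=None)
        if best is None or not candidates[best] & uncovered:
            raise ValueError("some elements cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen


# Hypothetical coverage table: primer name -> templates it would amplify.
example_coverage = {
    "primer_a": {"t1", "t2", "t3"},
    "primer_b": {"t3", "t4"},
    "primer_c": {"t1"},
}
```

In a real setting the coverage table would come from an amplification model such as the logistic regression model described in the abstract; here it is simply written out by hand.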