19 research outputs found

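    Modelos multi-escala de inteligencia artificial para diseño quimio-informático y fármaco-epidemiológico de terapias anti-VIH en Condados de Estados Unidos [Multi-scale artificial intelligence models for the chemoinformatic and pharmaco-epidemiological design of anti-HIV therapies in United States counties]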

    [Abstract] Methods that relate chemical structure to biological activity are called "quantitative structure-activity relationships" (QSAR). Understanding and quantifying the relationship between the structure and the biological activity of potential drugs is essential for studying them efficiently. This kind of study uses molecular descriptors to correlate chemical or physicochemical properties of the molecules in question with biological activity values. Currently, the development of safer and more effective drugs for the treatment of diseases such as AIDS is a goal that requires the joint effort of a large number of specialists from different fields of science, and one in which chance has played a major role. However, it seems reasonable to think that effective and safe drugs will never be obtained by relying on chance alone. To be more efficient in developing new drugs, research on the treatment of diseases requires mechanisms for predicting some biological activities. Models based on artificial neural networks (ANNs) are an example of theoretical prediction models that are widely used in many areas of science, such as medicine, chemistry, and biochemistry, as well as in drug development, where they are very useful for predicting properties of potential drugs. ANNs approximate the way the human brain operates and can successfully handle natural, real-world data, information, and knowledge affected by the so-called "curse of the fourfold I": data that are uncertain, inconsistent, incomplete, and imprecise (in the Spanish original, "inciertos, inconsistentes, incompletos e imprecisos"). This makes such data difficult to manage properly with conventional computational techniques and makes artificial intelligence (AI) techniques, such as ANNs, necessary. The main advantage of these intelligent prediction models is that they avoid the unnecessary costs of developing new compounds with therapeutic potential that turn out to be inactive. Therefore, the main objective of this thesis is to develop, using artificial intelligence techniques, a multi-scale chemoinformatics methodology that quantitatively relates chemical and preclinical data to epidemiological data in order to make "pharmaco-epidemiological" predictions, given the practical and legal impossibility of obtaining experimental data in Phase IV of the development of new compounds.

    Quantitative approach for the risk assessment of African swine fever and Classical swine fever introduction into the United States through legal imports of pigs and swine products.

    US livestock safety depends strongly on the country's capacity to prevent the introduction of transboundary animal diseases (TADs). Accurate and up-to-date information on the location and origin of potential TAD risks is therefore essential, so that preventive measures such as market restrictions can be put in place. The objective of the present study was to evaluate the current risk of African swine fever (ASF) and classical swine fever (CSF) introduction into the US through legal imports of live pigs and swine products, using a quantitative approach that could later be applied to other risks. Four quantitative stochastic risk assessment models were developed to estimate the monthly probabilities of ASF and CSF release into the US and the exposure of susceptible populations (domestic and feral swine) to these introductions at state level. The results suggest a low annual probability of either ASF or CSF introduction into the US by any of the analyzed pathways (5.5 × 10^-3). The probability of introduction through legal imports of live pigs (1.8 × 10^-3 for ASF and 2.5 × 10^-3 for CSF) was higher than that through legally imported swine products (8.90 × 10^-4 for ASF and 1.56 × 10^-3 for CSF), which could be due to the low probability of exposure associated with products. The risk of feral pigs accessing swine products discarded in landfills was slightly higher than the potential exposure of domestic pigs through swill feeding. The identification of the months at highest risk, the origin of the higher-risk imports, and the US states most vulnerable to those introductions (Iowa, Minnesota, and Wisconsin for live swine; California, Florida, and Texas for swine products) is valuable information for designing prevention, risk-mitigation, and early-detection strategies that would help to minimize the catastrophic consequences of potential ASF/CSF introductions into the US.
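To illustrate how monthly release probabilities relate to an annual figure, the sketch below combines twelve monthly probabilities under the assumption that the months are independent. The function name `annual_probability` and the monthly values are hypothetical placeholders, not the study's inputs or results, and the study's stochastic models may aggregate differently.

```python
def annual_probability(monthly_probs):
    """Probability of at least one introduction during the year,
    assuming (for illustration) that monthly events are independent."""
    no_event = 1.0
    for p in monthly_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        # Probability that no introduction has happened so far.
        no_event *= 1.0 - p
    return 1.0 - no_event


# Hypothetical monthly probabilities (illustrative values only).
monthly = [1.5e-4] * 12
print(f"annual probability: {annual_probability(monthly):.3e}")
```

For small monthly probabilities the annual value is close to their sum, which is why month-level estimates are useful for spotting the periods that contribute most to the yearly risk.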

    Mapping networks of anti-HIV drug cocktails vs. AIDS epidemiology in the US counties

    [Abstract] The implementation of highly active antiretroviral therapy (HAART) and the combination of anti-HIV drugs have resulted in longer survival and a better quality of life for people infected with the virus. In this work, a method is proposed to map complex networks of AIDS prevalence in US counties, incorporating information about the chemical structure, molecular target, organism, and results in preclinical assay protocols for all drugs in the cocktail. Different machine learning methods were trained and validated to select the best model. The Shannon information invariants of molecular graphs for drugs and of social networks of income inequality were used as input. The nodes in molecular graphs represent atoms weighted by Pauling electronegativity values, and the links correspond to the chemical bonds. The nodes in the social network represent US counties and have Gini coefficients as weights. We obtained the data about anti-HIV drugs from the ChEMBL database and the data about AIDS prevalence and Gini coefficients from the AIDSVu database of Emory University. Box–Jenkins operators were used to measure the shift, with respect to average behavior, of drugs from reference compounds assayed with or in a given protocol, target, or organism. To train and validate the model and predict the complex network, we needed to analyze 152,628 data points, including values of AIDS prevalence in 2310 US counties vs. ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4856 protocols, and 10 possible experimental measures. The best model found was a linear discriminant analysis (LDA) with accuracy, specificity, and sensitivity above 0.80 in training and external validation series. Ministerio de Educación, Cultura y Deportes; AGL2011-30563-C03-0
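The abstract describes molecular graphs whose nodes are atoms weighted by Pauling electronegativity and whose links are bonds. A minimal sketch of one way to turn such a graph into a Shannon entropy follows. It assumes a Markov-chain formulation (electronegativity-weighted transition probabilities, self-loops on every atom, k propagation steps); that formulation, the helper names, and the ethanol example are assumptions made for illustration and are not taken from the abstract.

```python
import math

# Pauling electronegativities for a few common elements.
ELECTRONEGATIVITY = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44}


def transition_matrix(atoms, bonds):
    """Row-stochastic matrix: moving from atom i to a bonded atom j
    (or staying on i) is weighted by the electronegativity of j."""
    n = len(atoms)
    weights = [ELECTRONEGATIVITY[a] for a in atoms]
    adjacency = [[0.0] * n for _ in range(n)]
    for i, j in bonds:
        adjacency[i][j] = adjacency[j][i] = 1.0
    for i in range(n):
        adjacency[i][i] = 1.0  # assumption: each atom has a self-loop
    matrix = []
    for i in range(n):
        row = [adjacency[i][j] * weights[j] for j in range(n)]
        total = sum(row)
        matrix.append([x / total for x in row])
    return matrix


def shannon_entropy(atoms, bonds, k=1):
    """Shannon entropy of the atom probability vector after k steps."""
    n = len(atoms)
    weights = [ELECTRONEGATIVITY[a] for a in atoms]
    total = sum(weights)
    probs = [w / total for w in weights]  # initial distribution
    matrix = transition_matrix(atoms, bonds)
    for _ in range(k):
        probs = [sum(probs[i] * matrix[i][j] for i in range(n))
                 for j in range(n)]
    return -sum(p * math.log(p) for p in probs if p > 0)


# Heavy-atom graph of ethanol (C-C-O), used only as a toy example.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
for k in range(3):
    print(k, round(shannon_entropy(atoms, bonds, k), 4))
```

The same function could in principle be applied to a county network by replacing the electronegativity table with Gini coefficients as node weights, although the abstract does not spell out that construction.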

    Mapping chemical structure-activity information of HAART-drug cocktails over complex networks of AIDS epidemiology and socioeconomic data of U.S. counties

    [Abstract] Using computational algorithms to design tailored drug cocktails for highly active antiretroviral therapy (HAART) on specific populations is a goal of major importance for both pharmaceutical industry and public health policy institutions. New combinations of compounds need to be predicted in order to design HAART cocktails. On the one hand, there are the biomolecular factors related to the drugs in the cocktail (experimental measure, chemical structure, drug target, assay organisms, etc.); on the other hand, there are the socioeconomic factors of the specific population (income inequalities, employment levels, fiscal pressure, education, migration, population structure, etc.) to study the relationship between the socioeconomic status and the disease. In this context, machine learning algorithms, able to seek models for problems with multi-source data, have to be used. In this work, the first artificial neural network (ANN) model is proposed for the prediction of HAART cocktails, to halt AIDS on epidemic networks of U.S. counties using information indices that codify both biomolecular and several socioeconomic factors. The data was obtained from at least three major sources. The first dataset included assays of anti-HIV chemical compounds released to ChEMBL. The second dataset is the AIDSVu database of Emory University. AIDSVu compiled AIDS prevalence for >2300 U.S. counties. The third data set included socioeconomic data from the U.S. Census Bureau. Three scales or levels were employed to group the counties according to the location or population structure codes: state, rural urban continuum code (RUCC) and urban influence code (UIC). An analysis of >130,000 pairs (network links) was performed, corresponding to AIDS prevalence in 2310 counties in U.S. vs. drug cocktails made up of combinations of ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4856 protocols, and 10 possible experimental measures. 
The best model found with the original data was a linear neural network (LNN) with AUROC > 0.80 and accuracy, specificity, and sensitivity ≈ 77% in training and external validation series. Changing the spatial and population structure scale (State, UIC, or RUCC codes) does not affect the quality of the model. Imbalance was detected in all the models found when comparing positive/negative cases and linear/non-linear model accuracy ratios. Using the synthetic minority over-sampling technique (SMOTE), data pre-processing, and machine-learning algorithms implemented in the WEKA software, more balanced models were found. In particular, a multilayer perceptron (MLP) with AUROC = 97.4% and precision, recall, and F-measure >90% was found.
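The core idea of SMOTE mentioned above can be sketched in a few lines: each synthetic minority case is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is a simplified, hypothetical helper written for illustration; it is not the WEKA implementation the authors used, and the toy points are invented.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        idx = rng.randrange(len(minority))
        base = minority[idx]
        # Nearest minority neighbours of the base point, excluding itself.
        others = [p for j, p in enumerate(minority) if j != idx]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (o - b) for b, o in zip(base, other))
        )
    return synthetic


# Toy minority class in two dimensions (illustrative values only).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
for point in smote(minority, n_new=3):
    print(point)
```

Because the new cases lie between existing minority cases rather than duplicating them, the classifier sees a denser minority region without exact repeats, which is the usual motivation for preferring SMOTE over plain over-sampling.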

    ANN multiscale model of anti-HIV Drugs activity vs AIDS prevalence in the US at county level based on information indices of molecular graphs and social networks

    [Abstract] This work describes the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and reports the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in US counties, taking into consideration social determinants and the activity and structure of anti-HIV drugs in preclinical assays. We trained different artificial neural networks (ANNs) using information indices of social networks and molecular graphs as input. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILES codes) from the ChEMBL database. Last, we used Box-Jenkins moving average operators to quantify the deviations of drugs with respect to reference data subsets (targets, organisms, experimental parameters, protocols). The best model found was a linear neural network (LNN) with accuracy, specificity, and sensitivity above 0.76 and AUROC > 0.80 in training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the activity of anti-HIV drugs in preclinical assays. To train and validate the model and predict the complex network, we needed to analyze 43,249 data points, including values of AIDS prevalence in 2,310 US counties vs. ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures. Ministerio de Educación, Cultura y Deportes; AGL2011-30563-C03-0
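The moving-average deviation described above can be pictured as subtracting, from each drug's index value, the average of that index over all records sharing the same reference condition. The sketch below assumes exactly that reading; the function name, record fields, and values are hypothetical and chosen only to show the arithmetic.

```python
from collections import defaultdict
from statistics import mean


def moving_average_deviations(records, index_key, group_key):
    """For each record, return its index value minus the mean index of
    all records in the same reference subset (e.g. the same target)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[index_key])
    averages = {group: mean(values) for group, values in groups.items()}
    return [r[index_key] - averages[r[group_key]] for r in records]


# Hypothetical records: an information index per drug and assay target.
records = [
    {"drug": "D1", "target": "A", "index": 1.0},
    {"drug": "D2", "target": "A", "index": 3.0},
    {"drug": "D3", "target": "B", "index": 5.0},
]
print(moving_average_deviations(records, "index", "target"))
```

Running the same function with `group_key` set to an organism, protocol, or experimental-measure field would give one deviation term per reference condition, which matches the list of subsets named in the abstract.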

    Description of input parameters and probabilities used in the quantitative models for the assessment of the risk of ASFV/ CSFV release into the US through legal imports of swine products.

    Note: the information about the inputs marked with * is listed in Table 2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0182850#pone.0182850.t002).