37 research outputs found

    Resolución de problemas de optimización combinatoria utilizando técnicas de computación evolutiva: una aplicación a la biomedicina

    [Abstract] More data are generated every day, growing both in volume and in the number of variables involved, which poses a problem for traditional techniques. In many problems the set of possible solutions is so large that finding an optimal solution in a reasonable amount of time is not feasible, so techniques based on heuristics become necessary. Evolutionary Computation (EC) has provided good results in situations in which traditional techniques did not, especially when applied to biomedical data and disease diagnosis. In this work, an EC-based model has been developed that, given input data labelled as belonging to healthy or diseased subjects, extracts expressions from which a classification model is built. The model has been validated on synthetic data and applied to a real clinical data set, and its results have been compared with those of similar methods. Notably, the proposed model extracts simple expressions and classifies both types of data set better than the other techniques, which makes it very useful as support for clinical diagnosis.

    ATria: a novel centrality algorithm applied to biological networks

    Background: The notion of centrality is used to identify "important" nodes in social networks. The importance of a node is not well defined, and many different notions exist in the literature. The challenge of defining centrality in meaningful ways when network edges can be positively or negatively weighted has not been adequately addressed. Existing centrality algorithms also have a second shortcoming: the most central nodes are often clustered in a specific region of the network and are not well represented across the network. Methods: We address both by proposing Ablatio Triadum (ATria), an iterative centrality algorithm that uses the concept of "payoffs" from economic theory. Results: We compare our algorithm with other known centrality algorithms and demonstrate how ATria overcomes several of their shortcomings. We demonstrate the applicability of our algorithm to synthetic networks as well as biological networks, including bacterial co-occurrence networks, sometimes referred to as microbial social networks. Conclusions: We show evidence that ATria identifies three different kinds of "important" nodes in microbial social networks, with different potential roles in the community.

    Exploring Patterns of Epigenetic Information With Data Mining Techniques

    [Abstract] Data mining, a part of the Knowledge Discovery in Databases (KDD) process, is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, generating large amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information that are mitotically and/or meiotically heritable and that determine gene expression, cellular differentiation, and cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their lives and accumulate with ageing. Both defects, either together or individually, can result in a loss of control over cell growth and thus in cancer development. Data mining techniques could then be used to extract such patterns. This work reviews some of the most important applications of data mining to epigenetics. Funding: Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (209RT-0366); Galicia. Consellería de Economía e Industria (10SIN105004PR); Instituto de Salud Carlos III (RD07/0067/000)

    Using Genetic Algorithms for Automatic Recurrent ANN Development: an Application to EEG Signal Classification

    [Abstract] ANNs are among the most successful learning systems. For this reason, many techniques for obtaining feed-forward networks have been published. However, few works describe techniques for developing recurrent networks. This work uses a genetic algorithm for automatic recurrent ANN development. The system has been applied to a well-known problem: the classification of EEG signals from epileptic patients. Results show the high performance of this system and its ability to develop simple networks, with a low number of neurons and connections. Funding: Red Gallega de Investigación sobre Cáncer Colorrectal (ref. 2009/58); Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (209RT0366); Ministerio de Industria, Turismo y Comercio (TSI-020110-2009-53); Xunta de Galicia (10SIN105004PR); Instituto de Salud Carlos III (PIO52048); Instituto de Salud Carlos III (RD07/0067/000)

    Random Forest Classification Based on Star Graph Topological Indices for Antioxidant Proteins

    [Abstract] Aging and quality of life are important research topics in areas such as the life sciences, chemistry, and pharmacology. People live longer and therefore want to spend that extra time with a better quality of life. In this regard, there is a tiny subset of molecules in nature, named antioxidant proteins, that may influence the aging process. However, testing every single protein in order to identify its properties is expensive and inefficient. For this reason, this work proposes a model in which the primary structure of the protein is represented using complex network graphs, and which can be used to reduce the number of proteins to be tested for antioxidant biological activity. The graph obtained as a representation makes it possible to describe the complex system by means of topological indices. More specifically, this work uses Randić's Star Networks and their associated indices. In order to simulate the proportion of antioxidant proteins that exists in nature, a dataset containing 1999 proteins, of which 324 are antioxidant proteins, was created. Using these data as input, Star Graph Topological Indices were calculated with the S2SNet tool and then used as input to several classification techniques. Among the techniques used, Random Forest showed the best performance, with 94% correctly classified instances. Although the target class (antioxidant proteins) represents a tiny subset of the dataset, the proposed model correctly classifies 81.8% of the instances of this class, with a precision of 81.3%. Funding: Galicia. Consellería de Economía e Industria (10SIN105004PR); Galicia. Consellería de Economía e Industria (O9SIN010105PR); Ministerio de Economía y Competitividad (TIN-2009-0770)
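As a rough consistency check on the figures reported in the abstract above, the sketch below reconstructs an approximate confusion matrix from the class sizes (1999 proteins, 324 antioxidant), the 81.8% correctly classified target-class instances (treated here as recall), and the 81.3% precision. It assumes, purely for illustration, that all three figures were computed over the full dataset; the abstract does not state the evaluation protocol, and the function name `confusion_from_rates` is hypothetical.

```python
def confusion_from_rates(total, positives, recall, precision):
    """Approximate confusion-matrix counts from reported rates.

    Assumption (not stated in the abstract): recall and precision
    were both measured over the same `total` instances.
    """
    tp = round(recall * positives)      # positives correctly classified
    fn = positives - tp                 # positives missed
    fp = round(tp / precision) - tp     # negatives labelled as positive
    tn = total - positives - fp         # negatives correctly classified
    accuracy = (tp + tn) / total
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn, "accuracy": accuracy}


# Figures taken from the abstract: 1999 proteins, 324 antioxidant,
# 81.8% recall and 81.3% precision for the antioxidant class.
result = confusion_from_rates(1999, 324, 0.818, 0.813)
print(result)
```

Under that assumption the implied overall accuracy comes out close to the reported 94%, so the three figures are mutually consistent. Note also that always predicting the majority class would already score about 83.8% on this dataset (1675 of 1999), which is why the per-class figures matter more than the overall accuracy.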

    Electronic Health Records Exploitation Using Artificial Intelligence Techniques

    [Abstract] The exploitation of electronic health records (EHRs) has multiple uses, from predictive tasks and clinical decision support to pattern recognition. Artificial Intelligence (AI) makes it possible to extract knowledge from EHR data in a practical way. In this study, we aim to construct a Machine Learning model from EHR data to make predictions about patients. Specifically, we focus our analysis on patients suffering from respiratory problems and try to predict whether those patients will have a relapse in less than 6, 12 or 18 months. The main objective is to identify the characteristics that seem to increase the relapse risk. At the same time, we propose an exploratory analysis in search of hidden patterns in the data. These patterns will help us classify patients according to their specific conditions for some clinical variables. Funding: The Centro de Investigación de Galicia CITIC is funded by the Consellería de Educación, Universidades e Formación Profesional of the Xunta de Galicia and by the European Union (European Regional Development Fund, FEDER Galicia 2014-2020 Program) through grant ED431G 2019/01. Partially supported by the Spanish Ministry of Science (Challenges of Society 2019), PID2019-104323RB-C33. Xunta de Galicia (ED431G 2019/0)

    Applied Computational Techniques on Schizophrenia Using Genetic Mutations

    [Abstract] Schizophrenia is a complex disease with both genetic and environmental influences. Machine learning techniques can be used to associate genetic variations at different genes with a (schizophrenic or non-schizophrenic) phenotype. Several machine learning techniques were applied to schizophrenia data to obtain the results presented in this study. Given these data, Quantitative Genotype-Disease Relationships (QDGRs) can be used for disease prediction. One of the best machine learning-based models obtained in this exhaustive comparative study, an artificial neural network (ANN), was implemented online. The resulting tool accepts Single Nucleotide Polymorphism (SNP) sequences as input in order to classify a patient as having schizophrenia or not. Besides this comparative study, a method for variable selection based on ANNs and evolutionary computation (EC) is also presented. This method uses half as many variables as the original ANN, and the variables obtained are among those found in other publications. In the future, QDGR models based on nucleic acid information could be extended to other diseases. Funding: Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (209RT-0366); Xunta de Galicia (10SIN105004PR); Instituto de Salud Carlos III (RD07/0067/0005); Xunta de Galicia (Ref. 2009/5)

    Metagenomics, Metatranscriptomics, and Metabolomics Approaches for Microbiome Analysis

    Microbiomes are ubiquitous and are found in the ocean, the soil, and in/on other living organisms. Changes in a microbiome can impact the health of the environmental niche in which it resides. In order to learn more about these communities, different approaches based on data from multiple omics have been pursued. Metagenomics produces a taxonomical profile of the sample, metatranscriptomics helps us to obtain a functional profile, and metabolomics completes the picture by determining which byproducts are being released into the environment. Although each approach provides valuable information separately, we show that, when combined, they paint a more comprehensive picture. We conclude with a review of network-based approaches as applied to integrative studies, which we believe hold the key to an in-depth understanding of microbiomes.

    SNP locator: a candidate SNP selection tool

    [Abstract] In this work, a data integration approach using a federated model based on a service oriented architecture (SOA) is presented. The BioMOBY middleware was used to implement each service that is part of the integration process. As an example of usage of this architecture, a web tool for candidate SNP selection has been developed. Thus, several BioMOBY services have been created as the model layer of the web application. Each data source has a wrapper which communicates with the federated model, that is, the BioMOBY model, and this model is the one that interacts with the client. Funding: Red Gallega de Investigación sobre Cáncer Colorrectal (Ref. 2009/58); Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (209RT-0366); Instituto de Salud Carlos III (PIO52048); Instituto de Salud Carlos III (RD07/0067/0005); Galicia. Consellería de Economía e Industria (10SIN105004PR); Ministerio de Industria, Turismo y Comercio (TSI-020110-2009-5)