57 research outputs found

    Biopax and Semantics

    The BioPAX community is producing data sets as RDF files, but most of them are not available through query interfaces. Publishing SPARQL endpoints is feasible with the current data sets, but reasoning over these interfaces is infeasible in many cases. Large-scale reasoners are needed to take advantage of these data sets

    A Service for Flexible Management and Analysis of Heterogeneous Clinical Data

    This paper describes FIMED 2.0, a service for the flexible management and analysis of heterogeneous clinical data. This software tool allows flexible management of clinical data from multiple trials, which can help improve the quality of clinical data and facilitate clinical trials. The proposed service has been developed on top of a NoSQL database (MongoDB) that allows clinical data to be collected and integrated into dynamic, incremental schemas according to the needs and requirements of clinical research. Building on our experience with Flexible Management of Biomedical Data (FIMED), we have developed this new version of the tool with the aim not only of replicating the previous one, but also of including more gene regulatory network analysis and data visualization oriented toward annotating gene functionality and identifying central genes. This version lets practitioners use four different network construction methods: data assimilation, linear interpolation, tree-based ensemble, and Boosting Machine regression. A free version of this tool is available at https://khaos.uma.es/fimedV2. A demonstration user account, "iwbbio", with the password "demo", has been created. A real use case for a clinical trial in melanoma, which has been anonymized, is also included in this demonstration. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tec

    Bioqueries: a collaborative environment to create, explore and share SPARQL queries in Life Sciences

    Bioqueries provides a collaborative environment to create, explore, execute, clone and share SPARQL queries (including Federated Queries). Federated SPARQL queries can retrieve information from more than one data source. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
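
To illustrate what a federated query looks like, the following minimal sketch builds a SPARQL 1.1 query whose `SERVICE` clause delegates part of the pattern to a second endpoint. The `SERVICE` keyword is part of the SPARQL 1.1 Federated Query standard; the function name, the `ex:` vocabulary, and the endpoint URL are hypothetical placeholders and are not taken from Bioqueries.

```python
def build_federated_query(remote_endpoint: str, limit: int = 10) -> str:
    """Return a SPARQL 1.1 query that joins local triples with a remote SERVICE block.

    The ex: prefix and its properties are illustrative assumptions only.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "PREFIX ex: <http://example.org/vocab#>\n"
        "SELECT ?protein ?pathwayLabel\n"
        "WHERE {\n"
        # Pattern evaluated against the endpoint that receives the query.
        "  ?protein ex:participatesIn ?pathway .\n"
        # Pattern delegated to a second data source via SERVICE.
        f"  SERVICE <{remote_endpoint}> {{\n"
        "    ?pathway ex:label ?pathwayLabel .\n"
        "  }\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Hypothetical remote endpoint; replace with a real SPARQL endpoint URL.
    print(build_federated_query("http://example.org/sparql", limit=5))
```

The shared variable `?pathway` is what joins the results from the two data sources.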

    Melanoma expression analysis with Big Data technologies

    Melanoma is a highly immunogenic tumor. Therefore, in recent years physicians have incorporated drugs that alter the immune system into their therapeutic arsenal against this disease, revolutionizing the treatment of patients in an advanced stage of the disease. This has led us to explore and deepen our knowledge of the immunology surrounding melanoma in order to optimize its approach. At present, immunotherapy for metastatic melanoma is based on stimulating an individual’s own immune system through the use of specific monoclonal antibodies. The use of immunotherapy has meant that many patients with melanoma have survived, and it therefore constitutes a present and future treatment in this field. At the same time, drugs have been developed targeting specific mutations, specifically BRAF, resulting in large responses in tumor regression (set in this clinical study at 18 months), as well as a higher percentage of long-term survivors. The analysis of gene expression changes and their correlation with clinical changes can be carried out using the tools provided by the companies that currently offer gene expression platforms. The gene expression platform used in this clinical study is NanoString, which provides nCounter. However, nCounter has some limitations: the type of analysis is restricted to a predefined set, and introducing clinical features is a complex task. This paper presents an approach to collect the clinical information using a structured database and a Web user interface for entering this information, including the results of the gene expression measurements, to go a step further than the nCounter tool. As part of this work, we present an initial analysis of changes in the gene expression of a set of patients before and after targeted therapy. This analysis has been carried out using Big Data technologies (Apache Spark), with the final goal of scaling up to large numbers of patients, even though this initial study has a limited number of enrolled patients (12 in the first analysis). This is not a Big Data problem, but the underlying study aims to target 20 patients per year in Málaga alone, and the approach could be extended to analyze the 3,600 patients diagnosed with melanoma per year. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This work was funded in part by Grants TIN2014-58304-R (Ministerio de Ciencia e Innovación) and P11-TIC-7529 and P12-TIC-1519 (Plan Andaluz de Investigación, Desarrollo e Innovación). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

    KNIT: Ontology reusability through knowledge graph exploration

    Ontologies have become a standard for knowledge representation across several domains. In Life Sciences, numerous ontologies have been introduced to represent human knowledge, often providing overlapping or conflicting perspectives. These ontologies are usually published as OWL or OBO, and are often registered in open repositories, e.g., BioPortal. However, the task of finding the concepts (classes and their properties) defined in the existing ontologies and the relationships between these concepts across different ontologies – for example, for developing a new ontology aligned with the existing ones – requires a great deal of manual effort in searching through the public repositories for candidate ontologies and their entities. In this work, we develop a new tool, KNIT, to automatically explore open repositories to help users fetch the previously designed concepts using keywords. User-specified keywords are then used to retrieve matching names of classes or properties. KNIT then creates a draft knowledge graph populated with the concepts and relationships retrieved from the existing ontologies. Furthermore, following the process of ontology learning, our tool refines this first draft of an ontology. We present three BioPortal-specific use cases for our tool. These use cases outline the development of new knowledge graphs and ontologies in the sub-domains of biology: genes and diseases, virome and drugs. This work has been funded by grant PID2020-112540RB-C4121, AETHER-UMA (A smart data holistic approach for context-aware data analytics: semantics and context exploitation). Funding for open access charge: Universidad de Málaga / CBUA

    Ensemble-based genetic algorithm explainer with automized image segmentation: A case study on melanoma detection dataset

    Explainable Artificial Intelligence (XAI) makes AI understandable to the human user, particularly when the model is complex and opaque. Local Interpretable Model-agnostic Explanations (LIME) has an image explainer package that is used to explain deep learning models. The image explainer of LIME needs some parameters to be manually tuned by the expert in advance, including the number of top features to be seen and the number of superpixels in the segmented input image. This parameter tuning is a time-consuming task. Hence, with the aim of developing an image explainer that automates image segmentation, this paper proposes the Ensemble-based Genetic Algorithm Explainer (EGAE) for melanoma cancer detection, which automatically detects and presents the informative sections of the image to the user. EGAE has three phases. First, the sparsity of chromosomes in the GAs is determined heuristically. Then, multiple GAs are executed consecutively. These GAs differ in the number of superpixels in the input image, which results in different chromosome lengths. Finally, the results of the GAs are ensembled using consensus and majority voting. This paper also introduces how Euclidean distance can be used to calculate the distance between the actual explanation (delineated by experts) and the calculated explanation (computed by the explainer) for accuracy measurement. Experimental results on a melanoma dataset show that EGAE automatically detects informative lesions and efficiently improves the accuracy of explanation in comparison with LIME. The Python code for EGAE, the ground truths delineated by clinicians, and the melanoma detection dataset are available at https://github.com/KhaosResearch/EGAE This work has been partially funded by grant PID2020-112540RBC41 (funded by MCIN/AEI/10.13039/501100011033/, Spain), AETHER-UMA, Spain (A smart data holistic approach for context-aware data analytics: semantics and context exploitation). Funding for open access charge: Universidad de Málaga/CBUA. Additionally, we thank Dr. Miguel Ángel Berciano Guerrero from Unidad de Oncología Intercentros, Hospitales Universitarios Regional Virgen de la Victoria de Málaga, and Instituto de Investigaciones Biomédicas (IBIMA), Málaga, Spain, for his support in image selection and general medical orientation in the particular case of melanoma
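
The ensembling and accuracy-measurement ideas in the EGAE abstract above can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: it assumes each GA's explanation has already been mapped to a binary pixel mask of the same size, that "majority" means more than half of the masks and "consensus" means all of them, and that the distance is taken over flattened masks. All function names are hypothetical; the actual code is in the repository linked in the abstract.

```python
import math
from typing import List

Mask = List[List[int]]  # binary pixel mask: 1 = pixel marked as informative


def _vote_counts(masks: List[Mask]) -> List[List[int]]:
    """Count, per pixel, how many masks mark that pixel as informative."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[sum(m[r][c] for m in masks) for c in range(cols)] for r in range(rows)]


def majority_vote(masks: List[Mask]) -> Mask:
    """Keep a pixel if more than half of the masks select it (assumed rule)."""
    n = len(masks)
    return [[1 if 2 * v > n else 0 for v in row] for row in _vote_counts(masks)]


def consensus_vote(masks: List[Mask]) -> Mask:
    """Keep a pixel only if every mask selects it (assumed rule)."""
    n = len(masks)
    return [[1 if v == n else 0 for v in row] for row in _vote_counts(masks)]


def euclidean_distance(a: Mask, b: Mask) -> float:
    """Euclidean distance between two masks, treated as flat vectors."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


if __name__ == "__main__":
    # Toy 2x3 masks standing in for the outputs of three GAs.
    ga_masks = [
        [[1, 1, 0], [0, 1, 0]],
        [[1, 0, 0], [0, 1, 1]],
        [[1, 1, 0], [0, 0, 1]],
    ]
    expert_mask = [[1, 1, 0], [0, 1, 0]]  # made-up ground truth
    print(euclidean_distance(majority_vote(ga_masks), expert_mask))
    print(euclidean_distance(consensus_vote(ga_masks), expert_mask))
```

A smaller distance to the expert-delineated mask indicates a closer explanation under this measure.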

    YeastMed: An XML-Based System for Biological Data Integration of Yeast

    A key goal of bioinformatics is to create database systems and software platforms capable of storing and analysing large sets of biological data. Hundreds of biological databases are now available and provide access to huge amounts of biological data. SGD, Yeastract, CYGD-MIPS, BioGrid and PhosphoGrid are five of the databases most visited by the yeast community. These sources provide complementary data on biological entities. Biologists routinely have to query these data sources in order to analyse the results of their experiments. Because of the heterogeneity of these sources, querying them separately and then manually combining the returned results is a complex and laborious task. To provide transparent and simultaneous access to these sources, we have developed a mediator-based system called YeastMed. In this paper, we present YeastMed, focusing on its architecture