    Semantic interoperability: ontological unpacking of a viral conceptual model

    Background. Genomics and virology are important but complex domains investigated by a large number of scientists. Facilitating and supporting work within these domains requires sharing databases, which is often difficult because data are represented differently across them. To foster semantic interoperability, models are needed that provide a deep understanding and interpretation of the concepts in a domain, so that researchers can interpret the data consistently. Results. In this research, we propose the use of conceptual models to support semantic interoperability among databases, and we assess their ontological clarity to support their effective use. This modeling effort is illustrated by its application to the Viral Conceptual Model (VCM), which captures and represents the sequencing of viruses and was inspired by the need to understand the genomic aspects of the virus responsible for COVID-19. To achieve semantic clarity in the VCM, we leverage the “ontological unpacking” method, a process of ontological analysis that reveals the ontological foundation of the information represented in a conceptual model. This is accomplished by applying the stereotypes of the OntoUML ontology-driven conceptual modeling language. As a result, we propose a new OntoVCM, an ontologically grounded model, based on the initial VCM, but with guaranteed interoperability among the data sources that employ it. Conclusions. We propose and illustrate how the unpacking of the Viral Conceptual Model resolves several issues related to semantic interoperability, the importance of which is recognized by the “I” in the FAIR principles. The research addresses conceptual uncertainty within the domain of SARS-CoV-2 data and knowledge. The method employed provides the basis for further analyses of complex models that are currently used in life science applications but lack ontological grounding, which hinders the interoperability scientists need to progress their research.

    Conceptual Modeling Applied to Genomics: Challenges Faced in Data Loading

    Today's genomic domain revolves around insecurity: too many imprecise concepts, too much information to be properly managed. Considering that conceptualization is the most exclusive human characteristic, it makes full sense to try to conceptualize the principles that guide the essence of why humans are as we are. This question can of course be generalized to any species, but in this work we are especially interested in showing how conceptual modeling is strictly required to understand the "execution model" that human beings "implement". The main aim is to defend the idea that the Human Genome can be properly understood only through in-depth knowledge of the Conceptual Model associated with it. This Model-Driven perspective of the Human Genome opens challenging possibilities by viewing individuals as implementations of that Conceptual Model, where different values associated with different modeling primitives explain the diversity among individuals and the potential, unexpected variations together with their unwanted effects in terms of illnesses. This work focuses on the challenges faced in loading data from conventional resources into Information Systems created according to the above-mentioned conceptual modeling approach. The work reports on various loading efforts, the problems encountered, and the solutions to these problems. A strong argument is also made about why conventional methods for solving the so-called "data chaos" problems associated with the genomics domain so often fail to meet the demands. Van Der Kroon, M. (2011). Conceptual Modeling Applied to Genomics: Challenges Faced in Data Loading. http://hdl.handle.net/10251/16993

    Processing genome-wide association studies within a repository of heterogeneous genomic datasets

    Background. Genome-Wide Association Studies (GWAS) are based on the observation of genome-wide sets of genetic variants – typically single-nucleotide polymorphisms (SNPs) – in different individuals that are associated with phenotypic traits. Research efforts have so far been directed to improving GWAS techniques rather than to making the results of GWAS interoperable with other genomic signals; this is currently hindered by the use of heterogeneous formats and uncoordinated experiment descriptions. Results. To practically facilitate integrative use, we propose to include GWAS datasets within the META-BASE repository, exploiting an integration pipeline previously studied for other genomic datasets that includes several heterogeneous data types in the same format, queryable from the same systems. We represent GWAS SNPs and metadata by means of the Genomic Data Model and include metadata within a relational representation by extending the Genomic Conceptual Model with a dedicated view. To further reduce the gap with the descriptions of other signals in the repository of genomic datasets, we perform a semantic annotation of phenotypic traits. Our pipeline is demonstrated using two important data sources, initially organized according to different data models: the NHGRI-EBI GWAS Catalog and FinnGen (University of Helsinki). The integration effort finally allows us to use these datasets within multisample processing queries that respond to important biological questions. These are then made usable for multi-omic studies together with, e.g., somatic and reference mutation data, genomic annotations, and epigenetic signals. Conclusions. As a result of our work on GWAS datasets, we enable 1) their interoperable use with several other homogenized and processed genomic datasets in the context of the META-BASE repository; 2) their big data processing by means of the GenoMetric Query Language and associated system. Future large-scale tertiary data analysis may benefit extensively from the addition of GWAS results to inform several different downstream analysis workflows.

    Scene Graph Generation with External Knowledge and Image Reconstruction

    Scene graph generation has received growing attention with the advancements in image understanding tasks such as object detection, attribute and relationship prediction, etc. However, existing datasets are biased in terms of object and relationship labels, or often come with noisy and missing annotations, which makes the development of a reliable scene graph prediction model very challenging. In this paper, we propose a novel scene graph generation algorithm with external knowledge and image reconstruction loss to overcome these dataset issues. In particular, we extract commonsense knowledge from the external knowledge base to refine object and phrase features for improving generalizability in scene graph generation. To address the bias of noisy object annotations, we introduce an auxiliary image reconstruction path to regularize the scene graph generation network. Extensive experiments show that our framework can generate better scene graphs, achieving state-of-the-art performance on two benchmark datasets: Visual Relationship Detection and Visual Genome. Comment: 10 pages, 5 figures, Accepted in CVPR 201

    How to deal with Haplotype data: An Extension to the Conceptual Schema of the Human Genome

    The goal of this work is to describe the advantages of applying Conceptual Modeling (CM) in complex domains, such as genomics. Nowadays, the study and comprehension of the human genome is a major challenge due to its high level of complexity. The constant evolution of the genomic domain contributes to the generation of ever larger amounts of new data, which means that if we do not manage it correctly, data quality could be compromised (e.g., problems related to heterogeneity and inconsistent data). In this paper, we propose the use of a Conceptual Schema of the Human Genome (CSHG), designed to understand and improve our ontological commitment to the domain, and we extend (enrich) this schema with the integration of a novel concept: Haplotypes. Our focus is on improving the understanding of the relationship between genotype and phenotype, since new findings show that this question is more complex than was originally thought. Here we present the first steps in our data management approach with haplotypes (variations, frequencies and populations) and discuss the database evolution needed to support this data. Each new version of our conceptual schema (CS) introduces changes to the underlying database structure that have essential and practical implications for better understanding and managing the relevant information. A solution based on conceptual models gives a clear definition of the domain with direct implications in the medical field (Precision Medicine), in which Genomic Information Systems (GeIS) play a very important role. The authors thank the members of the PROS Center Genome group for fruitful discussions. This work has been supported by the Ministry of Higher Education, Science and Technology (MESCyT) of the Dominican Republic, and it also has the support of Generalitat Valenciana through project IDEO (PROMETEOII/2014/039). Reyes Román, JF.; Pastor López, O.; Roldán Martínez, D.; Valverde Giromé, F. (2016). How to deal with Haplotype data: An Extension to the Conceptual Schema of the Human Genome. CLEI Electronic Journal. 19(3):1-21. https://doi.org/10.19153/cleiej.19.3.2S12119

    Design and Development of an Information System to Manage Clinical Data about Usher Syndrome Based on Conceptual Modeling

    The inefficient management of clinical data in many research environments is a problem that slows down the service provided to patients. The benefits of an Information System created following conceptual modeling rules have been proven in multiple environments with data management difficulties. The main hurdle to overcome is the large gap between the language and concepts employed by informaticians and those used by biologists. The work described in this paper shows how these technologies can also be applied to the clinical domain, after a long period of working toward mutual understanding. The clinical research data of an expert research group on Usher syndrome have been studied, analyzed and redesigned using conceptual modeling, helping this group to offer a better service. It is important to highlight that this work has been done under the framework of the Cátedra Tecnologías para la Salud of the Universitat Politècnica de València, financed by INDRA Systems. Burriel Coll, V.; Pastor Cubillo, MÁ.; Celma Giménez, M.; Casamayor Rodenas, JC.; Mota Herranz, L. (2013). Design and Development of an Information System to Manage Clinical Data about Usher Syndrome Based on Conceptual Modeling. IARIA XPS Press. http://hdl.handle.net/10251/75237