5,991 research outputs found

    Exploiting Conceptual Modeling for Searching Genomic Metadata: A Quantitative and Qualitative Empirical Study

    Providing a common data model for the metadata of several heterogeneous genomic data sources is hard, as they do not share any standard or agreed practice for metadata description. Two years ago we managed to discover a subset of common metadata present in most sources and to organize it as a smart genomic conceptual model (GCM); the model has been instrumental to our efforts in the development of a major software pipeline for data integration. More recently, we developed a user-friendly search interface, based on a simplified version of GCM. In this paper, we report our evaluation of the effectiveness of this new user interface. Specifically, we present the results of a compendious empirical study to answer the research question: How well is such a simple interface understood by a standard user? The target of this study is a mixed population, composed of biologists, bioinformaticians and computer scientists. The result of our empirical study shows that the users were successful in producing search queries starting from their natural language description, doing so with good accuracy and a low error rate. The study also shows that most users were generally satisfied; it provides indications on how to improve our search system and how to continue our effort in integration of genomic sources. We are consequently adapting the user interface, which will soon be opened to public use.

    Conceptual models and databases for searching the genome

    Genomics is an extremely complex domain, in terms of concepts, their relations, and their representations in data. This tutorial introduces the use of ER models in the context of genomic systems: conceptual models are of great help for simplifying this domain and making it actionable. We carry out a review of successful models presented in the literature for representing biologically relevant entities and grounding them in databases. We draw a difference between conceptual models that aim to explain the domain and conceptual models that aim to support database design and heterogeneous data integration. Genomic experiments and/or sequences are described by several metadata, specifying information on the sampled organism, the technology used, and the organizational process behind the experiment. By contrast, we call data the actual regions of the genome that have been read by sequencing technologies and encoded into a machine-readable representation. First, we show how data and metadata can be modeled, then we exploit the proposed models for designing search systems, visualizers, and analysis environments. Both domains of human genomics and viral genomics are addressed, surveying several use cases and applications of broader public interest. The tutorial is relevant to the EDBT community because it demonstrates the usefulness of conceptual models’ principles within very current domains; in addition, it offers a concrete example of conceptual models’ use, setting the premises for interdisciplinary collaboration with a greater public (possibly including life science researchers).

    Combined population dynamics and entropy modelling supports patient stratification in chronic myeloid leukemia

    Modelling the parameters of multistep carcinogenesis is key for a better understanding of cancer progression, biomarker identification and the design of individualized therapies. Using chronic myeloid leukemia (CML) as a paradigm for hierarchical disease evolution, we show that combined population dynamic modelling and CML patient biopsy genomic analysis enables patient stratification at unprecedented resolution. Linking CD34+ similarity as a disease progression marker to patient-derived gene expression entropy separated established CML progression stages and uncovered additional heterogeneity within disease stages. Importantly, our patient-data-informed model enables quantitative approximation of individual patients’ disease history within chronic phase (CP) and significantly separates “early” from “late” CP. Our findings provide a novel rationale for personalized and genome-informed disease progression risk assessment that is independent of and complementary to conventional measures of CML disease burden and prognosis.