3 research outputs found

    A digital repository with an extensible data model for biobanking and genomic analysis management

    Motivation: Molecular biology laboratories require extensive metadata to improve data collection and analysis. The heterogeneity of the collected metadata grows as research evolves into international, multidisciplinary collaborations and as data sharing among institutions increases. A single standard is not feasible, so it becomes crucial to develop digital repositories with flexible and extensible data models, as in modern integrated biobank management.
    Results: We developed a novel data model in JSON format to describe heterogeneous data in a generic biomedical science scenario. The model is built on two hierarchical entities, processes and events, which roughly correspond to research studies and to analysis steps within a single study. A number of sequential events can be grouped into a process, building up a hierarchical structure that tracks patient and sample history. Each event can produce new data. Data is described by a set of user-defined metadata and may have one or more associated files. We integrated the model into a web-based digital repository with data grid storage to manage large data sets located in geographically distinct areas. We built a graphical interface that allows authorized users to define new data types dynamically, according to their requirements. Operators compose queries on metadata fields using a flexible search interface and run them on the database and on the grid. We applied the digital repository to the integrated management of samples, patients and medical history in the BIT-Gaslini biobank. The platform currently manages 1800 samples from over 900 patients. Microarray data from 150 analyses are stored on the grid storage and replicated on two physical resources for preservation. The system can integrate data with other biobanks for worldwide information sharing.
Conclusions: Our data model enables users to continuously define flexible, ad hoc, and loosely structured metadata for information sharing in specific research projects and purposes. This approach can significantly improve interdisciplinary research collaboration and makes it possible to track patients' clinical records, sample management information, and genomic data. The web interface allows operators to easily manage, query, and annotate files without dealing with the technicalities of the data grid.
Peer reviewed
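The process/event/data hierarchy described in the abstract can be illustrated with a minimal Python sketch. This is not the repository's actual schema: the class names, field names (`data_type`, `metadata`, `files`), sample values, and the `find_data` helper are all assumptions made for illustration. It only shows how events grouped in a process can carry data items with free-form metadata, how the structure serializes to JSON, and how a simple exact-match metadata query might work.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any


@dataclass
class Data:
    """A data item: user-defined metadata plus zero or more file names."""
    data_type: str  # hypothetical label for a user-defined data type
    metadata: dict[str, Any] = field(default_factory=dict)
    files: list[str] = field(default_factory=list)


@dataclass
class Event:
    """An analysis step within a study; each event can produce new data."""
    name: str
    data: list[Data] = field(default_factory=list)


@dataclass
class Process:
    """A research study grouping a sequence of events."""
    name: str
    events: list[Event] = field(default_factory=list)


def find_data(process: Process, **criteria: Any) -> list[Data]:
    """Return every data item whose metadata matches all criteria exactly."""
    return [
        item
        for event in process.events
        for item in event.data
        if all(item.metadata.get(key) == value for key, value in criteria.items())
    ]


# Illustrative values only; none of these come from the BIT-Gaslini biobank.
study = Process(
    name="example-study",
    events=[
        Event(
            "sample-collection",
            [Data("sample", {"tissue": "blood", "patient": "P001"})],
        ),
        Event(
            "microarray-analysis",
            [Data("microarray", {"patient": "P001"}, ["P001_run1.cel"])],
        ),
    ],
)

as_json = json.dumps(asdict(study), indent=2)  # JSON form of the hierarchy
matches = find_data(study, patient="P001")     # query on a metadata field
```

Because the metadata is a plain dictionary, a new data type needs no change to the classes, which is the kind of flexibility the abstract attributes to a loosely structured JSON model.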

    Conference on Grey Literature and Repositories


    XTENS-A JSON-Based Digital Repository for Biomedical Data Management

    Biomedical science poses unique challenges in data management. Heterogeneous information (such as clinical records, biological specimens, imaging and genomic data, and different technology-associated formats) must be collected and integrated to provide a unified overview of each patient. International-scale research collaborations involve different disciplines (Medicine/Biology, Engineering/IT, Physics, ...). Extensive metadata is required to maximize information sharing among the partners. To properly tackle these issues, we have developed XTENS, a data repository built on a flexible and extensible JSON-based data model. The JSON data model is conceived to achieve maximal flexibility, to allow adaptive metadata management, and to treat metadata as a dynamic process of scientific communication rather than an enduring product fixed in time. XTENS is integrated with iRODS, a data grid software that allows distributed storage, metadata file annotation and advanced policies for data curation. We have adopted the platform for a multicentric functional connectomics project where heterogeneous data sources (radiological images, electroencephalography signals) must be integrated and analysed to compute connectivity maps of the brain. To this end, we have tested the repository prototype by allowing external programs to interact with XTENS through a service-oriented REST interface. We demonstrated the usefulness of XTENS by ingesting heterogeneous data, running the required processing tool, and storing the process output.
    Authors: L. Varesio; Izzo, Massimiliano; Arnulfo, Gabriele; Fato, Marco Massimo; Piastra, Maria Carla; Tedone, Valentina
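As a rough illustration of how an external program might hand a processing result to a repository over a REST interface, the sketch below builds (but does not send) an HTTP POST request with a JSON body, using only the Python standard library. The abstract does not describe the XTENS API, so the base URL, the `/data` path, the payload fields, and the `build_store_request` helper are all hypothetical.

```python
import json
import urllib.request
from typing import Any


def build_store_request(
    base_url: str, data_type: str, metadata: dict[str, Any], files: list[str]
) -> urllib.request.Request:
    """Build a POST request carrying one data item as JSON.

    The '/data' path and the payload layout are assumptions for this sketch,
    not the documented interface of any real repository.
    """
    payload = {"type": data_type, "metadata": metadata, "files": files}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/data",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and values; nothing is sent over the network here.
request = build_store_request(
    "https://repository.example.org/api/",
    "connectivity-map",
    {"subject": "S01", "modality": "EEG"},
    ["S01_connectivity.mat"],
)
# To actually submit it, a client would call urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any real endpoint is involved.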