    SEINE: Methods for Electronic Data Capture and Integrated Data Repository Synthesis with Patient Registry Use Cases

    Integrated Data Repositories (IDRs) allow clinical research to leverage electronic health records (EHRs) and other data sources, while Electronic Data Capture (EDC) applications often support manually maintained patient registries. Using i2b2 and REDCap (IDR and EDC platforms, respectively), we have developed methods that combine the strengths of both: 1) delivering data from the IDR as ready-to-use registries, to exploit the annotation and data collection capabilities unique to EDC applications; and 2) integrating EDC-managed registries back into the data repository, so that investigators can apply hypothesis generation and cohort discovery methods. This round-trip integration can lower the lag between cohort discovery and establishing a registry. Investigators can also periodically augment their registry cohort as the IDR is enriched with additional data elements, data sources, and patients. We describe our open-source automated methods and provide three example registry use cases: triple negative breast cancer, vertiginous syndrome, and cancer distress.
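    A minimal sketch of this kind of round trip is shown below, assuming an i2b2-style star schema reachable over SQL and a REDCap project with API access. The endpoint URL, token, field names, and helper functions are illustrative assumptions, not the SEINE implementation.

```python
# Minimal sketch (not the SEINE code): pull a cohort's facts from an i2b2-style
# star schema and push them to a REDCap project through the standard REDCap API.
# The connection, the patient set, and the REDCap field names are illustrative.
import json
import sqlite3      # stand-in for the site's actual i2b2 database driver
import requests

REDCAP_URL = "https://redcap.example.org/api/"   # hypothetical endpoint
REDCAP_TOKEN = "..."                              # project-level API token

def fetch_cohort_facts(conn, patient_set):
    """Read observations from observation_fact for the given cohort."""
    sql = """
        SELECT patient_num, concept_cd, start_date, nval_num
        FROM observation_fact
        WHERE patient_num IN ({placeholders})
    """.format(placeholders=",".join("?" * len(patient_set)))
    return conn.execute(sql, patient_set).fetchall()

def import_into_redcap(rows):
    """Import rows as REDCap records via the 'record' API content type."""
    records = [
        {"record_id": str(p), "concept": c, "obs_date": d, "value": v}
        for p, c, d, v in rows
    ]
    resp = requests.post(REDCAP_URL, data={
        "token": REDCAP_TOKEN,
        "content": "record",
        "format": "json",
        "type": "flat",
        "data": json.dumps(records),
    })
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    conn = sqlite3.connect("i2b2_demo.db")        # placeholder data source
    facts = fetch_cohort_facts(conn, [1001, 1002, 1003])
    print(import_into_redcap(facts))
```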

    Implications of observation-fact modifiers to i2b2 ontologies

    Biomedical translational research can be facilitated by integrating clinical and research data. In particular, study cohort identification and hypothesis generation are enabled by the mining of integrated clinical observations and research resources. The Informatics for Integrating Biology and the Bedside, or i2b2, framework is widely used for this biomedical data mining. The i2b2 star schema data model, using entity-attribute-value (EAV) formatted concepts, is a very efficient strategy for querying large amounts of data. However, until the most recent i2b2 release, the utility of the platform was somewhat constrained by its limited ability to express facts about facts, i.e., to modify the observations about the patients. We have found that exploiting the new modifier functionality has significantly and favorably impacted the design of i2b2 ontologies, leading to easier and more meaningful query results. Copyright © 2011 IEEE
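    The snippet below is an illustrative query only, assuming the standard observation_fact layout; the concept and modifier codes are invented to show how a modifier (a "fact about a fact") distinguishes two otherwise identical observations.

```python
# Illustrative only: how an observation_fact modifier can be used in a query
# against an i2b2-style star schema. The concept and modifier codes below are
# invented for the example; real ontologies define their own coding schemes.
import sqlite3

conn = sqlite3.connect("i2b2_demo.db")   # placeholder i2b2-style database

# Without modifiers, an "ordered" and a "dispensed" medication would look like
# the same fact; modifier_cd makes them distinguishable at query time.
sql = """
    SELECT f.patient_num, f.concept_cd, f.modifier_cd, f.start_date
    FROM observation_fact f
    WHERE f.concept_cd = 'MED:ASPIRIN'          -- the base observation
      AND f.modifier_cd = 'MOD:DISPENSED'       -- the fact about that fact
"""
for row in conn.execute(sql):
    print(row)
```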

    An ICT infrastructure to integrate clinical and molecular data in oncology research

    Background: The ONCO-i2b2 platform is a bioinformatics tool designed to integrate clinical and research data and support translational research in oncology. It is implemented by the University of Pavia and the IRCCS Fondazione Maugeri hospital (FSM), and grounded on the software developed by the Informatics for Integrating Biology and the Bedside (i2b2) research center. i2b2 has delivered an open source suite based on a data warehouse, which is efficiently interrogated to find sets of interesting patients through a query tool interface. Methods: ONCO-i2b2 integrates data coming from multiple sources and allows users to jointly query them. i2b2 data are stored in a data warehouse, where facts are hierarchically structured as ontologies. ONCO-i2b2 gathers data from the FSM pathology unit (PU) database and from the hospital biobank and merges them with the clinical information from the hospital information system. Our main effort was to provide a robust integrated research environment, with particular emphasis on the integration process, which faced several challenges: biospecimen sample privacy and anonymization; synchronization of the biobank database with the i2b2 data warehouse through a series of Extract, Transform, Load (ETL) operations; and development and integration of a Natural Language Processing (NLP) module to retrieve coded information, such as SNOMED terms, malignant tumor (TNM) classifications, and clinical test results, from unstructured medical records. Furthermore, we have developed an internal SNOMED ontology based on the NCBO BioPortal web services. Results: ONCO-i2b2 manages data of more than 6,500 patients with a breast cancer diagnosis collected between 2001 and 2011 (over 390 of them have at least one biological sample in the cancer biobank), more than 47,000 visits, and 96,000 observations over 960 medical concepts. Conclusions: ONCO-i2b2 is a concrete example of how an integrated Information and Communication Technology architecture can be implemented to support translational research. The next steps of our project will involve extending its capabilities by implementing new plug-ins devoted to bioinformatics data analysis as well as a temporal query module.
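    As a rough illustration of the rule-based extraction such an NLP module might perform, the sketch below pulls TNM stage tokens out of free-text pathology reports; the regular expression and example text are assumptions, not the ONCO-i2b2 code.

```python
# Rough sketch of rule-based TNM extraction from free-text pathology reports,
# in the spirit of the NLP step described above (not the ONCO-i2b2 module).
# The regular expression covers common "pT2 pN1a M0"-style strings only.
import re

TNM_PATTERN = re.compile(
    r"\b[yp]?T(?P<T>[0-4Xis]+[a-c]?)\s*"
    r"[yp]?N(?P<N>[0-3X][a-c]?)\s*"
    r"[p]?M(?P<M>[01X])\b",
    re.IGNORECASE,
)

def extract_tnm(report_text):
    """Return a list of (T, N, M) tuples found in one report."""
    return [(m.group("T"), m.group("N"), m.group("M"))
            for m in TNM_PATTERN.finditer(report_text)]

example = "Invasive ductal carcinoma, grade 2. Staging: pT2 pN1a M0."
print(extract_tnm(example))   # [('2', '1a', '0')]
```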

    Clinical Bioinformatics: challenges and opportunities

    Background: Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and on their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011, was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods: In this editorial, we report the viewpoints and opinions of three world leaders in CBI, who were invited to participate in a panel discussion of the NETTAB workshop on the next challenges and future opportunities of this field. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data, and the implementation of novel tools for efficient search computing solutions. Results: Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, and integrated search and retrieval of -omics information. Conclusions: The goals of Clinical Bioinformatics are ambitious. Many factors, including the availability of high-throughput “-omics” technologies and equipment, the widespread availability of clinical data warehouses, and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research.

    Collaborative Cloud Computing Framework for Health Data with Open Source Technologies

    The proliferation of sensor technologies and advancements in data collection methods have enabled the accumulation of very large amounts of data. Increasingly, these datasets are considered for scientific research. However, designing a system architecture that achieves high performance in terms of parallelization, query processing time, and aggregation of heterogeneous data types (e.g., time series, images, and structured data), while keeping scientific research reproducible, remains a major challenge. This is especially true for health sciences research, where systems must be i) easy to use, with the flexibility to manipulate data at the most granular level, ii) agnostic of the programming language kernel, iii) scalable, and iv) compliant with the HIPAA privacy law. In this paper, we review the existing literature on such big data systems for scientific research in health sciences and identify gaps in the current system landscape. We propose a novel architecture for a software-hardware-data ecosystem using open source technologies such as Apache Hadoop, Kubernetes, and JupyterHub in a distributed environment. We also evaluate the system using a large clinical data set of 69M patients. Comment: This paper is accepted in ACM-BCB 202
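    The sketch below illustrates the notebook-side workload such an ecosystem is meant to serve: a PySpark aggregation over a hypothetical de-identified clinical table stored on HDFS. The path and column names are invented for illustration and are not taken from the paper.

```python
# Sketch of a notebook-side analysis on the proposed Hadoop/Kubernetes/JupyterHub
# stack: a PySpark aggregation over a large clinical table on HDFS. The HDFS
# path and column names are invented for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("cohort-aggregation")
         .getOrCreate())

# Read a (hypothetical) de-identified encounters table written as Parquet.
encounters = spark.read.parquet("hdfs:///warehouse/encounters.parquet")

# Count distinct patients per diagnosis code, in parallel across the cluster.
summary = (encounters
           .groupBy("diagnosis_code")
           .agg(F.countDistinct("patient_id").alias("n_patients"))
           .orderBy(F.desc("n_patients")))

summary.show(20)
spark.stop()
```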