6,030 research outputs found

    Towards structured sharing of raw and derived neuroimaging data across existing resources

    Full text link
    Data sharing efforts increasingly contribute to the acceleration of scientific discovery. Neuroimaging data is accumulating in distributed domain-specific databases and there is currently no integrated access mechanism nor an accepted format for the critically important meta-data that is necessary for making use of the combined, available neuroimaging data. In this manuscript, we present work from the Derived Data Working Group, an open-access group sponsored by the Biomedical Informatics Research Network (BIRN) and the International Neuroimaging Coordinating Facility (INCF) focused on practical tools for distributed access to neuroimaging data. The working group develops models and tools facilitating the structured interchange of neuroimaging meta-data and is making progress towards a unified set of tools for such data and meta-data exchange. We report on the key components required for integrated access to raw and derived neuroimaging data as well as associated meta-data and provenance across neuroimaging resources. The components include (1) a structured terminology that provides semantic context to data, (2) a formal data model for neuroimaging with robust tracking of data provenance, (3) a web service-based application programming interface (API) that provides a consistent mechanism to access and query the data model, and (4) a provenance library that can be used for the extraction of provenance data by image analysts and imaging software developers. We believe that the framework and set of tools outlined in this manuscript have great potential for solving many of the issues the neuroimaging community faces when sharing raw and derived neuroimaging data across the various existing database systems for the purpose of accelerating scientific discovery

    Ontology-based knowledge representation of experiment metadata in biological data mining

    Get PDF
    According to the PubMed resource from the U.S. National Library of Medicine, over 750,000 scientific articles have been published in the ~5000 biomedical journals worldwide in the year 2007 alone. The vast majority of these publications include results from hypothesis-driven experimentation in overlapping biomedical research domains. Unfortunately, the sheer volume of information being generated by the biomedical research enterprise has made it virtually impossible for investigators to stay aware of the latest findings in their domain of interest, let alone to be able to assimilate and mine data from related investigations for purposes of meta-analysis. While computers have the potential for assisting investigators in the extraction, management and analysis of these data, information contained in the traditional journal publication is still largely unstructured, free-text descriptions of study design, experimental application and results interpretation, making it difficult for computers to gain access to the content of what is being conveyed without significant manual intervention. In order to circumvent these roadblocks and make the most of the output from the biomedical research enterprise, a variety of related standards in knowledge representation are being developed, proposed and adopted in the biomedical community. In this chapter, we will explore the current status of efforts to develop minimum information standards for the representation of a biomedical experiment, ontologies composed of shared vocabularies assembled into subsumption hierarchical structures, and extensible relational data models that link the information components together in a machine-readable and human-useable framework for data mining purposes

    Doctor of Philosophy

    Get PDF
    dissertationClinical research plays a vital role in producing knowledge valuable for understanding human disease and improving healthcare quality. Human subject protection is an obligation essential to the clinical research endeavor, much of which is governed by federal regulations and rules. Institutional Review Boards (IRBs) are responsible for overseeing human subject research to protect individuals from harm and to preserve their rights. Researchers are required to submit and maintain an IRB application, which is an important component in the clinical research process that can significantly affect the timeliness and ethical quality of the study. As clinical research has expanded in both volume and scope over recent years, IRBs are facing increasing challenges in providing efficient and effective oversight. The Clinical Research Informatics (CRI) domain has made significant efforts to support various aspects of clinical research through developing information systems and standards. However, information technology use by IRBs has not received much attention from the CRI community. This dissertation project analyzed over 100 IRB application systems currently used at major academic institutions in the United States. The varieties of system types and lack of standardized application forms across institutions are discussed in detail. The need for building an IRB domain analysis model is identified. . iv In this dissertation, I developed an IRB domain analysis model with a special focus on promoting interoperability among CRI systems to streamline the clinical research workflow. The model was evaluated by a comparison with five real-world IRB application systems. Finally, a prototype implementation of the model was demonstrated by the integration of an electronic IRB system with a health data query system. This dissertation project fills a gap in the research of information technology use for the IRB oversight domain. Adoption of the IRB domain analysis model has potential to enhance efficient and high-quality ethics oversight and to streamline the clinical research workflow

    Design of dimensional model for clinical data storage and analysis

    Get PDF
    Current research in the field of Life and Medical Sciences is generating chunk of data on daily basis. It has thus become a necessity to find solutions for efficient storage of this data, trying to correlate and extract knowledge from it. Clinical data generated in Hospitals, Clinics & Diagnostics centers is falling under a similar paradigm. Patient’s records in various hospitals are increasing at an exponential rate, thus adding to the problem of data management and storage. Major problem being faced corresponding to storage, is the varied dimensionality of the data, ranging from images to numerical form. Therefore there is a need for development of efficient data model which can handle this multi-dimensionality data issue and store the data with historical aspect. For the stated problem lying in façade of clinical informatics we propose a clinical dimensional model design which can be used for development of a clinical data mart. The model has been designed keeping in consideration temporal storage of patient's data with respect to all possible clinical parameters which can include both textual and image based data. Availability of said data for each patient can be then used for application of data mining techniques for finding the correlation of all the parameters at the level of individual and population

    Master of Science

    Get PDF
    thesisData quality has become a significant issue in healthcare as large preexisting databases are integrated to provide greater depth for research and process improvement. Large scale data integration exposes and compounds data quality issues latent in source systems. Although the problems related to data quality in transactional databases have been identified and well-addressed, the application of data quality constraints to large scale data repositories has not and requires novel applications of traditional concepts and methodologies. Despite an abundance of data quality theory, tools and software, there is no consensual technique available to guide developers in the identification of data integrity issues and the application of data quality rules in warehouse-type applications. Data quality measures are frequently developed on an ad hoc basis or methods designed to assure data quality in transactional systems are loosely applied to analytic data stores. These measures are inadequate to address the complex data quality issues in large, integrated data repositories particularly in the healthcare domain with its heterogeneous source systems. This study derives a taxonomy of data quality rules from relational database theory. It describes the development and implementation of data quality rules in the Analytic Health Repository at Intermountain Healthcare and situates the data quality rules in the taxonomy. Further, it identifies areas in which more rigorous data quality iv should be explored. This comparison demonstrates the superiority of a structured approach to data quality rule identification

    Controlled vocabularies and semantics in systems biology

    Get PDF
    The use of computational modeling to describe and analyze biological systems is at the heart of systems biology. Model structures, simulation descriptions and numerical results can be encoded in structured formats, but there is an increasing need to provide an additional semantic layer. Semantic information adds meaning to components of structured descriptions to help identify and interpret them unambiguously. Ontologies are one of the tools frequently used for this purpose. We describe here three ontologies created specifically to address the needs of the systems biology community. The Systems Biology Ontology (SBO) provides semantic information about the model components. The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of systems biology models, their characterization and interrelationships. The Terminology for the Description of Dynamics (TEDDY) categorizes dynamical features of the simulation results and general systems behavior. The provision of semantic information extends a model's longevity and facilitates its reuse. It provides useful insight into the biology of modeled processes, and may be used to make informed decisions on subsequent simulation experiments

    Personalized Biomedical Data Integration

    Get PDF

    An overview of decision table literature 1982-1995.

    Get PDF
    This report gives an overview of the literature on decision tables over the past 15 years. As much as possible, for each reference, an author supplied abstract, a number of keywords and a classification are provided. In some cases own comments are added. The purpose of these comments is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country or origin (not necessarily country of publication) and the language of the document. After a description of the scope of the interview, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.

    The Chemical Information Ontology: Provenance and Disambiguation for Chemical Data on the Biological Semantic Web

    Get PDF
    Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect in the application of cheminformatics in these fields is the accurate exchange of data, which is increasingly accomplished through the use of ontologies. Ontologies are formal representations of objects and their properties using a logic-based ontology language. Many such ontologies are currently being developed to represent objects across all the domains of science. Ontologies enable the definition, classification, and support for querying objects in a particular domain, enabling intelligent computer applications to be built which support the work of scientists both within the domain of interest and across interrelated neighbouring domains. Modern chemical research relies on computational techniques to filter and organise data to maximise research productivity. The objects which are manipulated in these algorithms and procedures, as well as the algorithms and procedures themselves, enjoy a kind of virtual life within computers. We will call these information entities. Here, we describe our work in developing an ontology of chemical information entities, with a primary focus on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context. Our ontology distinguishes algorithmic, or procedural information from declarative, or factual information, and renders of particular importance the annotation of provenance to calculated data. The Chemical Information Ontology is being developed as an open collaborative project. More details, together with a downloadable OWL file, are available at http://code.google.com/p/semanticchemistry/ (license: CC-BY-SA)
    corecore