
    Foreword

    The aim of this workshop is to focus on building and evaluating resources used to facilitate biomedical text mining, including their design, update, delivery, quality assessment, evaluation and dissemination. Key resources of interest are lexical and knowledge repositories (controlled vocabularies, terminologies, thesauri, ontologies) and annotated corpora, including both task-specific resources and repositories reengineered from biomedical or general language resources. Of particular interest is the process of building annotated resources, including designing guidelines and annotation schemas (aiming at both syntactic and semantic interoperability) and relying on language engineering standards. Challenging aspects are updates and evolution management of resources, as well as their documentation, dissemination and evaluation.

    Cloud-Based Benchmarking of Medical Image Analysis

    Medical imaging

    Conceptualization of Computational Modeling Approaches and Interpretation of the Role of Neuroimaging Indices in Pathomechanisms for Pre-Clinical Detection of Alzheimer Disease

    With swift advancements in next-generation sequencing technologies and the voluminous growth of biological data, a variety of data resources such as databases and web services have been created to facilitate data management, accessibility, and analysis. However, interoperability between dynamically growing data resources is increasingly a rate-limiting step in biomedicine, specifically concerning neurodegeneration. Over the years, massive investments and technological advancements in dementia research have resulted in large proportions of unmined data. Accordingly, there is an essential need for intelligent and integrative approaches to mine available data and substantiate novel research outcomes. Semantic frameworks provide a unique possibility to integrate multiple heterogeneous, high-resolution data resources with semantic integrity, using standardized ontologies and vocabularies for context-specific domains. In this work, (i) the functionality of a semantically structured terminology for mining pathway-relevant knowledge from the literature, called the Pathway Terminology System, is demonstrated, and (ii) a context-specific, high-granularity semantic framework for neurodegenerative diseases, known as NeuroRDF, is presented. Neurodegenerative disorders are especially complex, as they are characterized by widespread manifestations and the potential for dramatic alterations in disease progression over time. Early detection and prediction strategies based on clinical pointers can provide promising solutions for effective treatment of Alzheimer's disease (AD). We present the importance of bridging the gap between clinical and molecular biomarkers to contribute effectively to dementia research. Moreover, we address the need for a formalized framework, called NIFT, to automatically mine relevant clinical knowledge from the literature for substantiating high-resolution cause-and-effect models.
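The integration idea behind a semantic framework such as NeuroRDF can be illustrated with a minimal sketch: facts from heterogeneous sources are represented as subject-predicate-object triples and queried by pattern matching. The predicates, gene symbols, and source identifiers below are invented for illustration and are not taken from the actual NeuroRDF schema.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple,
# as in RDF; None in a query pattern acts as a wildcard.

def match(triples, s=None, p=None, o=None):
    """Return all triples matching the given pattern (None = wildcard)."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Hypothetical facts aggregated from two different resources.
triples = [
    ("APP",  "associated_with", "Alzheimer_disease"),
    ("APP",  "reported_in",     "source:database_A"),
    ("MAPT", "associated_with", "Alzheimer_disease"),
    ("MAPT", "reported_in",     "source:literature_B"),
]

# Integrated query: which entities are associated with Alzheimer's
# disease, regardless of which source reported them?
genes = sorted({s for s, _, _ in
                match(triples, p="associated_with", o="Alzheimer_disease")})
```

In a real deployment the triples would live in an RDF store and the pattern matching would be expressed in SPARQL; the point here is only that a shared triple vocabulary lets facts from independent sources be queried uniformly.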

    Multimodal information retrieval in medical imaging repositories (Recuperação de informação multimodal em repositórios de imagem médica)

    The proliferation of digital medical imaging modalities in hospitals and other diagnostic facilities has created huge repositories of valuable data, often not fully explored. Moreover, the past few years show a growing trend of data production. As such, studying new ways to index, process and retrieve medical images becomes an important subject to be addressed by the wider community of radiologists, scientists and engineers. Content-based image retrieval, which encompasses various methods, can exploit the visual information of a medical imaging archive, and is known to be beneficial to practitioners and researchers. However, the integration of the latest systems for medical image retrieval into clinical workflows is still rare, and their effectiveness still shows room for improvement. This thesis proposes solutions and methods for multimodal information retrieval in the context of medical imaging repositories. The major contributions are: a search engine for medical imaging studies supporting multimodal queries in an extensible archive; a framework for automated labeling of medical images for content discovery; and an assessment and proposal of feature learning techniques for concept detection from medical images, exhibiting greater potential than the feature extraction algorithms pertinently used in similar tasks. These contributions, each in their own dimension, seek to narrow the scientific and technical gap towards the development and adoption of novel multimodal medical image retrieval systems, so that they ultimately become part of the workflows of medical practitioners, teachers, and researchers in healthcare.
    Programa Doutoral em Informática
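The retrieval step of a content-based image retrieval system, as discussed above, can be sketched in a few lines: each image is reduced to a feature vector (hand-made toy vectors here; in practice learned representations, as the thesis argues) and the archive is ranked by cosine similarity to the query. Study identifiers and vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy archive: study identifier -> feature vector.
archive = {
    "study_001": [0.90, 0.10, 0.00],
    "study_002": [0.10, 0.80, 0.10],
    "study_003": [0.85, 0.15, 0.05],
}

# Query-by-example: rank the archive by similarity to the query vector.
query = [1.0, 0.0, 0.0]
ranked = sorted(archive, key=lambda k: cosine(archive[k], query),
                reverse=True)
```

Real systems replace the linear scan with an approximate nearest-neighbour index, but the ranking principle is the same.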

    Federated architectures for biomedical data integration (Arquiteturas federadas para integração de dados biomédicos)

    The last decades have been characterized by a continuous adoption of IT solutions in the healthcare sector, which has resulted in the proliferation of tremendous amounts of data over heterogeneous systems. Distinct data types are currently generated, manipulated, and stored across the several institutions where patients are treated. Data sharing and integrated access to this information will allow extracting relevant knowledge that can lead to better diagnostics and treatments. This thesis proposes new integration models for gathering information and extracting knowledge from multiple, heterogeneous biomedical sources. The complexity of the scenario led us to split the integration problem according to data type and usage specificity. The first contribution is a cloud-based architecture for exchanging medical imaging services. It offers a simplified registration mechanism for providers and services, promotes remote data access, and facilitates the integration of distributed data sources. Moreover, it is compliant with international standards, ensuring the platform's interoperability with current medical imaging devices. The second proposal is a sensor-based architecture for the integration of electronic health records. It follows a federated integration model and aims to provide a scalable solution to search and retrieve data from multiple information systems. The last contribution is an open architecture for gathering patient-level data from dispersed and heterogeneous databases. All the proposed solutions were deployed and validated in real-world use cases.
    Doutoramento em Ciências da Computação
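The federated integration model described above can be sketched minimally: a broker fans a query out to independent source adapters, each of which keeps its own data, and merges the patient-level results. Source names, patient identifiers, and record fields below are invented for illustration.

```python
# Each adapter wraps one autonomous source; the broker never copies the
# underlying databases, it only merges per-query results.

def query_source_a(patient_id):
    data = {"p1": {"diagnosis": "T2DM"}}          # toy clinical source
    return data.get(patient_id, {})

def query_source_b(patient_id):
    data = {"p1": {"medication": "metformin"}}    # toy pharmacy source
    return data.get(patient_id, {})

def federated_query(patient_id, sources):
    """Fan the query out to every source and merge the partial records."""
    merged = {}
    for source in sources:
        merged.update(source(patient_id))
    return merged

record = federated_query("p1", [query_source_a, query_source_b])
```

A production system would add per-source authentication, schema mapping to a common model, and conflict resolution when sources disagree; this sketch shows only the fan-out-and-merge core of the federated approach.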

    Development of a text mining approach to disease network discovery

    Scientific literature is one of the major sources of knowledge for systems biology, in the form of papers, patents and other types of written reports. Text mining methods aim at automatically extracting relevant information from the literature. The hypothesis of this thesis was that biological systems could be elucidated by the development of text mining solutions that can automatically extract relevant information from documents. The first objective consisted in developing software components to recognize biomedical entities in text, which is the first step to generate a network about a biological system. To this end, a machine learning solution was developed, which can be trained for specific biological entities using an annotated dataset, obtaining high-quality results. Additionally, a rule-based solution was developed, which can be easily adapted to various types of entities. The second objective consisted in developing an automatic approach to link the recognized entities to a reference knowledge base. A solution based on the PageRank algorithm was developed in order to match the entities to the concepts that most contribute to the overall coherence. The third objective consisted in automatically extracting relations between entities, to generate knowledge graphs about biological systems. Due to the lack of annotated datasets available for this task, distant supervision was employed to train a relation classifier on a corpus of documents and a knowledge base. The applicability of this approach was demonstrated in two case studies: microRNA-gene relations for cystic fibrosis, obtaining a network of 27 relations using the abstracts of 51 recently published papers; and cell-cytokine relations for tolerogenic cell therapies, obtaining a network of 647 relations from 3264 abstracts. Through a manual evaluation, the information contained in these networks was determined to be relevant.
    Additionally, a solution combining deep learning techniques with ontology information was developed, to take advantage of the domain knowledge provided by ontologies. This thesis contributed several solutions that demonstrate the usefulness of text mining methods to systems biology by extracting domain-specific information from the literature. These solutions make it easier to integrate various areas of research, leading to a better understanding of biological systems.
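The PageRank-based linking idea mentioned above can be sketched as follows: candidate concepts for the mentions in a document form a graph whose edges encode relatedness, PageRank is run over it, and the highest-scoring candidates are taken as the most coherent interpretation. The graph, concept names, and power-iteration implementation below are illustrative, not the thesis's actual system.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over an adjacency-list graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {}
        for n in nodes:
            # Sum the rank flowing in from every node that links to n.
            incoming = sum(rank[m] / len(graph[m])
                           for m in nodes if n in graph[m])
            new[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new
    return rank

# Candidate concepts for the mentions in a toy abstract; mutually
# coherent candidates link to each other.
graph = {
    "CFTR_gene":       ["cystic_fibrosis"],
    "CFTR_protein":    ["cystic_fibrosis"],
    "cystic_fibrosis": ["CFTR_gene", "CFTR_protein"],
}

scores = pagerank(graph)
best = max(scores, key=scores.get)
```

Because "cystic_fibrosis" receives links from both gene candidates, it accumulates the most rank, matching the intuition that the concept most connected to the rest of the document contributes most to overall coherence.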

    An Autoethnographic Account of Innovation at the US Department of Veterans Affairs

    The history of the U.S. Department of Veterans Affairs (VA) health information technology (HIT) has been characterized by both enormous successes and catastrophic failures. While the VA was once hailed as the way to the future of twenty-first-century health care, many programs have been mismanaged, delayed, or flawed, resulting in the waste of hundreds of millions of taxpayer dollars. Since 2015 the U.S. Government Accountability Office (GAO) has designated HIT at the VA as being susceptible to waste, fraud, and mismanagement. The timely central research question I ask in this study is: can healthcare IT at the VA be healed? To address this question, I investigate a HIT case study at the VA Center of Innovation (VACI), originally designed to be the flagship initiative of the open government transformation at the VA. The Open Source Electronic Health Record Alliance (OSEHRA) was designed to promote an open innovation ecosystem through public-private-academic partnership. Based on my fifteen years of experience at the VA, I use an autoethnographic methodology to make a significant value-added contribution to understanding and modeling the VA's approach to innovation. I use several theoretical information system framework models, including People, Process, and Technology (PPT), Technology, Organization and Environment (TOE), and the Technology Adaptive Model (TAM), and propose a new adaptive theory to understand the inability of VA HIT to innovate. From the perspective of people and culture, I study retaliation against whistleblowers, organizational behavioral integrity, and lack of transparency in communications. I examine the VA's processes, including the different software development methodologies used and the development and operations (DevOps) process of an open-source application developed at VACI, the Radiology Protocol Tool Recorder (RAPTOR), a Veterans Health Information Systems and Technology Architecture (VistA) radiology workflow module.
    I find that the VA has chosen to migrate away from in-house application software and to buy commercial software. The impact of these People, Process, and Technology findings is representative of larger systemic failings, and they are appropriate examples of the systemic issues associated with IT innovation at the VA. This autoethnographic account builds on first-hand project experience and literature-based insights.

    Tracking expertise profiles in community-driven and evolving knowledge curation platforms


    A Life Cycle Approach to the Development and Validation of an Ontology of the U.S. Common Rule (45 C.F.R. § 46)

    Requirements for the protection of human research subjects stem directly from federal regulation by the Department of Health and Human Services in Title 45 of the Code of Federal Regulations (C.F.R.), part 46. Fifteen other federal agencies include subpart A of part 46 verbatim in their own bodies of regulation; hence 45 C.F.R. part 46 subpart A has colloquially come to be called the 'Common Rule.' The overall motivation for this study began as a desire to facilitate the ethical sharing of biospecimen samples from large biospecimen collections by using ontologies. Previous work demonstrated that, in general, the informed consent process and subsequent decision making about data and specimen release still rely heavily on paper-based informed consent forms and processes. Consequently, well-validated computable models are needed to provide an enhanced foundation for data sharing. This dissertation describes the development and validation of a Common Rule Ontology (CRO), expressed in the OWL 2 Web Ontology Language, intended to provide a computable semantic knowledge model for assessing and representing components of the information artifacts required as part of regulated research under 45 C.F.R. § 46. I examine whether the alignment of this ontology with the Basic Formal Ontology and other ontologies from the Open Biomedical Ontology (OBO) Foundry provides a good fit for the regulatory aspects of the Common Rule Ontology. The dissertation also examines and proposes a new method for the ongoing evaluation of ontologies such as the CRO across the ontology development lifecycle, and suggests methods to achieve high-quality, validated ontologies. While the CRO is not in itself intended to be a complete solution to the data- and specimen-sharing problems outlined above, it is intended to produce a well-validated, computationally grounded framework upon which others can build.
    This model can be used in future work to build decision support systems that assist Institutional Review Boards (IRBs), regulatory personnel, honest brokers, tissue bank managers, and other individuals in the decision-making process involving biorepository specimen and data sharing.
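The kind of computable check an ontology like the CRO enables can be illustrated with a minimal subsumption query over an asserted class hierarchy: software can then ask whether one regulatory concept specializes another. The class names below are hypothetical stand-ins, not the CRO's actual terms, and a real system would use an OWL reasoner rather than this toy walk.

```python
# Toy asserted hierarchy: child class -> direct parent class.
parents = {
    "ParentalPermission":      "InformedConsentDocument",
    "InformedConsentDocument": "RegulatoryDocument",
}

def is_subclass(cls, ancestor):
    """Walk the parent chain to decide whether cls specializes ancestor."""
    while cls in parents:
        cls = parents[cls]
        if cls == ancestor:
            return True
    return False

# A parental permission form counts as a regulatory document by
# transitivity through the informed consent document class.
ok = is_subclass("ParentalPermission", "RegulatoryDocument")
```

Decision support built on such a model can route an artifact to the right review requirements purely from its asserted class, which is the practical payoff of grounding the regulation in a formal hierarchy.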