4,233 research outputs found

    MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network

    Full text link
    The inability to interpret the model prediction in semantically and visually meaningful ways is a well-known shortcoming of most existing computer-aided diagnosis methods. In this paper, we propose MDNet to establish a direct multimodal mapping between medical images and diagnostic reports that can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention, to provide justifications of the network diagnosis process. MDNet includes an image model and a language model. The image model is proposed to enhance multi-scale feature ensembles and utilization efficiency. The language model, integrated with our improved attention mechanism, aims to read and explore discriminative image feature descriptions from reports to learn a direct mapping from sentence words to image pixels. The overall network is trained end-to-end by using our developed optimization strategy. Based on a pathology bladder cancer images and its diagnostic reports (BCIDR) dataset, we conduct sufficient experiments to demonstrate that MDNet outperforms comparative baselines. The proposed image model obtains state-of-the-art performance on two CIFAR datasets as well.Comment: CVPR2017 Ora

    Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing

    Full text link
    Objective:Develop and validate an algorithm for analyzing the layout of PDF clinical documents to improve the performance of downstream natural language processing tasks. Materials and Methods: We designed an algorithm to process clinical PDF documents and extract only clinically relevant text. The algorithm consists of several steps: initial text extraction using a PDF parser, followed by classification into categories such as body text, left notes, and footers using a Transformer deep neural network architecture, and finally an aggregation step to compile the lines of a given label in the text. We evaluated the technical performance of the body text extraction algorithm by applying it to a random sample of documents that were annotated. Medical performance was evaluated by examining the extraction of medical concepts of interest from the text in their respective sections. Finally, we tested an end-to-end system on a medical use case of automatic detection of acute infection described in the hospital report. Results:Our algorithm achieved per-line precision, recall, and F1 score of 98.4, 97.0, and 97.7, respectively, for body line extraction. The precision, recall, and F1 score per document for the acute infection detection algorithm were 82.54 (95CI 72.86-91.60), 85.24 (95CI 76.61-93.70), 83.87 (95CI 76, 92-90.08) with exploitation of the results of the advanced body extraction algorithm, respectively. Conclusion:We have developed and validated a system for extracting body text from clinical documents in PDF format by identifying their layout. We were able to demonstrate that this preprocessing allowed us to obtain better performances for a common downstream task, i.e., the extraction of medical concepts in their respective sections, thus proving the interest of this method on a clinical use case.Comment: 22 pages, 5 figure

    Building Data-Driven Pathways From Routinely Collected Hospital Data:A Case Study on Prostate Cancer

    Get PDF
    Background: Routinely collected data in hospitals is complex, typically heterogeneous, and scattered across multiple Hospital Information Systems (HIS). This big data, created as a byproduct of health care activities, has the potential to provide a better understanding of diseases, unearth hidden patterns, and improve services and cost. The extent and uses of such data rely on its quality, which is not consistently checked, nor fully understood. Nevertheless, using routine data for the construction of data-driven clinical pathways, describing processes and trends, is a key topic receiving increasing attention in the literature. Traditional algorithms do not cope well with unstructured processes or data, and do not produce clinically meaningful visualizations. Supporting systems that provide additional information, context, and quality assurance inspection are needed. Objective: The objective of the study is to explore how routine hospital data can be used to develop data-driven pathways that describe the journeys that patients take through care, and their potential uses in biomedical research; it proposes a framework for the construction, quality assessment, and visualization of patient pathways for clinical studies and decision support using a case study on prostate cancer. Methods: Data pertaining to prostate cancer patients were extracted from a large UK hospital from eight different HIS, validated, and complemented with information from the local cancer registry. Data-driven pathways were built for each of the 1904 patients and an expert knowledge base, containing rules on the prostate cancer biomarker, was used to assess the completeness and utility of the pathways for a specific clinical study. Software components were built to provide meaningful visualizations for the constructed pathways. Results: The proposed framework and pathway formalism enable the summarization, visualization, and querying of complex patient-centric clinical information, as well as the computation of quality indicators and dimensions. A novel graphical representation of the pathways allows the synthesis of such information. Conclusions: Clinical pathways built from routinely collected hospital data can unearth information about patients and diseases that may otherwise be unavailable or overlooked in hospitals. Data-driven clinical pathways allow for heterogeneous data (ie, semistructured and unstructured data) to be collated over a unified data model and for data quality dimensions to be assessed. This work has enabled further research on prostate cancer and its biomarkers, and on the development and application of methods to mine, compare, analyze, and visualize pathways constructed from routine data. This is an important development for the reuse of big data in hospitals

    A Semantic Cross-Species Derived Data Management Application

    Full text link
    Managing dynamic information in large multi-site, multi-species, and multi-discipline consortia is a challenging task for data management applications. Often in academic research studies the goals for informatics teams are to build applications that provide extract-transform-load (ETL) functionality to archive and catalog source data that has been collected by the research teams. In consortia that cross species and methodological or scientific domains, building interfaces that supply data in a usable fashion and make intuitive sense to scientists from dramatically different backgrounds increases the complexity for developers. Further, reusing source data from outside one's scientific domain is fraught with ambiguities in understanding the data types, analysis methodologies, and how to combine the data with those from other research teams. We report on the design, implementation, and performance of a semantic data management application to support the NIMH funded Conte Center at the University of California, Irvine. The Center is testing a theory of the consequences of "fragmented" (unpredictable, high entropy) early-life experiences on adolescent cognitive and emotional outcomes in both humans and rodents. It employs cross-species neuroimaging, epigenomic, molecular, and neuroanatomical approaches in humans and rodents to assess the potential consequences of fragmented unpredictable experience on brain structure and circuitry. To address this multi-technology, multi-species approach, the system uses semantic web techniques based on the Neuroimaging Data Model (NIDM) to facilitate data ETL functionality. We find this approach enables a low-cost, easy to maintain, and semantically meaningful information management system, enabling the diverse research teams to access and use the data

    Short- and Long-Term Changes in Attention, Memory and Brain Activity Following Exercise, Motor Learning, and Expertise

    Get PDF
    How humans perceive, embody, and execute actions has been an area of intense study in cognitive neuroscience, and these investigations shed light on how we adaptively learn from and interact with an ever-changing world. All of the knowledge associated with action, including sensorimotor representations, the words we use to describe them, and the memories that store this information, are represented in distributed brain regions that comprise knowledge schemas. With repeated practice or training, one can acquire a highly specialized motor repertoire that fosters even more efficient and adaptable behaviour to achieve peak performance. Using behavioural, EEG and fMRI approaches, I will present a series of investigations that evaluate the impact of short-term exercise and long-term dance practice on the development of expert knowledge schemas. In Chapters 2 and 3, I will demonstrate that activating one domain in the schema (e.g., action processing) will prime other domains (e.g., verbal attention and working memory) to induce translational performance improvements. Subsequent chapters will reveal how familiarity with a specific genre of dance influences behavioural (Chapter 3) and neurophysiological signatures of action perception, how these motor representations are coded in sensorimotor association areas (Chapters 4), and how they change with repeated practice and performance (Chapter 5). How these findings contribute to our model of expert knowledge schemas will be discussed in Chapter 6. These findings bear efficacy for the therapeutic application of exercise and dance programs to alleviate motor, cognitive and neurophysiological impairments in several clinical populations, including people with Parkinsons disease

    Quantitative analysis with machine learning models for multi-parametric brain imaging data

    Get PDF
    Gliomas are considered to be the most common primary adult malignant brain tumor. With the dramatic increases in computational power and improvements in image analysis algorithms, computer-aided medical image analysis has been introduced into clinical applications. Precision tumor grading and genotyping play an indispensable role in clinical diagnosis, treatment and prognosis. Gliomas diagnostic procedures include histopathological imaging tests, molecular imaging scans and tumor grading. Pathologic review of tumor morphology in histologic sections is the traditional method for cancer classification and grading, yet human study has limitations that can result in low reproducibility and inter-observer agreement. Compared with histopathological images, Magnetic resonance (MR) imaging present the different structure and functional features, which might serve as noninvasive surrogates for tumor genotypes. Therefore, computer-aided image analysis has been adopted in clinical application, which might partially overcome these shortcomings due to its capacity to quantitatively and reproducibly measure multilevel features on multi-parametric medical information. Imaging features obtained from a single modal image do not fully represent the disease, so quantitative imaging features, including morphological, structural, cellular and molecular level features, derived from multi-modality medical images should be integrated into computer-aided medical image analysis. The image quality differentiation between multi-modality images is a challenge in the field of computer-aided medical image analysis. In this thesis, we aim to integrate the quantitative imaging data obtained from multiple modalities into mathematical models of tumor prediction response to achieve additional insights into practical predictive value. Our major contributions in this thesis are: 1. Firstly, to resolve the imaging quality difference and observer-dependent in histological image diagnosis, we proposed an automated machine-learning brain tumor-grading platform to investigate contributions of multi-parameters from multimodal data including imaging parameters or features from Whole Slide Images (WSI) and the proliferation marker KI-67. For each WSI, we extract both visual parameters such as morphology parameters and sub-visual parameters including first-order and second-order features. A quantitative interpretable machine learning approach (Local Interpretable Model-Agnostic Explanations) was followed to measure the contribution of features for single case. Most grading systems based on machine learning models are considered “black boxes,” whereas with this system the clinically trusted reasoning could be revealed. The quantitative analysis and explanation may assist clinicians to better understand the disease and accordingly to choose optimal treatments for improving clinical outcomes. 2. Based on the automated brain tumor-grading platform we propose, multimodal Magnetic Resonance Images (MRIs) have been introduced in our research. A new imaging–tissue correlation based approach called RA-PA-Thomics was proposed to predict the IDH genotype. Inspired by the concept of image fusion, we integrate multimodal MRIs and the scans of histopathological images for indirect, fast, and cost saving IDH genotyping. The proposed model has been verified by multiple evaluation criteria for the integrated data set and compared to the results in the prior art. The experimental data set includes public data sets and image information from two hospitals. Experimental results indicate that the model provided improves the accuracy of glioma grading and genotyping

    The Cardiac Atlas Project--An Imaging Database for Computational Modeling and Statistical Atlases of the Heart

    Get PDF
    MOTIVATION: Integrative mathematical and statistical models of cardiac anatomy and physiology can play a vital role in understanding cardiac disease phenotype and planning therapeutic strategies. However, the accuracy and predictive power of such models is dependent upon the breadth and depth of noninvasive imaging datasets. The Cardiac Atlas Project (CAP) has established a large-scale database of cardiac imaging examinations and associated clinical data in order to develop a shareable, web-accessible, structural and functional atlas of the normal and pathological heart for clinical, research and educational purposes. A goal of CAP is to facilitate collaborative statistical analysis of regional heart shape and wall motion and characterize cardiac function among and within population groups. RESULTS: Three main open-source software components were developed: (i) a database with web-interface; (ii) a modeling client for 3D + time visualization and parametric description of shape and motion; and (iii) open data formats for semantic characterization of models and annotations. The database was implemented using a three-tier architecture utilizing MySQL, JBoss and Dcm4chee, in compliance with the DICOM standard to provide compatibility with existing clinical networks and devices. Parts of Dcm4chee were extended to access image specific attributes as search parameters. To date, approximately 3000 de-identified cardiac imaging examinations are available in the database. All software components developed by the CAP are open source and are freely available under the Mozilla Public License Version 1.1 (http://www.mozilla.org/MPL/MPL-1.1.txt)
    • …
    corecore