
    AI-based medical e-diagnosis for fast and automatic ventricular volume measurement in patients with normal pressure hydrocephalus

    Based on CT and MRI images acquired from normal pressure hydrocephalus (NPH) patients, we aim to establish a multimodal, high-performance automatic ventricle segmentation method using machine learning, enabling efficient and accurate automatic measurement of the ventricular volume. First, we extract the brain CT and MRI images of 143 definite NPH patients. Second, we manually label the ventricular volume (VV) and intracranial volume (ICV). Then, we use machine learning to extract features and establish an automatic ventricle segmentation model. Finally, we verify the reliability of the model and achieve automatic measurement of VV and ICV. In CT images, the Dice similarity coefficient (DSC), intraclass correlation coefficient (ICC), Pearson correlation, and Bland–Altman analysis of the automatic versus manual segmentation of the VV were 0.95, 0.99, 0.99, and 4.2 ± 2.6, respectively; the corresponding results for the ICV were 0.96, 0.99, 0.99, and 6.0 ± 3.8. The whole process takes 3.4 ± 0.3 s. In MRI images, the DSC, ICC, Pearson correlation, and Bland–Altman analysis for the VV were 0.94, 0.99, 0.99, and 2.0 ± 0.6, respectively; the corresponding results for the ICV were 0.93, 0.99, 0.99, and 7.9 ± 3.8. The whole process takes 1.9 ± 0.1 s. We have established a multimodal, high-performance automatic ventricle segmentation method that achieves efficient and accurate automatic measurement of the ventricular volume of NPH patients. This can help clinicians quickly and accurately assess the state of an NPH patient's ventricles.
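    This is not the paper's code, but a minimal NumPy sketch of the reported evaluation quantities: the Dice similarity coefficient between an automatic and a manual binary mask, and a volume estimate given the voxel size. Function names and the placeholder masks are illustrative.

```python
import numpy as np

def dice_coefficient(auto_mask: np.ndarray, manual_mask: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    auto = auto_mask.astype(bool)
    manual = manual_mask.astype(bool)
    intersection = np.logical_and(auto, manual).sum()
    total = auto.sum() + manual.sum()
    return 2.0 * intersection / total if total > 0 else 1.0

def volume_ml(mask: np.ndarray, voxel_volume_mm3: float) -> float:
    """Volume of a binary mask in millilitres (1 mL = 1000 mm^3)."""
    return mask.astype(bool).sum() * voxel_volume_mm3 / 1000.0

# Placeholder masks standing in for the automatic and manual VV segmentations.
rng = np.random.default_rng(0)
auto = rng.random((64, 64, 64)) > 0.5
manual = rng.random((64, 64, 64)) > 0.5
print(dice_coefficient(auto, manual), volume_ml(auto, voxel_volume_mm3=1.0))
```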

    OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue

    Large multimodal language models (LMMs) have achieved significant success in general domains. However, because medical image–text data differ substantially from general web content, the performance of LMMs in medical scenarios is limited. In ophthalmology, clinical diagnosis relies on multiple modalities of medical images, yet multimodal ophthalmic large language models have not been explored to date. In this paper, we study and construct an ophthalmic large multimodal model. First, we use fundus images as an entry point to build a disease assessment and diagnosis pipeline, achieving common ophthalmic disease diagnosis and lesion segmentation. Then, we establish a new ophthalmic multimodal instruction-following and dialogue fine-tuning dataset based on disease-related knowledge data and publicly available real-world medical dialogues. We introduce visual capability into the large language model to complete the ophthalmic large language and vision assistant (OphGLM). Our experimental results demonstrate that OphGLM performs exceptionally well and has the potential to revolutionize clinical applications in ophthalmology. The dataset, code, and models will be made publicly available at https://github.com/ML-AILab/OphGLM. Comment: OphGLM: the first ophthalmology large language-and-vision assistant based on instructions and dialogue.
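    The abstract does not describe the dataset schema; purely as an illustration, the sketch below shows what a single multimodal instruction-and-dialogue fine-tuning record might look like. All field names and file paths are hypothetical, not the OphGLM format.

```python
import json

# Hypothetical schema for one multimodal instruction-following sample;
# the actual OphGLM dataset layout is not specified in the abstract.
sample = {
    "image": "fundus/patient_0001.png",  # path to a fundus photograph
    "instruction": "Describe any signs of diabetic retinopathy in this image.",
    "dialogue": [
        {"role": "user", "content": "Are there microaneurysms visible?"},
        {"role": "assistant", "content": "Yes, scattered microaneurysms are visible near the macula."},
    ],
    "labels": {"disease": "diabetic_retinopathy", "lesion_mask": "masks/patient_0001_lesions.png"},
}

# Append the record to a JSON-lines fine-tuning file.
with open("ophthalmic_instructions.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(sample) + "\n")
```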

    DeepImageTranslator V2: Analysis of Multimodal Medical Images using Semantic Segmentation Maps Generated through Deep Learning

    Introduction: Analysis of multimodal medical images often requires the selection of one or many anatomical regions of interest (ROIs) for the extraction of useful statistics. This task can prove laborious when a manual approach is used. We have previously developed a user-friendly software tool for image-to-image translation using deep learning. Here we present an update to the DeepImageTranslator V2 software that adds a tool for multimodal medical image segmentation analysis (hereafter referred to as the MMMISA). Methods: The MMMISA was implemented using the Tkinter library; backend computations were implemented using the Pydicom, Numpy, and OpenCV libraries. We tested our software using 4188 slices from whole-body axial 2-deoxy-2-[¹⁸F]-fluoroglucose positron emission tomography/computed tomography ([¹⁸F]-FDG-PET/CT) scans of 10 patients from the American College of Radiology Imaging Network-Head and Neck Squamous Cell Carcinoma (ACRIN-HNSCC) database. Using the deep learning software DeepImageTranslator, a model was trained with 36 randomly selected CT slices and manually labelled semantic segmentation maps. With the trained model, all CT scans of the 10 HNSCC patients were segmented with high accuracy. Segmentation maps generated by the deep convolutional network were then used to measure organ-specific [¹⁸F]-FDG uptake. We also compared measurements performed using the MMMISA with those made using manually selected ROIs. Results: The MMMISA allows users to select ROIs based on deep learning-generated segmentation maps and to compute accurate statistics for these ROIs from coregistered multimodal images. We found that organ-specific [¹⁸F]-FDG uptake measured using multiple manually selected ROIs is concordant with whole-tissue measurements made with segmentation maps using the MMMISA tool. DOI: 10.28991/HIJ-2022-03-03-07
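    The MMMISA source is not shown here; the following is a small NumPy sketch, under the assumption of coregistered PET and segmentation volumes on a common voxel grid, of how uptake can be averaged inside a deep-learning-generated segmentation label. The label value and placeholder arrays are illustrative.

```python
import numpy as np

def mean_uptake(pet_volume: np.ndarray, segmentation: np.ndarray, label: int) -> float:
    """Mean PET intensity inside the voxels assigned to `label` in the segmentation map.

    Assumes `pet_volume` and `segmentation` are already coregistered and resampled
    to the same voxel grid, as a PET/CT workflow of this kind presumes.
    """
    roi = segmentation == label
    if not roi.any():
        raise ValueError(f"label {label} not present in segmentation map")
    return float(pet_volume[roi].mean())

# Placeholder data standing in for a coregistered [18F]-FDG-PET volume and a
# deep-learning-generated segmentation map (label 3 = hypothetical organ class).
rng = np.random.default_rng(42)
pet = rng.gamma(shape=2.0, scale=1.5, size=(8, 64, 64))
seg = rng.integers(0, 5, size=(8, 64, 64))
print(f"Mean uptake in ROI 3: {mean_uptake(pet, seg, label=3):.3f}")
```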

    Bayesian Transductive Markov Random Fields for Interactive Segmentation in Retinal Disorders

    In the realm of computer-aided diagnosis (CAD), interactive segmentation schemes have been well received by physicians, where the combination of human and machine intelligence can provide improved segmentation efficacy with minimal expert intervention [1-3]. Transductive learning (TL), or semi-supervised learning (SSL), is a suitable framework for learning-based interactive segmentation given the scarcity of labels. In this paper we present extended work on Bayesian transduction and regularized conditional mixtures for interactive segmentation [3]. We present a Markov random field model integrating a semi-parametric conditional mixture model within a Bayesian transductive learning and inference setting. The model allows efficient learning and inference in a semi-supervised setting given only minimal, approximate label information. Preliminary experimental results on multimodal images of retinal disorders such as drusen, geographic atrophy (GA), and choroidal neovascularisation (CNV) with exudates and subretinal fibrosis show promising segmentation performance.
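    The authors' Bayesian transductive MRF is not reproduced here; as a simpler stand-in for the transductive setting it addresses (a few user-provided labels propagated to all pixels), the sketch below uses scikit-learn's LabelSpreading on per-pixel intensity features. The toy data and class labels are illustrative assumptions.

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

# Toy image: two intensity populations standing in for lesion vs. background pixels.
rng = np.random.default_rng(0)
intensities = np.concatenate([rng.normal(0.3, 0.05, 500), rng.normal(0.7, 0.05, 500)])
features = intensities.reshape(-1, 1)         # one feature per pixel (intensity)

# Transductive setting: only a handful of pixels carry labels (-1 = unlabeled).
labels = np.full(features.shape[0], -1)
labels[:5] = 0                                # a few background scribbles
labels[500:505] = 1                           # a few lesion scribbles

model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(features, labels)
predicted = model.transduction_               # labels inferred for every pixel
print("pixels assigned to the lesion class:", int((predicted == 1).sum()))
```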

    A variational joint segmentation and registration framework for multimodal images

    Image segmentation and registration are closely related image processing techniques and are often required as simultaneous tasks. In this work, we introduce an optimization-based approach to a joint registration and segmentation model for multimodal image deformation. The model combines an active contour variational term with a mutual information (MI) smoothing fitting term, thereby addressing the difficulties of simultaneously performing segmentation and registration on multimodal images. This combination takes into account image structure boundaries and the movement of objects, leading to a robust dynamic scheme that links object boundary information as it changes over time. Comparison of our model with the state of the art shows that our method leads to more consistent registrations and more accurate results.
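    The variational model itself is not shown here; the sketch below only illustrates the mutual information (MI) fitting quantity that drives multimodal alignment, estimated from a joint intensity histogram with NumPy. The synthetic image pair and the bin count are assumptions.

```python
import numpy as np

def mutual_information(img_a: np.ndarray, img_b: np.ndarray, bins: int = 32) -> float:
    """Mutual information between two images, estimated from their joint histogram."""
    joint_hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_joint = joint_hist / joint_hist.sum()
    p_a = p_joint.sum(axis=1)                 # marginal of image A
    p_b = p_joint.sum(axis=0)                 # marginal of image B
    nonzero = p_joint > 0
    return float(np.sum(p_joint[nonzero] *
                        np.log(p_joint[nonzero] / np.outer(p_a, p_b)[nonzero])))

# Placeholder multimodal pair: same underlying structure, different intensity mappings.
rng = np.random.default_rng(1)
structure = rng.random((128, 128))
img_mod1 = structure                                          # stands in for one modality
img_mod2 = np.exp(-3 * structure) + 0.01 * rng.standard_normal((128, 128))
print(f"MI of aligned pair:      {mutual_information(img_mod1, img_mod2):.3f}")
print(f"MI after misalignment:   {mutual_information(img_mod1, np.roll(img_mod2, 20, axis=1)):.3f}")
```

    As expected for an MI fitting term, the aligned pair scores higher than the shifted pair, which is what a registration optimizer exploits.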

    Recurrent Multimodal Interaction for Referring Image Segmentation

    In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e. referring expressions. Existing works tackle this problem by first modeling images and sentences independently and then segmenting images by combining these two types of representations. We argue that learning word-to-image interaction is more native in the sense of jointly modeling the two modalities for the image segmentation task, and we propose a convolutional multimodal LSTM to encode the sequential interactions between individual words, visual information, and spatial information. We show that our proposed model outperforms the baseline model on benchmark datasets. In addition, we analyze the intermediate output of the proposed multimodal LSTM approach and empirically explain how this approach enforces a more effective word-to-image interaction. Comment: To appear in ICCV 2017. See http://www.cs.jhu.edu/~cxliu/ for code and supplementary material.
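    The paper's exact architecture is not reproduced here; the following PyTorch sketch conveys the general idea of a convolutional multimodal LSTM cell, where each word embedding is tiled over the spatial grid and fused with image features before a per-pixel prediction. The dimensions, tiling scheme, and output head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: gates are computed with 2-D convolutions,
    so the hidden state and memory keep a spatial layout (H x W)."""

    def __init__(self, in_channels: int, hidden_channels: int, kernel_size: int = 3):
        super().__init__()
        self.gates = nn.Conv2d(in_channels + hidden_channels, 4 * hidden_channels,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c

# Toy run: at each step a word embedding is tiled over the spatial grid and
# concatenated with image features, so the cell sees word + visual information jointly.
B, H, W, D_img, D_word, D_hid = 1, 16, 16, 32, 8, 16
image_feats = torch.randn(B, D_img, H, W)
word_embeddings = torch.randn(5, B, D_word)               # a 5-word referring expression
cell = ConvLSTMCell(D_img + D_word, D_hid)
h = torch.zeros(B, D_hid, H, W)
c = torch.zeros(B, D_hid, H, W)
for w in word_embeddings:
    tiled = w[:, :, None, None].expand(B, D_word, H, W)   # tile the word over space
    h, c = cell(torch.cat([image_feats, tiled], dim=1), (h, c))
mask_logits = nn.Conv2d(D_hid, 1, kernel_size=1)(h)       # per-pixel segmentation logits
print(mask_logits.shape)                                   # torch.Size([1, 1, 16, 16])
```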

    Model-based segmentation and registration of multimodal medical images

    Ph.D. thesis (Doctor of Philosophy).