9 research outputs found

    Deep Learning of Unified Region, Edge, and Contour Models for Automated Image Segmentation

    Full text link
    Image segmentation is a fundamental and challenging problem in computer vision with applications spanning multiple areas, such as medical imaging, remote sensing, and autonomous vehicles. Recently, convolutional neural networks (CNNs) have gained traction in the design of automated segmentation pipelines. Although CNN-based models are adept at learning abstract features from raw image data, their performance is dependent on the availability and size of suitable training datasets. Additionally, these models are often unable to capture the details of object boundaries and generalize poorly to unseen classes. In this thesis, we devise novel methodologies that address these issues and establish robust representation learning frameworks for fully-automatic semantic segmentation in medical imaging and mainstream computer vision. In particular, our contributions include (1) state-of-the-art 2D and 3D image segmentation networks for computer vision and medical image analysis, (2) an end-to-end trainable image segmentation framework that unifies CNNs and active contour models with learnable parameters for fast and robust object delineation, (3) a novel approach for disentangling edge and texture processing in segmentation networks, and (4) a novel few-shot learning model in both supervised settings and semi-supervised settings where synergies between latent and image spaces are leveraged to learn to segment images given limited training data.
    Comment: PhD dissertation, UCLA, 202

    Automated MRI Field of View Prescription from Region of Interest Prediction by Intra-stack Attention Neural Network

    Full text link
    Manual prescription of the field of view (FOV) by MRI technologists is variable and prolongs the scanning process. Often, the FOV is too large or crops critical anatomy. We propose a deep-learning framework, trained under radiologists' supervision, for automating FOV prescription. An intra-stack shared feature extraction network and an attention network are used to process a stack of 2D image inputs to generate output scalars defining the location of a rectangular region of interest (ROI). The attention mechanism is used to make the model focus on the small number of informative slices in a stack. Then the smallest FOV that keeps the network-predicted ROI free of aliasing is calculated by an algebraic operation derived from MR sampling theory. We retrospectively collected 595 cases between February 2018 and February 2022. The framework's performance is examined quantitatively with intersection over union (IoU) and pixel error on position, and qualitatively with a reader study. We use the t-test to compare quantitative results from all models and a radiologist. The proposed model achieves an average IoU of 0.867 and an average ROI position error of 9.06 out of 512 pixels on 80 test cases, significantly better (P<0.05) than two baseline models and not significantly different from a radiologist (P>0.12). Finally, the FOV given by the proposed framework achieves an acceptance rate of 92% from an experienced radiologist.
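
    The slice-level attention described above can be summarized in a short sketch. The following is a minimal, hypothetical PyTorch illustration (not the authors' code): a shared encoder embeds each 2D slice, an attention network weights the slices, and a small head regresses four normalized ROI coordinates.

```python
# Hypothetical sketch of intra-stack attention for ROI regression (not the
# authors' code): a shared encoder embeds each 2D slice, an attention network
# weights the informative slices, and a head regresses a normalized ROI box.
import torch
import torch.nn as nn

class IntraStackROIRegressor(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # Shared per-slice feature extractor (kept deliberately small here).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.attn = nn.Linear(feat_dim, 1)   # one attention score per slice
        self.head = nn.Linear(feat_dim, 4)   # (x_min, y_min, x_max, y_max)

    def forward(self, stack):                # stack: (num_slices, 1, H, W)
        feats = self.encoder(stack)          # (num_slices, feat_dim)
        weights = torch.softmax(self.attn(feats), dim=0)  # focus on informative slices
        pooled = (weights * feats).sum(dim=0)              # stack-level feature
        return torch.sigmoid(self.head(pooled))            # ROI box in [0, 1]

roi = IntraStackROIRegressor()(torch.randn(24, 1, 128, 128))  # a 24-slice stack
print(roi)  # four normalized scalars defining the rectangular ROI
```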

    Kecerdasan Buatan dalam Teknologi Kedokteran: Survey Paper (Artificial Intelligence in Medical Technology: A Survey Paper)

    Get PDF
    This paper provides an overview of the application of artificial intelligence in medicine, in particular for decision making and classification in diagnostics based on biomedical images. Several artificial intelligence (AI) technologies have been shown to optimize the classification of biomedical images. This study collects representative works that show how AI is used to solve problems in diagnostics. It also identifies the AI methods most often used to solve diagnostic problems, such as artificial neural networks, support vector machines, decision trees, and particle swarm optimization. Diagnostic problems that can be solved with these methods include the analysis of brain tumors in MRI and of breast cancer. Based on the survey the authors conducted, the most effective and efficient method for diagnosis in medicine is the CNN, although CNNs require a fairly large amount of data for classification.
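
    As a minimal illustration of the kind of CNN classifier the survey highlights, a small PyTorch network for grayscale biomedical image classification might look as follows; the architecture and input sizes are illustrative assumptions, not a model from any of the surveyed works.

```python
# Minimal sketch of the kind of CNN classifier the survey highlights; the
# architecture and input sizes below are illustrative assumptions only.
import torch
import torch.nn as nn

class SmallBiomedicalCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes),
        )

    def forward(self, x):          # x: (batch, 1, H, W), e.g. grayscale MRI slices
        return self.classifier(self.features(x))

logits = SmallBiomedicalCNN()(torch.randn(4, 1, 224, 224))
print(logits.shape)  # torch.Size([4, 2]) -- e.g. tumor vs. no tumor
```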

    Optimizing Inference Distribution for Efficient Kidney Tumor Segmentation Using a UNet-PWP Deep-Learning Model with XAI on CT Scan Images

    Get PDF
    Kidney tumors represent a significant medical challenge, characterized by their often-asymptomatic nature and the need for early detection to facilitate timely and effective intervention. Although neural networks have shown great promise in disease prediction, their computational demands have limited their practicality in clinical settings. This study introduces a novel methodology, the UNet-PWP architecture, tailored explicitly for kidney tumor segmentation and designed to optimize resource utilization and overcome computational complexity constraints. A key novelty in our approach is the application of adaptive partitioning, which deconstructs the intricate UNet architecture into smaller submodels. This partitioning strategy reduces computational requirements and enhances the model’s efficiency in processing kidney tumor images. Additionally, we augment the UNet’s depth by incorporating pre-trained weights, thereby significantly boosting its capacity to handle intricate and detailed segmentation tasks. Furthermore, we employ weight-pruning techniques to eliminate redundant zero-weighted parameters, further streamlining the UNet-PWP model without compromising its performance. To rigorously assess the effectiveness of our proposed UNet-PWP model, we conducted a comparative evaluation alongside the DeepLab V3+ model, both trained on the “KiTS 19, 21, and 23” kidney tumor datasets. Our results are promising, with the UNet-PWP model achieving an exceptional accuracy rate of 97.01% on both the training and test datasets, surpassing the DeepLab V3+ model in performance. Furthermore, to ensure that our model’s results are easily understandable and explainable, we included a fusion of the attention and Grad-CAM XAI methods. This approach provides valuable insights into the decision-making process of our model and the regions of interest that affect its predictions. In the medical field, this interpretability is crucial for healthcare professionals to trust and comprehend the model’s reasoning.
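
    One common way to realize the weight-pruning step mentioned above is magnitude-based pruning; the sketch below is an illustrative assumption about how such a step could look (not the authors' procedure), using PyTorch's built-in pruning utilities on a single convolutional layer.

```python
# Illustrative sketch of magnitude-based weight pruning (an assumption about
# how the pruning step could be realized, not the authors' procedure), using
# PyTorch's built-in pruning utilities on a single convolutional layer.
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)   # one UNet-style block

# Zero out the 30% of weights with the smallest L1 magnitude (ratio is hypothetical).
prune.l1_unstructured(conv, name="weight", amount=0.3)

# Make the pruning permanent: fold the mask into the weight tensor so the
# layer carries a single (sparse) weight parameter at inference time.
prune.remove(conv, "weight")

sparsity = (conv.weight == 0).float().mean().item()
print(f"fraction of zeroed weights: {sparsity:.2f}")   # approximately 0.30
```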

    Automatic COVID-19 lung infected region segmentation and measurement using CT-scans images

    Full text link
    History shows that an infectious disease such as COVID-19 can stun the world quickly, causing massive losses to health and having a profound impact, in terms of both safety and the economy, on the lives of billions of people. For controlling the COVID-19 pandemic, the best strategy is to provide early intervention to stop the spread of the disease. In general, Computed Tomography (CT) is used to detect tumors, pneumonia, tuberculosis, emphysema, and other lung or pleural (the membrane covering the lungs) diseases. Disadvantages of the CT imaging system are inferior soft-tissue contrast compared to MRI and radiation exposure, since it is X-ray based. Lung CT image segmentation is a necessary initial step for lung image analysis. The main challenges of segmentation algorithms are exacerbated by intensity inhomogeneity, the presence of artifacts, and the closeness in gray level of different soft tissues. The goal of this paper is to design and evaluate an automatic tool for COVID-19 lung infection segmentation and measurement using chest CT images. Extensive computer simulations show better efficiency and flexibility of this end-to-end learning approach with image enhancement on CT image segmentation compared to state-of-the-art segmentation approaches, namely GraphCut, Medical Image Segmentation (MIS), and Watershed. Experiments were performed on the COVID-CT-Dataset, containing 275 CT scans that are positive for COVID-19, and on new data acquired from the EL-BAYANE center for Radiology and Medical Imaging. The means of the accuracy, sensitivity, F-measure, precision, MCC, Dice, Jaccard, and specificity are 0.98, 0.73, 0.71, 0.73, 0.71, 0.71, 0.57, and 0.99 respectively, which is better than the methods mentioned above. The achieved results prove that the proposed approach is more robust, accurate, and straightforward.
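
    For reference, the overlap and classification metrics reported above can be computed from binary masks as in the following minimal NumPy sketch; it follows the standard definitions rather than the authors' evaluation code.

```python
# Minimal NumPy sketch of the reported metrics, assuming binary ground-truth
# and predicted masks; standard definitions, not the paper's evaluation code.
import numpy as np

def segmentation_metrics(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = float(np.sum(pred & gt))     # true positives
    tn = float(np.sum(~pred & ~gt))   # true negatives
    fp = float(np.sum(pred & ~gt))    # false positives
    fn = float(np.sum(~pred & gt))    # false negatives
    eps = 1e-8                        # guards against division by zero
    precision = tp / (tp + fp + eps)
    sensitivity = tp / (tp + fn + eps)              # recall
    specificity = tn / (tn + fp + eps)
    accuracy = (tp + tn) / (tp + tn + fp + fn + eps)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity + eps)
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    jaccard = tp / (tp + fp + fn + eps)             # intersection over union
    mcc = (tp * tn - fp * fn) / (
        np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) + eps)
    return dict(accuracy=accuracy, sensitivity=sensitivity, f_measure=f_measure,
                precision=precision, mcc=mcc, dice=dice, jaccard=jaccard,
                specificity=specificity)

# Example with random masks; real use would pass the model output and the label mask.
print(segmentation_metrics(np.random.rand(512, 512) > 0.5,
                           np.random.rand(512, 512) > 0.5))
```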

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    Get PDF
    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model deals with face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of non-verbal social signals in addition to the spoken messages. Social signals have to be interpreted through a proper recognition phase that considers visual and audio information. The Brunswick model allows the quality of the interaction to be evaluated quantitatively, using statistical tools that measure how effective the recognition phase is. In this paper we cast this theory into the case in which one of the interactants is a robot; here, the recognition phases performed by the robot and by the human have to be revised with respect to the original model. The model is applied to Berrick, a recent open-source, low-cost robotic head platform, where gaze is the social signal to be considered.