
    SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation

    Automated medical image segmentation is becoming increasingly crucial to modern clinical practice, driven by the growing demand for precise diagnosis, the push towards personalized treatment plans, and the advancements in machine learning algorithms, especially the incorporation of deep learning methods. While convolutional neural networks (CNNs) have been prevalent among these methods, the remarkable potential of Transformer-based models for computer vision tasks is gaining increasing recognition. To harness the advantages of both CNN-based and Transformer-based models, we propose a simple yet effective UNet-Transformer (seUNet-Trans) model for medical image segmentation. In our approach, the UNet model is designed as a feature extractor to generate multiple feature maps from the input images; the maps are then propagated into a bridge layer, which sequentially connects the UNet and the Transformer. In this stage, we adopt a pixel-level embedding technique without position embedding vectors, aiming to make the model more efficient. Moreover, we apply spatial-reduction attention in the Transformer to reduce the computational and memory overhead. By leveraging the UNet architecture and the self-attention mechanism, our model not only preserves both local and global context information but also captures long-range dependencies between input elements. The proposed model is evaluated extensively on seven medical image segmentation datasets, including polyp segmentation, to demonstrate its efficacy. Comparison with several state-of-the-art segmentation models on these datasets shows the superior performance of our proposed seUNet-Trans network.
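The saving from spatial-reduction attention can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes the commonly used formulation in which keys and values are downsampled by a ratio `reduction` along each spatial axis while queries stay at full resolution. The function name `attention_pairs` and its parameters are hypothetical.

```python
def attention_pairs(height: int, width: int, reduction: int = 1) -> int:
    """Count query-key pairs for one attention head over a height x width map.

    Assumption (not stated in the abstract): keys and values are downsampled
    by `reduction` along each spatial axis; queries keep full resolution.
    """
    if reduction < 1 or height % reduction or width % reduction:
        raise ValueError("reduction must be >= 1 and divide both height and width")
    queries = height * width
    keys = (height // reduction) * (width // reduction)
    return queries * keys


full = attention_pairs(32, 32)                  # plain self-attention
reduced = attention_pairs(32, 32, reduction=4)  # spatially reduced keys/values
print(full, reduced, full // reduced)
```

Under this assumption the size of the attention matrix, and hence its memory footprint, shrinks by a factor of `reduction ** 2`.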

    Learning Morphological Feature Perturbations for Calibrated Semi-Supervised Segmentation

    We propose MisMatch, a novel consistency-driven semi-supervised segmentation framework which produces predictions that are invariant to learnt feature perturbations. MisMatch consists of an encoder and two decoder heads. One decoder learns positive attention to the foreground regions of interest (RoI) on unlabelled images, thereby generating dilated features. The other decoder learns negative attention to the foreground on the same unlabelled images, thereby generating eroded features. We then apply a consistency regularisation on the paired predictions. MisMatch outperforms state-of-the-art semi-supervised methods on a CT-based pulmonary vessel segmentation task and an MRI-based brain tumour segmentation task. In addition, we show that the effectiveness of MisMatch comes from better model calibration than its supervised learning counterpart.
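A consistency regularisation term on paired predictions can be sketched as follows. The abstract does not state which distance is used, so the mean squared difference here is an assumption, and `consistency_loss` is a hypothetical name. The inputs stand in for flattened per-pixel foreground probabilities from the two decoders on the same unlabelled image.

```python
from typing import Sequence


def consistency_loss(pred_a: Sequence[float], pred_b: Sequence[float]) -> float:
    """Mean squared difference between two per-pixel probability maps.

    Assumption: the paired predictions are compared with a squared-error
    penalty; the source only says that a consistency regularisation is applied.
    """
    if len(pred_a) != len(pred_b) or len(pred_a) == 0:
        raise ValueError("predictions must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Two toy predictions for the same four pixels.
dilated_head = [0.9, 0.8, 0.3, 0.1]
eroded_head = [0.7, 0.8, 0.1, 0.1]
print(consistency_loss(dilated_head, eroded_head))
```

The loss is zero only when both heads agree on every pixel, which is what pushes the paired predictions towards invariance.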

    A goal-driven unsupervised image segmentation method combining graph-based processing and Markov random fields

    Image segmentation is the process of partitioning a digital image into a set of homogeneous regions (according to some homogeneity criterion) to facilitate a subsequent higher-level analysis. In this context, the present paper proposes an unsupervised and graph-based method of image segmentation, which is driven by an application goal, namely, the generation of image segments associated with a user-defined and application-specific goal. A graph, together with a random grid of source elements, is defined on top of the input image. From each source satisfying a goal-driven predicate, called a seed, a propagation algorithm assigns a cost to each pixel on the basis of similarity and topological connectivity, measuring the degree of association with the reference seed. Then, the set of most significant regions is automatically extracted and used to estimate a statistical model for each region. Finally, the segmentation problem is expressed in a Bayesian framework in terms of probabilistic Markov random field (MRF) graphical modeling. An ad hoc energy function is defined based on parametric models, a seed-specific spatial feature, a background-specific potential, and local-contextual information. This energy function is minimized through graph cuts and, more specifically, the alpha-beta swap algorithm, yielding the final goal-driven segmentation based on the maximum a posteriori (MAP) decision rule. The proposed method does not require deep a priori knowledge (e.g., labelled datasets), as it only requires the choice of a goal-driven predicate and a suitable parametric model for the data. In the experimental validation with both magnetic resonance (MR) and synthetic aperture radar (SAR) images, the method demonstrates robustness, versatility, and applicability to different domains, thus allowing for further analyses guided by the generated product.

    CKD-TransBTS: Clinical Knowledge-Driven Hybrid Transformer with Modality-Correlated Cross-Attention for Brain Tumor Segmentation

    Brain tumor segmentation (BTS) in magnetic resonance imaging (MRI) is crucial for brain tumor diagnosis, cancer management and research purposes. With the great success of the ten-year BraTS challenges as well as the advances of CNN and Transformer algorithms, many outstanding BTS models have been proposed to tackle the difficulties of BTS in different technical aspects. However, existing studies hardly consider how to fuse the multi-modality images in a reasonable manner. In this paper, we leverage the clinical knowledge of how radiologists diagnose brain tumors from multiple MRI modalities and propose a clinical knowledge-driven brain tumor segmentation model, called CKD-TransBTS. Instead of directly concatenating all the modalities, we re-organize the input modalities by separating them into two groups according to the imaging principle of MRI. A dual-branch hybrid encoder with the proposed modality-correlated cross-attention block (MCCA) is designed to extract the multi-modality image features. The proposed model inherits the strengths of both Transformer and CNN, with the local feature representation ability for precise lesion boundaries and long-range feature extraction for 3D volumetric images. To bridge the gap between Transformer and CNN features, we propose a Trans&CNN Feature Calibration block (TCFC) in the decoder. We compare the proposed model with five CNN-based models and six transformer-based models on the BraTS 2021 challenge dataset. Extensive experiments demonstrate that the proposed model achieves state-of-the-art brain tumor segmentation performance compared with all the competitors.

    The Visvalingam algorithm: metrics, measures and heuristics

    This paper provides the background necessary for a clear understanding of forthcoming papers relating to the Visvalingam algorithm for line generalisation, for example on the testing and usage of its implementations. It distinguishes the algorithm from implementation-specific issues to explain why it is possible to get inconsistent but equally valid output from different implementations. By tracing relevant developments within the now-disbanded Cartographic Information Systems Research Group (CISRG) of the University of Hull, it explains a) why a partial metric-driven implementation was, and still is, sufficient for many projects but not for others; b) why the Effective Area (EA) is a measure derived from a metric; c) why this measure (EA) may serve as a heuristic indicator for in-line feature segmentation and model-based generalisation; and d) how metrics may be combined to change the order of point elimination. The issues discussed in this paper also apply to the use of other metrics. It is hoped that the background and guidance provided in this paper will enable others to participate in further research based on the algorithm.
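For readers unfamiliar with the algorithm, here is a minimal sketch of effective-area point elimination, assuming the metric is the area of the triangle formed by a point and its two current neighbours. It is a simple quadratic-time illustration rather than the paper's implementation; the function names and the `min_area` stopping threshold are hypothetical, and ties are broken by taking the first smallest point, which is one of the implementation choices that can make outputs differ.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Area of triangle abc, computed from the 2D cross product."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def simplify_by_area(points: Sequence[Point], min_area: float) -> List[Point]:
    """Repeatedly drop the interior point with the smallest triangle area.

    Stops when every remaining interior point has an area of at least
    `min_area`, or when only the two end points are left.
    """
    pts = list(points)
    while len(pts) > 2:
        # Areas are recomputed after each removal because neighbours change.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(areas)
        if smallest >= min_area:
            break
        # list.index returns the first match: an arbitrary tie-breaking rule.
        del pts[areas.index(smallest) + 1]
    return pts


line = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
print(simplify_by_area(line, min_area=0.5))
```

The end points are never removed, and the near-collinear point `(1.0, 0.1)` is the first to go because its triangle is the smallest.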

    Data-Driven Shape Analysis and Processing

    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing. Comment: 10 pages, 19 figures.