3 research outputs found

    Segmentation of field grape bunches via an improved pyramid scene parsing network

    With the continuous expansion of wine grape planting areas, the mechanization and intelligence of grape harvesting have gradually become the future development trend. To guide the picking robot to pick grapes more efficiently in the vineyard, this study proposed a grape bunch segmentation method based on the Pyramid Scene Parsing Network (PSPNet) deep semantic segmentation network for different varieties of grapes in natural field environments. To this end, the Convolutional Block Attention Module (CBAM) attention mechanism and atrous convolution were first embedded in the backbone feature extraction network of the PSPNet model to improve its feature extraction capability. The proposed model also improved the PSPNet semantic segmentation model by fusing multiple feature layers (with more contextual information) extracted by the backbone network. The improved PSPNet was compared against the original PSPNet on a newly collected grape image dataset: the improved model achieved an Intersection-over-Union (IoU) of 87.42% and a Pixel Accuracy (PA) of 95.73%, an improvement of 4.36% and 9.95%, respectively, over the original PSPNet model. The improved PSPNet was also compared against the state-of-the-art DeepLab-V3+ and U-Net in terms of IoU, PA, computational efficiency, and robustness, and showed promising performance. It is concluded that the improved PSPNet can quickly and accurately segment grape bunches of different varieties in natural field environments, which provides a technical basis for intelligent harvesting by grape-picking robots.
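    The two metrics reported above can be stated concretely. A minimal sketch (not the paper's code) of IoU and Pixel Accuracy for a binary grape-bunch mask, using toy flattened masks; the mask values and shapes here are illustrative assumptions:

```python
def iou_and_pa(pred, truth):
    """Return (IoU, PA) for two equal-length flat binary masks.

    1 = grape bunch, 0 = background.
    IoU = |pred ∩ truth| / |pred ∪ truth| over positive pixels;
    PA  = fraction of pixels classified correctly.
    """
    assert len(pred) == len(truth)
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, truth) if p == 1 or t == 1)
    correct = sum(1 for p, t in zip(pred, truth) if p == t)
    iou = intersection / union if union else 1.0
    pa = correct / len(truth)
    return iou, pa

# Toy 4x4 masks flattened to length-16 lists.
pred  = [0,1,1,0, 0,1,1,0, 0,1,1,1, 0,0,0,0]
truth = [0,1,1,0, 0,1,1,1, 0,1,1,0, 0,0,0,0]
iou, pa = iou_and_pa(pred, truth)  # → (0.75, 0.875)
```

    In a real evaluation the masks would be full-resolution segmentation maps, and multi-class IoU is usually averaged per class; the single-class version above is the simplest case.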

    A transformer-based approach empowered by a self-attention technique for semantic segmentation in remote sensing

    Semantic segmentation of Remote Sensing (RS) images involves the classification of each pixel in a satellite image into distinct and non-overlapping regions or segments. This task is crucial in various domains, including land cover classification, autonomous driving, and scene understanding. While deep learning has shown promising results, there is limited research that specifically addresses the challenge of processing fine details in RS images while also considering the high computational demands. To tackle this issue, we propose a novel approach that combines convolutional and transformer architectures. Our design incorporates convolutional layers with a small receptive field to generate fine-grained feature maps for small objects in very high-resolution images, while transformer blocks are utilized to capture contextual information from the input. By leveraging convolution and self-attention in this manner, we reduce the need for extensive downsampling and enable the network to work with full-resolution features, which is particularly beneficial for handling small objects. Additionally, our approach eliminates the requirement for vast datasets, which is often necessary for purely transformer-based networks. In our experimental results, we demonstrate the effectiveness of our method in generating local and contextual features using convolutional and transformer layers, respectively. Our approach achieves a mean Dice score of 80.41%, outperforming other well-known techniques such as U-Net, the Fully Convolutional Network (FCN), the Pyramid Scene Parsing Network (PSPNet), and the recent Convolutional vision Transformer (CvT) model, which achieved mean Dice scores of 78.57%, 74.57%, 73.45%, and 62.97%, respectively, under the same training conditions and using the same training dataset.
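    The mean Dice score used for the comparison above has a standard definition: per class c, Dice(c) = 2|P_c ∩ T_c| / (|P_c| + |T_c|), averaged over the classes present. A minimal sketch (not the paper's code), over flat label maps with made-up toy labels:

```python
def mean_dice(pred, truth, num_classes):
    """Mean Dice over classes present in pred or truth.

    pred/truth are equal-length flat lists of integer class labels.
    Classes absent from both maps are skipped rather than counted as 1.0.
    """
    assert len(pred) == len(truth)
    scores = []
    for c in range(num_classes):
        p = [x == c for x in pred]
        t = [x == c for x in truth]
        inter = sum(1 for a, b in zip(p, t) if a and b)
        denom = sum(p) + sum(t)
        if denom == 0:  # class absent in both maps
            continue
        scores.append(2 * inter / denom)
    return sum(scores) / len(scores)

# Toy 3-class label maps.
pred  = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, 1, 2, 0]
score = mean_dice(pred, truth, 3)  # (0.5 + 0.8 + 2/3) / 3 ≈ 0.6556
```

    How absent classes are handled (skipped vs. scored 1.0) varies between implementations and can shift reported means, so it is worth checking when comparing numbers across papers.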

    Deep-learning-driven quantification of interstitial fibrosis in digitized kidney biopsies

    Interstitial fibrosis and tubular atrophy (IFTA) on a renal biopsy are strong indicators of disease chronicity and prognosis. Techniques that are typically used for IFTA grading remain manual, leading to variability among pathologists. Accurate IFTA estimation using computational techniques can reduce this variability and provide quantitative assessment. Using trichrome-stained whole-slide images (WSIs) processed from human renal biopsies, we developed a deep-learning framework that captured finer pathologic structures at high resolution and overall context at the WSI level to predict IFTA grade. WSIs (n = 67) were obtained from The Ohio State University Wexner Medical Center. Five nephropathologists independently reviewed them and provided fibrosis scores that were converted to IFTA grades: ≤10% (none or minimal), 11% to 25% (mild), 26% to 50% (moderate), and >50% (severe). The model was developed by associating the WSIs with the IFTA grade determined by majority voting (reference estimate). Model performance was evaluated on WSIs (n = 28) obtained from the Kidney Precision Medicine Project. There was good agreement on the IFTA grading between the pathologists and the reference estimate (κ = 0.622 ± 0.071). The accuracy of the deep-learning model was 71.8% ± 5.3% on The Ohio State University Wexner Medical Center and 65.0% ± 4.2% on Kidney Precision Medicine Project data sets. Our approach to analyzing microscopic- and WSI-level changes in renal biopsies attempts to mimic the pathologist and provides a regional and contextual estimation of IFTA. Such methods can assist clinicopathologic diagnosis.
    Funding: U01 DK085660 - NIDDK NIH HHS; RF1 AG062109 - NIA NIH HHS; R21 CA253498 - NCI NIH HHS; R21 DK119751 - NIDDK NIH HHS; R01 HL132325 - NHLBI NIH HHS; UL1 TR001430 - NCATS NIH HHS; R56 AG062109 - NIA NIH HHS; R21 DK119740 - NIDDK NIH HHS
    https://www.medrxiv.org/content/10.1101/2021.01.03.21249179v1.full.pd
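    The label pipeline described in the abstract (fibrosis percentage → IFTA grade, then majority vote across the five raters to form the reference estimate) can be sketched directly. A minimal illustration using the grade bands given above; the example scores are hypothetical, not study data:

```python
from collections import Counter

def ifta_grade(fibrosis_pct):
    """Map a fibrosis percentage to the IFTA grade bands from the study:
    <=10% none/minimal, 11-25% mild, 26-50% moderate, >50% severe."""
    if fibrosis_pct <= 10:
        return "none_or_minimal"
    if fibrosis_pct <= 25:
        return "mild"
    if fibrosis_pct <= 50:
        return "moderate"
    return "severe"

def reference_estimate(scores):
    """Majority vote over per-rater fibrosis scores for one biopsy."""
    grades = [ifta_grade(s) for s in scores]
    return Counter(grades).most_common(1)[0][0]

# Five hypothetical fibrosis scores for one whole-slide image.
scores = [15, 20, 30, 22, 12]
estimate = reference_estimate(scores)  # → "mild" (4 of 5 raters)
```

    Note that `Counter.most_common` breaks ties by first-encountered order; a real study protocol would need an explicit tie-breaking rule (e.g., deferring to a senior pathologist), which the abstract does not specify.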