9 research outputs found

    Structure-Consistent Weakly Supervised Salient Object Detection with Local Saliency Coherence

    Full text link
    Sparse labels have been attracting much attention in recent years. However, the performance gap between weakly supervised and fully supervised salient object detection methods remains large, and most previous weakly supervised works adopt complex training schemes with many bells and whistles. In this work, we propose a one-round end-to-end training approach for weakly supervised salient object detection via scribble annotations, without pre/post-processing operations or extra supervision data. Since scribble labels fail to offer detailed salient regions, we propose a local coherence loss to propagate the labels to unlabeled regions based on image features and pixel distance, so as to predict integral salient regions with complete object structures. We design a saliency structure consistency loss as a self-consistency mechanism to ensure that consistent saliency maps are predicted when different scales of the same image are taken as input; this can be viewed as a regularization technique that enhances the model's generalization ability. Additionally, we design an aggregation module (AGGM) to better integrate high-level features, low-level features and global context information for the decoder. Extensive experiments show that our method achieves a new state-of-the-art performance on six benchmarks (e.g., for the ECSSD dataset: $F_\beta = 0.8995$, $E_\xi = 0.9079$ and $MAE = 0.0489$), with an average gain of 4.60% for F-measure, 2.05% for E-measure and 1.88% for MAE over the previous best method on this task. Source code is available at http://github.com/siyueyu/SCWSSOD. (Comment: Accepted by AAAI 2021.)
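
    To illustrate the scale-consistency idea described above, here is a minimal PyTorch sketch; the function name, the 0.75 rescale factor, and the L1 distance are assumptions for illustration, not the authors' exact formulation.

        import torch.nn.functional as F

        def scale_consistency_loss(model, image, scale=0.75):
            # Predict a saliency map at the original resolution.
            pred_full = model(image)                                   # (B, 1, H, W)
            # Predict again on a rescaled copy of the same image.
            small = F.interpolate(image, scale_factor=scale,
                                  mode='bilinear', align_corners=False)
            pred_small = model(small)
            # Bring the small-scale prediction back to full resolution and
            # penalize any disagreement between the two maps.
            pred_small_up = F.interpolate(pred_small, size=pred_full.shape[-2:],
                                          mode='bilinear', align_corners=False)
            return (pred_full - pred_small_up).abs().mean()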

    On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances and Million-AID

    Get PDF
    The past years have witnessed great progress in remote sensing (RS) image interpretation and its wide applications. With RS images becoming more accessible than ever before, there is an increasing demand for their automatic interpretation. In this context, benchmark datasets serve as essential prerequisites for developing and testing intelligent interpretation algorithms. After reviewing existing benchmark datasets in the RS image interpretation research community, this article discusses how to efficiently prepare a suitable benchmark dataset for RS image interpretation. Specifically, we first analyze the current challenges of developing intelligent algorithms for RS image interpretation through bibliometric investigations. We then present general guidance on creating benchmark datasets in an efficient manner. Following this guidance, we also provide an example of building an RS image dataset, i.e., Million-AID, a new large-scale benchmark dataset containing a million instances for RS image scene classification. Several challenges and perspectives in RS image annotation are finally discussed to facilitate research on benchmark dataset construction. We hope this paper will provide the RS community with an overall perspective on constructing large-scale and practical image datasets for further research, especially data-driven research.

    NuClick: a deep learning framework for interactive segmentation of microscopic images

    Get PDF
    Object segmentation is an important step in the workflow of computational pathology. Deep learning based models generally require a large amount of labeled data for precise and reliable prediction. However, collecting labeled data is expensive because it often requires expert knowledge, particularly in the medical imaging domain, where labels are the result of a time-consuming analysis made by one or more human experts. As nuclei, cells and glands are fundamental objects for downstream analysis in computational pathology/cytology, in this paper we propose NuClick, a CNN-based approach that speeds up the collection of annotations for these objects while requiring minimal interaction from the annotator. We show that for nuclei and cells in histology and cytology images, one click inside each object is enough for NuClick to yield a precise annotation. For multicellular structures such as glands, we propose a novel approach that provides NuClick with a squiggle as a guiding signal, enabling it to segment the glandular boundaries. These supervisory signals are fed to the network as auxiliary inputs along with the RGB channels. With detailed experiments, we show that NuClick is applicable to a wide range of object scales, robust against variations in the user input, adaptable to new domains, and delivers reliable annotations. An instance segmentation model trained on masks generated by NuClick achieved the first rank in the LYON19 challenge. As exemplar outputs of our framework, we are releasing two datasets: 1) a dataset of lymphocyte annotations within IHC images, and 2) a dataset of segmented WBCs in blood smear images.
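
    The auxiliary-input idea can be sketched as follows; the Gaussian click encoding, function names, and four-channel layout are assumptions for illustration, not necessarily NuClick's exact input format.

        import numpy as np

        def click_guide(h, w, click_yx, sigma=5.0):
            # Encode a single user click as a small Gaussian blob.
            ys, xs = np.mgrid[0:h, 0:w]
            cy, cx = click_yx
            return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

        def build_input(rgb, click_yx):
            # Stack the guiding signal with the RGB channels, giving the
            # network an (H, W, 4) input instead of the usual (H, W, 3).
            h, w, _ = rgb.shape
            return np.dstack([rgb, click_guide(h, w, click_yx)])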

    Working with scarce annotations in computational pathology

    Get PDF
    Computational pathology is the study of algorithms and approaches that facilitate the process of diagnosis and prognosis, primarily from digital pathology images. The automated methods developed in computational pathology reduce inter- and intra-observer variability in diagnosis and make the workflow of pathologists more efficient. Digital slide scanners have enabled the digitisation of tissue slides and the generation of whole slide images (WSIs), allowing them to be viewed on a computer screen rather than through a microscope. Digital pathology images present an opportunity for developing new algorithms to automatically analyse tissue characteristics. In this thesis, we first focus on the development of automated approaches for the detection and segmentation of nuclei. For nuclear detection, each nucleus is modelled as a Gaussian whose mean determines the nucleus centroid, and we investigate the application of mixture density networks for detecting nuclei in histology images. We also propose a convolutional neural network (CNN) for instance segmentation of nuclei. The CNN uses the spatial information of nuclei as the target to separate clustered nuclei: pixels of each nucleus are replaced with the spatial information of that nucleus. The CNN also utilises dense blocks to reduce the number of parameters, and positional information at different layers of the network to better learn the spatial information embedded in the ground truth. Two chapters of this thesis are dedicated to dealing with the lack of annotations in computational pathology. To this end, we propose a method named NuClick to generate high-quality segmentations for glands and nuclei. NuClick is an interactive CNN-based method that requires minimal user interaction for collecting annotations. We show that one click inside a nucleus can be enough to delineate its boundaries; moreover, for glands, which are larger and more complex objects, a squiggle can extract their precise outline. In another chapter, we propose Self-Path, a method for semi-supervised learning and domain alignment. The main contribution of this chapter is proposing self-supervised tasks that are specific to the histology domain and can be extremely helpful when there are not enough annotations for training deep models. One of these self-supervised tasks, predicting the magnification puzzle, is the first domain-specific self-supervised task shown to be helpful for domain alignment and semi-supervised learning for the classification of histology images. Nuclear localisation allows further exploration of digital biomarkers and can serve as a fundamental route to predicting patient outcome. In chapter 6, focusing on the challenge of weak labels for whole slide images and utilising the nuclear localisation techniques above, we explore morphological features from patches selected by the model and observe that these features are associated with patient survival.
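
    As a hedged sketch of the detection formulation, each centroid can be rendered as a 2D Gaussian target map; the sigma value, clipping, and function name are illustrative assumptions, not the thesis' exact pipeline.

        import numpy as np

        def centroid_target(h, w, centroids, sigma=3.0):
            # Render every nucleus centroid as a 2D Gaussian whose mean is
            # the centroid itself, matching the detection formulation above.
            ys, xs = np.mgrid[0:h, 0:w]
            target = np.zeros((h, w), dtype=np.float32)
            for cy, cx in centroids:
                target += np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
            return np.clip(target, 0.0, 1.0)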

    Tap and Shoot Segmentation

    No full text
    We present a new segmentation method that leverages latent photographic information available at the moment a picture is taken. Photography on a portable device is often done by tapping to focus before shooting the picture. This tap-and-shoot interaction not only specifies the region of interest but also yields useful focus/defocus cues for image segmentation. However, most previous interactive segmentation methods address image segmentation in a post-processing scenario, without considering the action of taking pictures. We propose a learning-based approach to this new tap-and-shoot scenario of interactive segmentation. Experimental results on various datasets show that, by training a deep convolutional network to integrate the selection and focus/defocus cues, our method achieves higher segmentation accuracy than existing interactive segmentation methods.
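
    A minimal sketch of how the two cues might be encoded as extra input channels for such a network; the distance-based tap encoding and the gradient-magnitude sharpness proxy are illustrative assumptions, not the paper's exact design.

        import numpy as np

        def tap_cue(h, w, tap_yx):
            # Normalised distance from the tapped point: encodes the
            # user-selected region of interest.
            ys, xs = np.mgrid[0:h, 0:w]
            d = np.sqrt((ys - tap_yx[0]) ** 2 + (xs - tap_yx[1]) ** 2)
            return d / (d.max() + 1e-8)

        def focus_cue(gray):
            # Crude focus/defocus proxy: local gradient magnitude, high in
            # sharp (in-focus) regions and low in defocused ones.
            gy, gx = np.gradient(gray.astype(np.float32))
            mag = np.sqrt(gx ** 2 + gy ** 2)
            return mag / (mag.max() + 1e-8)

        def build_input(rgb, gray, tap_yx):
            # Stack RGB with both cues: an (H, W, 5) network input.
            h, w, _ = rgb.shape
            return np.dstack([rgb, tap_cue(h, w, tap_yx), focus_cue(gray)])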