Structure-Consistent Weakly Supervised Salient Object Detection with Local Saliency Coherence
Sparse labels have been attracting much attention in recent years. However,
the performance gap between weakly supervised and fully supervised salient
object detection methods is huge, and most previous weakly supervised works
adopt complex training methods with many bells and whistles. In this work, we
propose a one-round end-to-end training approach for weakly supervised salient
object detection via scribble annotations without pre/post-processing
operations or extra supervision data. Since scribble labels fail to offer
detailed salient regions, we propose a local coherence loss to propagate the
labels to unlabeled regions based on image features and pixel distance, so as
to predict integral salient regions with complete object structures. We design
a saliency structure consistency loss as a self-consistent mechanism to ensure
consistent saliency maps are predicted with different scales of the same image
as input, which could be viewed as a regularization technique to enhance the
model generalization ability. Additionally, we design an aggregation module
(AGGM) to better integrate high-level features, low-level features and global
context information for the decoder to aggregate various information. Extensive
experiments show that our method achieves a new state-of-the-art performance on
six benchmarks (e.g. for the ECSSD dataset: $F_\beta = 0.8995$, $E_\xi = 0.9079$
and MAE $= 0.0489$), with an average gain of 4.60% for F-measure, 2.05% for
E-measure and 1.88% for MAE over the previous best method on this task. Source
code is available at http://github.com/siyueyu/SCWSSOD.
Comment: Accepted by AAAI202
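The structure consistency idea described in the abstract above (the saliency map predicted for a rescaled image should match the rescaled saliency map of the original image) can be illustrated with a minimal sketch. The function names, the 2x average-pooling downscale, and the plain mean absolute difference are all assumptions made for illustration; the abstract does not specify the exact form of the loss.

```python
def downscale2x(m):
    """Average-pool a 2D list by a factor of 2 (even height and width assumed)."""
    return [
        [(m[r][c] + m[r][c + 1] + m[r + 1][c] + m[r + 1][c + 1]) / 4.0
         for c in range(0, len(m[0]), 2)]
        for r in range(0, len(m), 2)
    ]


def consistency_loss(pred_full, pred_small):
    """Mean absolute difference between the downscaled full-resolution
    prediction and the prediction made on the downscaled input.
    Hypothetical stand-in for a structure consistency term."""
    ref = downscale2x(pred_full)
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(ref, pred_small)
        for a, b in zip(row_a, row_b)
    )
    return total / (len(ref) * len(ref[0]))


# Saliency map predicted on the full-size image (4x4) ...
pred_full = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# ... and on the half-size image (2x2).
pred_small = [[0.5, 0.0], [0.0, 0.0]]
loss = consistency_loss(pred_full, pred_small)
```

A loss of zero means the two predictions agree after rescaling; a model trained with such a term is penalized whenever its output changes with the input scale.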
On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances and Million-AID
The past years have witnessed great progress on remote sensing (RS) image
interpretation and its wide applications. With RS images becoming more
accessible than ever before, there is an increasing demand for the automatic
interpretation of these images. In this context, the benchmark datasets serve
as essential prerequisites for developing and testing intelligent
interpretation algorithms. After reviewing existing benchmark datasets in the
research community of RS image interpretation, this article discusses the
problem of how to efficiently prepare a suitable benchmark dataset for RS image
interpretation. Specifically, we first analyze the current challenges of
developing intelligent algorithms for RS image interpretation with bibliometric
investigations. We then present general guidance on creating benchmark
datasets efficiently. Following this guidance, we also
provide an example of building an RS image dataset, i.e., Million-AID, a new
large-scale benchmark dataset containing a million instances for RS image scene
classification. Several challenges and perspectives in RS image annotation are
finally discussed to facilitate the research in benchmark dataset construction.
We hope this paper will give the RS community an overall perspective on
constructing large-scale and practical image datasets for further research,
especially data-driven research.
NuClick : a deep learning framework for interactive segmentation of microscopic images
Object segmentation is an important step in the workflow of computational pathology. Deep learning based models generally require large amounts of labeled data for precise and reliable prediction. However, collecting labeled data is expensive because it often requires expert knowledge, particularly in the medical imaging domain, where labels are the result of a time-consuming analysis made by one or more human experts. As nuclei, cells and glands are fundamental objects for downstream analysis in computational pathology/cytology, in this paper we propose NuClick, a CNN-based approach that speeds up the collection of annotations for these objects while requiring minimal interaction from the annotator. We show that for nuclei and cells in histology and cytology images, one click inside each object is enough for NuClick to yield a precise annotation. For multicellular structures such as glands, we propose a novel approach that provides NuClick with a squiggle as a guiding signal, enabling it to segment the glandular boundaries. These supervisory signals are fed to the network as auxiliary inputs along with the RGB channels. With detailed experiments, we show that NuClick is applicable to a wide range of object scales, robust against variations in the user input, adaptable to new domains, and delivers reliable annotations. An instance segmentation model trained on masks generated by NuClick achieved first rank in the LYON19 challenge. As exemplar outputs of our framework, we are releasing two datasets: 1) a dataset of lymphocyte annotations within IHC images, and 2) a dataset of segmented WBCs in blood smear images.
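To illustrate how a click can be fed to a network as an auxiliary input alongside the RGB channels, the following minimal sketch builds a binary click map and appends it as a fourth channel. The helper names, the single-channel encoding, and the (row, column) click format are assumptions for illustration only; the abstract does not state how NuClick encodes its guiding signals.

```python
def click_map(height, width, clicks):
    """Return a height x width map with 1.0 at each clicked (row, col)
    position and 0.0 elsewhere. Out-of-bounds clicks are ignored."""
    m = [[0.0] * width for _ in range(height)]
    for r, c in clicks:
        if 0 <= r < height and 0 <= c < width:
            m[r][c] = 1.0
    return m


def stack_guidance(rgb, guide):
    """Append the guiding-signal value to every RGB pixel, giving a
    4-channel input (R, G, B, guide). Hypothetical input layout."""
    return [
        [tuple(px) + (g,) for px, g in zip(rgb_row, guide_row)]
        for rgb_row, guide_row in zip(rgb, guide)
    ]


# A tiny 2x2 black image with one click at row 0, column 1.
rgb = [[(0, 0, 0), (0, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
guide = click_map(2, 2, [(0, 1)])
net_input = stack_guidance(rgb, guide)
```

A squiggle for a gland could be encoded the same way by passing every pixel along the drawn stroke as a click.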
Working with scarce annotations in computational pathology
Computational pathology is the study of algorithms and approaches that facilitate the process of diagnosis and prognosis, primarily from digital pathology. The automated methods presented in computational pathology decrease inter- and intra-observer variability in diagnosis and make the workflow of pathologists more efficient. Digital slide scanners have enabled the digitization of tissue slides and the generation of whole slide images (WSIs), allowing them to be viewed on a computer screen rather than through a microscope. Digital pathology images present an opportunity for the development of new algorithms to automatically analyse tissue characteristics.
In this thesis, we first focus on the development of automated approaches for detection and segmentation of nuclei. For nuclear detection, each nucleus is modelled as a Gaussian shape whose mean determines the centroid of the nucleus. We investigate the application of mixture density networks for detection of nuclei in histology images.
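The Gaussian representation of nuclei mentioned above can be sketched as follows: each centroid is rendered as an isotropic Gaussian bump, and the peak of the map recovers the centroid. This only illustrates the target representation, not the mixture density network itself; the function names and the fixed `sigma` are assumptions.

```python
import math


def gaussian_map(height, width, centroids, sigma=1.5):
    """Render one isotropic Gaussian per centroid (row, col) and keep the
    per-pixel maximum, so each nucleus peaks at exactly 1.0."""
    return [
        [
            max(
                (math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
                 for cr, cc in centroids),
                default=0.0,
            )
            for c in range(width)
        ]
        for r in range(height)
    ]


def peak(m):
    """Return the (row, col) position of the largest value in the map."""
    positions = ((r, c) for r in range(len(m)) for c in range(len(m[0])))
    return max(positions, key=lambda rc: m[rc[0]][rc[1]])


# One nucleus centred at row 2, column 3 of a 5x5 patch.
density = gaussian_map(5, 5, [(2, 3)])
centre = peak(density)
```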
We also propose a convolutional neural network (CNN) for instance segmentation of nuclei. The CNN uses the spatial information of nuclei as the target to separate clustered nuclei. Pixels of each nucleus are replaced with the spatial information of that nucleus. The CNN also utilises dense blocks to reduce the number of parameters, and positional information at different layers of the network to better learn the spatial information embedded in the ground truth.
Two chapters of this thesis are dedicated to dealing with the lack of annotations in computational pathology. To this end, we propose a method named NuClick to generate high-quality segmentations for glands and nuclei. NuClick is an interactive CNN-based method that requires minimal user interaction for collecting annotations. We show that one click inside a nucleus can be enough to delineate its boundaries. Moreover, for glands, which are larger and more complex objects, a squiggle can extract their precise outline.
In another chapter, we propose Self-Path, a method for semi-supervised learning and domain alignment. The main contribution of this chapter is a set of self-supervised tasks that are specific to the histology domain and can be extremely helpful when there are not enough annotations for training deep models. One of these self-supervised tasks is predicting the magnification puzzle, which is the first domain-specific self-supervised task shown to be helpful for domain alignment and semi-supervised learning for classification of histology images.
Nuclear localization allows further exploration of digital biomarkers and can serve as a fundamental route to predicting patient outcome. In Chapter 6, by focusing on the challenge of weak labels for WSIs and utilising the nuclear localisation techniques, we explore the morphological features of patches that are selected by the model, and we observe that these features are associated with patient survival.
Tap and Shoot Segmentation
We present a new segmentation method that leverages latent photographic information available at the moment of taking pictures. Photography on a portable device is often done by tapping to focus before shooting the picture. This tap-and-shoot interaction not only specifies the region of interest but also yields useful focus/defocus cues for image segmentation. However, most previous interactive segmentation methods address the problem of image segmentation in a post-processing scenario without considering the action of taking pictures. We propose a learning-based approach to this new tap-and-shoot scenario of interactive segmentation. The experimental results on various datasets show that, by training a deep convolutional network to integrate the selection and focus/defocus cues, our method can achieve higher segmentation accuracy than existing interactive segmentation methods.