Harvesting Information from Captions for Weakly Supervised Semantic Segmentation
Since acquiring pixel-wise annotations for training convolutional neural
networks for semantic image segmentation is time-consuming, weakly supervised
approaches that only require class tags have been proposed. In this work, we
propose another form of supervision, namely image captions as they can be found
on the Internet. These captions have two advantages: they do not require
additional curation, as is the case for the clean class tags used by current
weakly supervised approaches, and they provide textual context for the classes
present in an image. To leverage such textual context, we deploy a multi-modal
network that learns a joint embedding of the visual representation of the image
and the textual representation of the caption. The network estimates text
activation maps (TAMs) for class names as well as compound concepts, i.e.
combinations of nouns and their attributes. The TAMs of compound concepts
describing classes of interest substantially improve the quality of the
estimated class activation maps which are then used to train a network for
semantic segmentation. We evaluate our method on the COCO dataset, where it
achieves state-of-the-art results for weakly supervised image segmentation.
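The text activation maps described above can be read as a similarity between each spatial image feature and a text embedding. A minimal NumPy sketch under that reading, assuming a precomputed `(H, W, D)` feature map and a `(D,)` embedding for a class name or compound concept (the names and the normalization are illustrative, not the paper's implementation):

```python
import numpy as np

def text_activation_map(feat_map, text_emb):
    """Sketch of a text activation map (TAM): cosine similarity between
    each spatial feature vector and a text embedding, rescaled to [0, 1].
    feat_map: (H, W, D) image features; text_emb: (D,) text embedding.
    Both inputs are assumed precomputed by the multi-modal network."""
    f = feat_map / (np.linalg.norm(feat_map, axis=-1, keepdims=True) + 1e-8)
    t = text_emb / (np.linalg.norm(text_emb) + 1e-8)
    tam = f @ t                                    # (H, W) similarities
    return (tam - tam.min()) / (tam.max() - tam.min() + 1e-8)
```

In the paper's pipeline such maps are further combined into class activation maps used to supervise the segmentation network; the sketch covers only the per-location similarity step.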
A convolutional autoencoder approach for mining features in cellular electron cryo-tomograms and weakly supervised coarse segmentation
Cellular electron cryo-tomography enables the 3D visualization of cellular
organization in the near-native state and at submolecular resolution. However,
the contents of cellular tomograms are often complex, making it difficult to
automatically isolate different in situ cellular components. In this paper, we
propose a convolutional autoencoder-based unsupervised approach to provide a
coarse grouping of 3D small subvolumes extracted from tomograms. We demonstrate
that the autoencoder can be used for efficient and coarse characterization of
features of macromolecular complexes and surfaces, such as membranes. In
addition, the autoencoder can be used to detect non-cellular features related
to sample preparation and data collection, such as carbon edges from the grid
and tomogram boundaries. The autoencoder is also able to detect patterns that
may indicate spatial interactions between cellular components. Furthermore, we
demonstrate that our autoencoder can be used for weakly supervised semantic
segmentation of cellular components, requiring a very small amount of manual
annotation. Comment: Accepted by Journal of Structural Biology.
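The coarse grouping step can be illustrated by clustering the autoencoder's latent codes. Below is a minimal NumPy k-means over stand-in latent vectors; in the actual pipeline these would be the convolutional encoder's outputs for 3D subvolumes, and the clustering details here are a generic sketch rather than the paper's exact procedure:

```python
import numpy as np

def kmeans(latents, k, iters=20, init=None, seed=0):
    """Coarsely group latent codes with k-means.
    latents: (N, D) stand-in for encoder outputs of 3D subvolumes.
    init: optional (k, D) initial centers; otherwise sampled at random."""
    rng = np.random.default_rng(seed)
    if init is not None:
        centers = np.array(init, dtype=float)
    else:
        centers = latents[rng.choice(len(latents), k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each latent code to its nearest center.
        d = np.linalg.norm(latents[:, None] - centers[None], axis=-1)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned codes.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = latents[labels == j].mean(axis=0)
    return labels, centers
```

Groups found this way give the coarse characterization of subvolume features; the weakly supervised segmentation step would then attach sparse manual labels to such groups.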
Weakly supervised underwater fish segmentation using affinity LCFCN
Estimating fish body measurements like length, width, and mass has received considerable research attention due to its potential to boost productivity in marine and aquaculture applications. Some methods are based on manual collection of these measurements using tools like a ruler, which is time-consuming and labour-intensive. Others rely on fully-supervised segmentation models to automatically acquire these measurements, but they require collecting per-pixel labels, which is also time-consuming: it can take up to 2 minutes per fish to acquire accurate segmentation labels. To address this problem, we propose a segmentation model that can efficiently train on images labeled with point-level supervision, where each fish is annotated with a single click. This labeling scheme takes an average of only 1 second per fish. Our model uses a fully convolutional neural network with one branch that outputs per-pixel scores and another that outputs an affinity matrix. These two outputs are aggregated using a random walk to get the final, refined per-pixel output. The whole model is trained end-to-end using the localization-based counting fully convolutional neural network (LCFCN) loss, and thus we call our method Affinity-LCFCN (A-LCFCN). We conduct experiments on the DeepFish dataset, which contains several fish habitats from north-eastern Australia. The results show that A-LCFCN outperforms a fully-supervised segmentation model when the annotation budget is fixed. They also show that A-LCFCN achieves better segmentation results than LCFCN and a standard baseline.
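The random-walk aggregation of per-pixel scores with an affinity matrix can be sketched as repeated multiplication by a row-normalized transition matrix. This is one common reading of affinity-based refinement, not the exact A-LCFCN formulation:

```python
import numpy as np

def random_walk_refine(scores, affinity, steps=3):
    """Refine per-pixel class scores by propagating them over a pixel
    affinity matrix (illustrative sketch of affinity-based refinement).
    scores: (N, C) class scores for N pixels, from the scoring branch.
    affinity: (N, N) non-negative pairwise weights, from the affinity branch."""
    # Row-normalize the affinity into a random-walk transition matrix.
    T = affinity / (affinity.sum(axis=1, keepdims=True) + 1e-8)
    out = scores
    for _ in range(steps):
        out = T @ out            # each pixel averages over its neighbours
    return out
```

With an identity affinity the scores pass through unchanged; with dense uniform affinity they collapse toward the mean, so a learned affinity interpolates between keeping and smoothing the raw scores.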
Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation
Recent advances in denoising diffusion probabilistic models have shown great
success in image synthesis tasks. While there are already works exploring the
potential of this powerful tool in image semantic segmentation, its application
in weakly supervised semantic segmentation (WSSS) remains relatively
under-explored. Observing that conditional diffusion models (CDMs) are capable of
generating images subject to specific distributions, in this work we utilize the
category-aware semantic information underlying a CDM to obtain the prediction mask
of the target object with only image-level annotations. More specifically, we
locate the desired class by approximating the derivative of the output of CDM
w.r.t. the input condition. Our method differs from previous diffusion model
methods guided by an external classifier, which accumulate noise in the
background during the reconstruction process. Our method
outperforms state-of-the-art CAM and diffusion model methods on two public
medical image segmentation datasets, which demonstrates that CDM is a promising
tool in WSSS. Also, experiments show that our method is more time-efficient than
existing diffusion model methods, making it practical for wider applications.
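Approximating the derivative of a conditional model's output with respect to its condition can be illustrated with a finite difference between outputs under the target condition and a null condition. The `model` signature below is hypothetical; a real CDM would be a trained denoising network and the conditions would be class embeddings:

```python
import numpy as np

def condition_sensitivity_map(model, image, cond_on, cond_off):
    """Sketch of condition-derivative localization: finite difference of
    the model output between the target condition and a null condition.
    `model(image, cond)` is a hypothetical stand-in for a conditional
    diffusion model's reconstruction under a given condition."""
    diff = model(image, cond_on) - model(image, cond_off)
    return np.abs(diff)          # large values mark condition-sensitive pixels
```

Thresholding such a sensitivity map would give a coarse prediction mask from image-level labels alone, which is the spirit of the localization step described above.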
- …