12,065 research outputs found
Self-supervised Depth Estimation to Regularise Semantic Segmentation in Knee Arthroscopy
Intra-operative automatic semantic segmentation of knee joint structures can
assist surgeons during knee arthroscopy in terms of situational awareness.
However, due to poor imaging conditions (e.g., low texture and overexposure),
automatic semantic segmentation is a challenging task, which explains the
scarce literature on the topic. In this paper, we propose a novel
self-supervised monocular depth estimation task to regularise the training of
semantic segmentation models in knee arthroscopy. To further regularise the depth
estimation, we propose pre-training the model with clean training images of
routine objects captured by the stereo arthroscope (images that present none of
the poor imaging conditions and carry rich texture information). We fine-tune this
model to produce both the semantic segmentation and self-supervised monocular
depth using stereo arthroscopic images taken from inside the knee. Using a data
set containing 3868 arthroscopic images captured during cadaveric knee
arthroscopy with semantic segmentation annotations, 2000 stereo image pairs of
cadaveric knee arthroscopy, and 2150 stereo image pairs of routine objects, we
show that our semantic segmentation regularised by self-supervised depth
estimation produces a more accurate segmentation than a state-of-the-art
semantic segmentation approach trained exclusively with semantic segmentation
annotations. Comment: 10 pages, 6 figures
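Approaches of this kind typically derive the self-supervised depth signal from a photometric reconstruction loss between the stereo views: the network predicts disparity, one view is warped into the other, and the warped result is compared to the target image. The abstract does not give the authors' exact formulation, so the following is a minimal 1-D sketch (scanlines as Python lists, nearest-neighbour sampling; all names are hypothetical):

```python
def warp_right_to_left(right, disparity):
    """Reconstruct the left scanline by sampling the right scanline at
    x - d(x), clamping indices and using nearest-neighbour sampling."""
    recon = []
    for x, d in enumerate(disparity):
        src = min(max(int(round(x - d)), 0), len(right) - 1)
        recon.append(right[src])
    return recon

def photometric_l1(left, right, disparity):
    """Mean absolute error between the left scanline and its reconstruction
    from the right one -- the label-free loss that supervises disparity
    (and hence depth)."""
    recon = warp_right_to_left(right, disparity)
    return sum(abs(a - b) for a, b in zip(left, recon)) / len(left)

# Toy pair: the left scanline equals the right one shifted by 2 pixels,
# so a constant disparity of 2 reconstructs it almost perfectly.
left = [0, 0, 10, 20, 30, 0, 0, 0]
right = [10, 20, 30, 0, 0, 0, 0, 0]
good = photometric_l1(left, right, [2] * 8)  # low loss at the true disparity
bad = photometric_l1(left, right, [0] * 8)   # high loss at a wrong disparity
```

The correct disparity yields a much lower reconstruction error than a zero-disparity guess, and that gap is the gradient signal that trains depth without any labels.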
4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding
We present a new approach to instill 4D dynamic object priors into learned 3D
representations by unsupervised pre-training. We observe that dynamic movement
of an object through an environment provides important cues about its
objectness, and thus propose to imbue learned 3D representations with such
dynamic understanding, which can then be effectively transferred to improve
performance in downstream 3D semantic scene understanding tasks. We propose a
new data augmentation scheme leveraging synthetic 3D shapes moving in static 3D
environments, and employ contrastive learning under 3D-4D constraints that
encode 4D invariances into the learned 3D representations. Experiments
demonstrate that our unsupervised representation learning results in
improvement in downstream 3D semantic segmentation, object detection, and
instance segmentation tasks, and moreover, notably improves performance in
data-scarce scenarios. Comment: Accepted by ECCV 2022, Video: https://youtu.be/qhGhWZmJq3
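A 3D-4D constraint of this kind can be realised with a standard InfoNCE contrastive loss: the feature of a point in the static 3D scene is the anchor, the feature of the same point in the moving 4D sequence is the positive, and features of other points are negatives. The abstract does not specify the actual feature extractors or pairing scheme, so this is a tiny pure-Python sketch with illustrative values:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: low when the anchor matches its
    positive (the same point under the 4D view) better than the negatives."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_denom - logits[0]

anchor = [1.0, 0.0]                    # static-scene feature of a point
good_positive = [0.9, 0.1]             # same point's feature in the 4D view
bad_positive = [0.0, 1.0]              # an unrelated feature
negatives = [[0.0, 1.0], [-1.0, 0.0]]  # features of other points
```

Minimising this loss pulls corresponding 3D and 4D features together while pushing apart features of different points, which is how the 4D invariance is encoded into the 3D representation.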
A Trie-Structured Bayesian Model for Unsupervised Morphological Segmentation
In this paper, we introduce a trie-structured Bayesian model for unsupervised
morphological segmentation. We adopt prior information from different sources
in the model. We use neural word embeddings to discover words that are
morphologically derived from each other and are therefore semantically
similar. We also use letter successor variety counts obtained from tries that
are built using neural word embeddings. Our results show that using different
information sources such as neural word embeddings and letter successor variety
as prior information improves morphological segmentation in a Bayesian model.
Our model outperforms other unsupervised morphological segmentation models on
Turkish and gives promising results on English and German in scarce-resource
settings. Comment: 12 pages, accepted and presented at CICLing 2017, the 18th
International Conference on Intelligent Text Processing and Computational
Linguistics
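Letter successor variety (LSV) is easy to illustrate: build a character trie over the vocabulary and, for each prefix of a word, count how many distinct symbols can follow it; sharp peaks tend to mark morpheme boundaries. A minimal sketch with a plain dict-based trie (the paper's tries are additionally organised using word embeddings, which is omitted here):

```python
def build_trie(words):
    """Character trie as nested dicts; '$' marks end of word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = {}
    return root

def successor_variety(trie, word):
    """For each prefix of `word`, count the distinct symbols that follow
    it in the trie. Peaks in this sequence hint at morpheme boundaries."""
    counts = []
    node = trie
    for ch in word:
        node = node.get(ch, {})
        counts.append(len(node))
    return counts

words = ["walk", "walks", "walked", "walking", "talk", "talked"]
trie = build_trie(words)
sv = successor_variety(trie, "walked")
```

For `walked`, the variety peaks right after the stem `walk` (the trie branches into the end marker, `s`, `e`, and `i` there), matching the walk+ed segmentation.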
Learning Features by Watching Objects Move
This paper presents a novel yet intuitive approach to unsupervised feature
learning. Inspired by the human visual system, we explore whether low-level
motion-based grouping cues can be used to learn an effective visual
representation. Specifically, we use unsupervised motion-based segmentation on
videos to obtain segments, which we use as 'pseudo ground truth' to train a
convolutional network to segment objects from a single frame. Given the
extensive evidence that motion plays a key role in the development of the human
visual system, we hope that this straightforward approach to unsupervised
learning will be more effective than cleverly designed 'pretext' tasks studied
in the literature. Indeed, our extensive experiments show that this is the
case. When used for transfer learning on object detection, our representation
significantly outperforms previous unsupervised approaches across multiple
settings, especially when training data for the target task is scarce. Comment: CVPR 2017
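The pipeline can be caricatured in a few lines: obtain a motion mask from a pair of video frames, treat it as a pseudo ground-truth segment, and train a single-frame network on it. The paper uses a proper unsupervised motion segmentation method; the simple frame differencing below is only a stand-in to show the pseudo-labelling step:

```python
def motion_pseudo_mask(frame_a, frame_b, threshold=10):
    """Binary pseudo ground-truth mask from two consecutive frames:
    pixels whose intensity changed by more than `threshold` count as
    'object'. (A toy stand-in for a real motion-segmentation method.)"""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]

# A 3x4 toy pair: an object of intensity 200 moves one pixel to the right.
frame_a = [[0, 200, 0, 0],
           [0, 200, 0, 0],
           [0,   0, 0, 0]]
frame_b = [[0, 0, 200, 0],
           [0, 0, 200, 0],
           [0, 0,   0, 0]]
mask = motion_pseudo_mask(frame_a, frame_b)
```

The resulting mask, not any human annotation, is what supervises the segmentation network, which is the sense in which the learned features are "free".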
PanDA: Panoptic Data Augmentation
The recently proposed panoptic segmentation task presents a significant image-understanding challenge for computer vision by unifying the semantic segmentation and instance segmentation tasks. In this paper we present an efficient and novel panoptic data augmentation (PanDA) method which operates exclusively in pixel space, requires no additional data or training, and is computationally cheap to implement. By retraining original state-of-the-art models on PanDA-augmented datasets generated with a single frozen set of parameters, we show robust performance gains in panoptic segmentation, instance segmentation, and detection across models, backbones, dataset domains, and scales. Finally, the effectiveness of the unrealistic-looking training images synthesized by PanDA suggests that one should rethink the need for image realism in efficient data augmentation.
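The abstract does not describe PanDA's exact transformations, but a pixel-space augmentation that needs no extra data can be sketched as copying an instance's pixels (selected by its panoptic mask) to a new location in the same image; the function and its parameters below are hypothetical illustrations, not the authors' method:

```python
def paste_instance(image, mask, dx, dy):
    """Copy the pixels selected by a binary instance mask and paste them
    at a (dx, dy) offset, yielding a new training image from the same
    pixels -- no extra data or extra training required."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the source image untouched
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                ty, tx = y + dy, x + dx
                if 0 <= ty < h and 0 <= tx < w:
                    out[ty][tx] = image[y][x]
    return out

image = [[1, 2, 3],
         [4, 5, 6]]
mask = [[1, 0, 0],   # the "instance" here is the single top-left pixel
        [0, 0, 0]]
augmented = paste_instance(image, mask, dx=2, dy=1)
```

Because the operation is purely pixel-space, the corresponding segmentation labels can be transformed by the exact same offset, which is what makes such augmentations cheap.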