Unsupervised Holistic Image Generation from Key Local Patches
We introduce a new problem of generating an image based on a small number of
key local patches without any geometric prior. In this work, key local patches
are defined as informative regions of the target object or scene. This is a
challenging problem since it requires generating realistic images and
predicting locations of parts at the same time. We construct adversarial
networks to tackle this problem. A generator network generates a fake image as
well as a mask based on the encoder-decoder framework. On the other hand, a
discriminator network aims to detect fake images. The network is trained with
three losses to consider spatial, appearance, and adversarial information. The
spatial loss determines whether the locations of predicted parts are correct.
The appearance loss ensures that the input patches are restored in the output
image with minimal modification. The adversarial loss ensures that output images are realistic.
The proposed network is trained without supervisory signals since no labels of
key parts are required. Experimental results on six datasets demonstrate that
the proposed algorithm performs favorably on challenging objects and scenes.
Comment: 16 pages
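The three-term generator objective described above can be sketched as follows. This is a hypothetical formulation, not the paper's exact losses: the choice of L2 for the spatial term, L1 for the appearance term, and the loss weights are all assumptions.

```python
import numpy as np

def generator_loss(pred_mask, true_mask, output_img, input_patches, patch_mask,
                   d_fake_score, w_spatial=1.0, w_appearance=10.0, w_adv=1.0):
    """Illustrative three-term objective: spatial + appearance + adversarial."""
    # Spatial loss: penalise incorrectly predicted part locations
    # (here a simple L2 distance between predicted and reference masks).
    spatial = np.mean((pred_mask - true_mask) ** 2)
    # Appearance loss: input patches should reappear in the output image
    # with little modification (L1, restricted to the patch regions).
    appearance = np.mean(np.abs((output_img - input_patches) * patch_mask))
    # Adversarial loss: push the discriminator to score the fake as real.
    adversarial = -np.mean(np.log(d_fake_score + 1e-8))
    return w_spatial * spatial + w_appearance * appearance + w_adv * adversarial
```

When the generated image reproduces the patches exactly, the mask is correct, and the discriminator is fully fooled, all three terms vanish.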
ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids
We introduce an unsupervised feature learning approach that embeds 3D shape
information into a single-view image representation. The main idea is a
self-supervised training objective that, given only a single 2D image, requires
all unseen views of the object to be predictable from learned features. We
implement this idea as an encoder-decoder convolutional neural network. The
network maps an input image of an unknown category and unknown viewpoint to a
latent space, from which a deconvolutional decoder can best "lift" the image to
its complete viewgrid showing the object from all viewing angles. Our
class-agnostic training procedure encourages the representation to capture
fundamental shape primitives and semantic regularities in a data-driven
manner---without manual semantic labels. Our results on two widely-used shape
datasets show 1) our approach successfully learns to perform "mental rotation"
even for objects unseen during training, and 2) the learned latent space is a
powerful representation for object recognition, outperforming several existing
unsupervised feature learning methods.
Comment: To appear at ECCV 2018
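The lifting objective above can be sketched with a toy linear decoder in place of the paper's deconvolutional network. All shapes and the linear map are illustrative assumptions; the point is that one latent code must predict every view of the viewgrid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: latent code of size D, viewgrid of V views, each H x W.
D, V, H, W = 16, 8, 4, 4
W_dec = rng.standard_normal((D, V * H * W)) * 0.01  # toy linear "decoder"

def lift_to_viewgrid(z):
    """Map a single-image latent code z to predictions for all V views."""
    return (z @ W_dec).reshape(V, H, W)

def viewgrid_loss(z, target_views):
    # Self-supervised objective: all unseen views of the object must be
    # predictable from the features learned for one input view.
    pred = lift_to_viewgrid(z)
    return np.mean((pred - target_views) ** 2)
```

Training would minimise `viewgrid_loss` over many objects, forcing the latent space to encode 3D shape rather than view-specific appearance.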
Revisiting Rubik's Cube: Self-supervised Learning with Volume-wise Transformation for 3D Medical Image Segmentation
Deep learning highly relies on the quantity of annotated data. However, the
annotations for 3D volumetric medical data require experienced physicians to
spend hours or even days for investigation. Self-supervised learning is a
potential solution to get rid of the strong requirement of training data by
deeply exploiting raw data information. In this paper, we propose a novel
self-supervised learning framework for volumetric medical images. Specifically,
we propose a context restoration task, i.e., Rubik's cube++, to pre-train 3D
neural networks. Different from the existing context-restoration-based
approaches, we adopt a volume-wise transformation for context permutation,
which encourages the network to better exploit the inherent 3D anatomical
information of organs. Compared to the strategy of training from scratch,
fine-tuning from the Rubik's cube++ pre-trained weight can achieve better
performance in various tasks such as pancreas segmentation and brain tissue
segmentation. The experimental results show that our self-supervised learning
method can significantly improve the accuracy of 3D deep learning networks on
volumetric medical datasets without the use of extra data.
Comment: Accepted by MICCAI 2020
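A Rubik's-cube-style context permutation can be sketched as splitting a 3D volume into sub-cubes and shuffling them; the pretext task is then to recover the original arrangement. The paper's volume-wise transformation differs in detail, so treat this as an illustrative assumption.

```python
import numpy as np

def permute_subcubes(volume, grid=2, seed=0):
    """Split a 3D volume into grid**3 sub-cubes and shuffle them.
    Returns the permuted volume and the permutation used (the label
    a self-supervised network would be trained to predict)."""
    d, h, w = volume.shape
    sd, sh, sw = d // grid, h // grid, w // grid
    cubes = [volume[i*sd:(i+1)*sd, j*sh:(j+1)*sh, k*sw:(k+1)*sw].copy()
             for i in range(grid) for j in range(grid) for k in range(grid)]
    order = np.random.default_rng(seed).permutation(len(cubes))
    out = np.empty_like(volume)
    for idx, src in enumerate(order):
        i, j, k = idx // grid**2, (idx // grid) % grid, idx % grid
        out[i*sd:(i+1)*sd, j*sh:(j+1)*sh, k*sw:(k+1)*sw] = cubes[src]
    return out, order
```

Because whole sub-volumes are moved rather than 2D slices, the network must reason about 3D anatomical context to undo the permutation.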
Dynamic characterisation of Össur Flex-Run prosthetic feet for a more informed prescription
Background: The current method of prescribing composite Energy Storing and Returning (ESR) feet is subjective and is based only on the amputee's static body weight/mass.
Objectives: The aim is to investigate their unique design features by identifying and analysing their dynamic characteristics, using modal analysis, to determine their mode shapes, natural damping and natural frequencies. A full understanding of the dynamic characteristics can inform how to tune a foot to match an amputee's gait and body condition.
Methods: This paper presents the modal analysis results for the full range of commercially available Össur Flex-Run running feet (1LO-9LO).
Results: It is shown that both the undamped natural frequency and stiffness increase linearly from the lowest to the highest stiffness category of foot. The effect of overloading and under-loading on natural frequencies is also presented. The damping factor for each foot was experimentally determined and found to range between 1.5% and 2.0%. An analysis of the mode shapes also revealed a unique design feature of these feet that is hypothesised to enhance their performance.
Conclusions: A better understanding of the feet's dynamic characteristics can help tune the feet to the user's requirements.
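The relationship between stiffness, load, and undamped natural frequency underlying this characterisation can be worked through with the standard single-degree-of-freedom formula. The stiffness and mass values below are illustrative, not measured Flex-Run data.

```python
import math

def natural_frequency_hz(stiffness_n_per_m, mass_kg):
    """Undamped natural frequency f_n = (1/2*pi) * sqrt(k/m) of a
    single-degree-of-freedom spring-mass model of the loaded foot."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2 * math.pi)

def damped_frequency_hz(f_n, zeta):
    """Damped natural frequency f_d = f_n * sqrt(1 - zeta**2).
    With damping factors around 1.5-2.0 %, f_d is almost
    indistinguishable from f_n."""
    return f_n * math.sqrt(1 - zeta ** 2)

# Illustrative numbers: k = 40 kN/m, amputee mass 80 kg.
f_n = natural_frequency_hz(40_000, 80)   # ~3.56 Hz
f_d = damped_frequency_hz(f_n, 0.02)
```

The same formula also shows why overloading matters: increasing the mass on a given foot lowers its natural frequency, shifting the foot away from the response it was categorised for.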
Self-Supervised Relative Depth Learning for Urban Scene Understanding
As an agent moves through the world, the apparent motion of scene elements is
(usually) inversely proportional to their depth. It is natural for a learning
agent to associate image patterns with the magnitude of their displacement over
time: as the agent moves, faraway mountains don't move much; nearby trees move
a lot. This natural relationship between the appearance of objects and their
motion is a rich source of information about the world. In this work, we start
by training a deep network, using fully automatic supervision, to predict
relative scene depth from single images. The relative depth training images are
automatically derived from simple videos of cars moving through a scene, using
recent motion segmentation techniques, and no human-provided labels. This proxy
task of predicting relative depth from a single image induces features in the
network that result in large improvements in a set of downstream tasks
including semantic segmentation, joint road segmentation and car detection, and
monocular (absolute) depth estimation, over a network trained from scratch. The
improvement on the semantic segmentation task is greater than that produced by
any other automatically supervised method. Moreover, for monocular depth
estimation, our unsupervised pre-training method even outperforms supervised
pre-training with ImageNet. In addition, we demonstrate benefits from learning
to predict (unsupervised) relative depth in the specific videos associated with
various downstream tasks. We adapt to the specific scenes in those tasks in an
unsupervised manner to improve performance. In summary, for semantic
segmentation, we present state-of-the-art results among methods that do not use
supervised pre-training, and we even exceed the performance of supervised
ImageNet pre-trained models for monocular depth estimation, achieving results
that are comparable with state-of-the-art methods.
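The core supervisory signal above can be sketched in a toy form: for a translating camera, apparent motion is roughly inversely proportional to depth, so inverse flow magnitude gives a relative depth target. The real pipeline also relies on motion segmentation and a trained network, both omitted here as out of scope for the sketch.

```python
import numpy as np

def relative_depth_from_flow(flow_u, flow_v, eps=1e-6):
    """Toy relative-depth target: 1 / |flow|. The overall scale is
    arbitrary, which is why the learned depth is only 'relative'.
    flow_u, flow_v: per-pixel horizontal/vertical displacement."""
    magnitude = np.sqrt(flow_u ** 2 + flow_v ** 2)
    return 1.0 / (magnitude + eps)
```

This captures the intuition in the abstract: faraway mountains barely move between frames, so their inverse flow magnitude (relative depth) is large; nearby trees move a lot, so theirs is small.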
Classifying the unknown: discovering novel gravitational-wave detector glitches using similarity learning
The observation of gravitational waves from compact binary coalescences by
LIGO and Virgo has begun a new era in astronomy. A critical challenge in making
detections is determining whether loud transient features in the data are
caused by gravitational waves or by instrumental or environmental sources. The
citizen-science project \emph{Gravity Spy} has been demonstrated as an
efficient infrastructure for classifying known types of noise transients
(glitches) through a combination of data analysis performed by both citizen
volunteers and machine learning. We present the next iteration of this project,
using similarity indices to empower citizen scientists to create large data
sets of unknown transients, which can then be used to facilitate supervised
machine-learning characterization. This new evolution aims to alleviate a
persistent challenge that plagues both citizen-science and instrumental
detector work: the ability to build large samples of relatively rare events.
Using two families of transient noise that appeared unexpectedly during LIGO's
second observing run (O2), we demonstrate the impact that the similarity
indices could have had on finding these new glitch types in the Gravity Spy
program.
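A similarity index of the kind described can be sketched as a cosine-similarity ranking over glitch feature embeddings, so that a volunteer who spots one unusual transient can quickly collect more like it. Gravity Spy's actual features come from a trained classifier; the embeddings here are placeholders.

```python
import numpy as np

def most_similar(query, bank, k=3):
    """Rank a bank of glitch embeddings by cosine similarity to a query
    embedding; return the top-k indices and their similarity scores."""
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q                      # cosine similarity to each bank entry
    top = np.argsort(-sims)[:k]      # highest-similarity entries first
    return top, sims[top]
```

Retrieved candidates would then be vetted by citizen scientists, building the large labelled samples of rare events needed for supervised characterisation.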