22,790 research outputs found
Adversarially Tuned Scene Generation
Generalization performance of trained computer vision systems that use
computer graphics (CG) generated data is not yet effective due to the concept
of 'domain-shift' between virtual and real data. Although simulated data
augmented with a few real world samples has been shown to mitigate domain shift
and improve transferability of trained models, guiding or bootstrapping the
virtual data generation with the distributions learnt from target real world
domain is desired, especially in the fields where annotating even few real
images is laborious (such as semantic labeling, and intrinsic images etc.). In
order to address this problem in an unsupervised manner, our work combines
recent advances in CG (which aims to generate stochastic scene layouts coupled
with large collections of 3D object models) and generative adversarial training
(which aims train generative models by measuring discrepancy between generated
and real data in terms of their separability in the space of a deep
discriminatively-trained classifier). Our method uses iterative estimation of
the posterior density of prior distributions for a generative graphical model.
This is done within a rejection sampling framework. Initially, we assume
uniform distributions as priors on the parameters of a scene described by a
generative graphical model. As iterations proceed the prior distributions get
updated to distributions that are closer to the (unknown) distributions of
target data. We demonstrate the utility of adversarially tuned scene generation
on two real-world benchmark datasets (CityScapes and CamVid) for traffic scene
semantic labeling with a deep convolutional net (DeepLab). We realized
performance improvements by 2.28 and 3.14 points (using the IoU metric) between
the DeepLab models trained on simulated sets prepared from the scene generation
models before and after tuning to CityScapes and CamVid respectively.Comment: 9 pages, accepted at CVPR 201
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
Massive MIMO is a Reality -- What is Next? Five Promising Research Directions for Antenna Arrays
Massive MIMO (multiple-input multiple-output) is no longer a "wild" or
"promising" concept for future cellular networks - in 2018 it became a reality.
Base stations (BSs) with 64 fully digital transceiver chains were commercially
deployed in several countries, the key ingredients of Massive MIMO have made it
into the 5G standard, the signal processing methods required to achieve
unprecedented spectral efficiency have been developed, and the limitation due
to pilot contamination has been resolved. Even the development of fully digital
Massive MIMO arrays for mmWave frequencies - once viewed prohibitively
complicated and costly - is well underway. In a few years, Massive MIMO with
fully digital transceivers will be a mainstream feature at both sub-6 GHz and
mmWave frequencies. In this paper, we explain how the first chapter of the
Massive MIMO research saga has come to an end, while the story has just begun.
The coming wide-scale deployment of BSs with massive antenna arrays opens the
door to a brand new world where spatial processing capabilities are
omnipresent. In addition to mobile broadband services, the antennas can be used
for other communication applications, such as low-power machine-type or
ultra-reliable communications, as well as non-communication applications such
as radar, sensing and positioning. We outline five new Massive MIMO related
research directions: Extremely large aperture arrays, Holographic Massive MIMO,
Six-dimensional positioning, Large-scale MIMO radar, and Intelligent Massive
MIMO.Comment: 20 pages, 9 figures, submitted to Digital Signal Processin
Improving Surgical Training Phantoms by Hyperrealism: Deep Unpaired Image-to-Image Translation from Real Surgeries
Current `dry lab' surgical phantom simulators are a valuable tool for
surgeons which allows them to improve their dexterity and skill with surgical
instruments. These phantoms mimic the haptic and shape of organs of interest,
but lack a realistic visual appearance. In this work, we present an innovative
application in which representations learned from real intraoperative
endoscopic sequences are transferred to a surgical phantom scenario. The term
hyperrealism is introduced in this field, which we regard as a novel subform of
surgical augmented reality for approaches that involve real-time object
transfigurations. For related tasks in the computer vision community, unpaired
cycle-consistent Generative Adversarial Networks (GANs) have shown excellent
results on still RGB images. Though, application of this approach to continuous
video frames can result in flickering, which turned out to be especially
prominent for this application. Therefore, we propose an extension of
cycle-consistent GANs, named tempCycleGAN, to improve temporal consistency.The
novel method is evaluated on captures of a silicone phantom for training
endoscopic reconstructive mitral valve procedures. Synthesized videos show
highly realistic results with regard to 1) replacement of the silicone
appearance of the phantom valve by intraoperative tissue texture, while 2)
explicitly keeping crucial features in the scene, such as instruments, sutures
and prostheses. Compared to the original CycleGAN approach, tempCycleGAN
efficiently removes flickering between frames. The overall approach is expected
to change the future design of surgical training simulators since the generated
sequences clearly demonstrate the feasibility to enable a considerably more
realistic training experience for minimally-invasive procedures.Comment: 8 pages, accepted at MICCAI 2018, supplemental material at
https://youtu.be/qugAYpK-Z4
- …