22,790 research outputs found

    Adversarially Tuned Scene Generation

    Full text link
    Generalization performance of trained computer vision systems that use computer graphics (CG) generated data is not yet effective due to the concept of 'domain-shift' between virtual and real data. Although simulated data augmented with a few real world samples has been shown to mitigate domain shift and improve transferability of trained models, guiding or bootstrapping the virtual data generation with the distributions learnt from target real world domain is desired, especially in the fields where annotating even few real images is laborious (such as semantic labeling, and intrinsic images etc.). In order to address this problem in an unsupervised manner, our work combines recent advances in CG (which aims to generate stochastic scene layouts coupled with large collections of 3D object models) and generative adversarial training (which aims train generative models by measuring discrepancy between generated and real data in terms of their separability in the space of a deep discriminatively-trained classifier). Our method uses iterative estimation of the posterior density of prior distributions for a generative graphical model. This is done within a rejection sampling framework. Initially, we assume uniform distributions as priors on the parameters of a scene described by a generative graphical model. As iterations proceed the prior distributions get updated to distributions that are closer to the (unknown) distributions of target data. We demonstrate the utility of adversarially tuned scene generation on two real-world benchmark datasets (CityScapes and CamVid) for traffic scene semantic labeling with a deep convolutional net (DeepLab). We realized performance improvements by 2.28 and 3.14 points (using the IoU metric) between the DeepLab models trained on simulated sets prepared from the scene generation models before and after tuning to CityScapes and CamVid respectively.Comment: 9 pages, accepted at CVPR 201

    One-Class Classification: Taxonomy of Study and Review of Techniques

    Full text link
    One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This unique situation constrains the learning of efficient classifiers by defining class boundary just with the knowledge of positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, algorithms used and the application domains applied. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure

    Massive MIMO is a Reality -- What is Next? Five Promising Research Directions for Antenna Arrays

    Full text link
    Massive MIMO (multiple-input multiple-output) is no longer a "wild" or "promising" concept for future cellular networks - in 2018 it became a reality. Base stations (BSs) with 64 fully digital transceiver chains were commercially deployed in several countries, the key ingredients of Massive MIMO have made it into the 5G standard, the signal processing methods required to achieve unprecedented spectral efficiency have been developed, and the limitation due to pilot contamination has been resolved. Even the development of fully digital Massive MIMO arrays for mmWave frequencies - once viewed prohibitively complicated and costly - is well underway. In a few years, Massive MIMO with fully digital transceivers will be a mainstream feature at both sub-6 GHz and mmWave frequencies. In this paper, we explain how the first chapter of the Massive MIMO research saga has come to an end, while the story has just begun. The coming wide-scale deployment of BSs with massive antenna arrays opens the door to a brand new world where spatial processing capabilities are omnipresent. In addition to mobile broadband services, the antennas can be used for other communication applications, such as low-power machine-type or ultra-reliable communications, as well as non-communication applications such as radar, sensing and positioning. We outline five new Massive MIMO related research directions: Extremely large aperture arrays, Holographic Massive MIMO, Six-dimensional positioning, Large-scale MIMO radar, and Intelligent Massive MIMO.Comment: 20 pages, 9 figures, submitted to Digital Signal Processin

    Improving Surgical Training Phantoms by Hyperrealism: Deep Unpaired Image-to-Image Translation from Real Surgeries

    Full text link
    Current `dry lab' surgical phantom simulators are a valuable tool for surgeons which allows them to improve their dexterity and skill with surgical instruments. These phantoms mimic the haptic and shape of organs of interest, but lack a realistic visual appearance. In this work, we present an innovative application in which representations learned from real intraoperative endoscopic sequences are transferred to a surgical phantom scenario. The term hyperrealism is introduced in this field, which we regard as a novel subform of surgical augmented reality for approaches that involve real-time object transfigurations. For related tasks in the computer vision community, unpaired cycle-consistent Generative Adversarial Networks (GANs) have shown excellent results on still RGB images. Though, application of this approach to continuous video frames can result in flickering, which turned out to be especially prominent for this application. Therefore, we propose an extension of cycle-consistent GANs, named tempCycleGAN, to improve temporal consistency.The novel method is evaluated on captures of a silicone phantom for training endoscopic reconstructive mitral valve procedures. Synthesized videos show highly realistic results with regard to 1) replacement of the silicone appearance of the phantom valve by intraoperative tissue texture, while 2) explicitly keeping crucial features in the scene, such as instruments, sutures and prostheses. Compared to the original CycleGAN approach, tempCycleGAN efficiently removes flickering between frames. The overall approach is expected to change the future design of surgical training simulators since the generated sequences clearly demonstrate the feasibility to enable a considerably more realistic training experience for minimally-invasive procedures.Comment: 8 pages, accepted at MICCAI 2018, supplemental material at https://youtu.be/qugAYpK-Z4
    corecore