490 research outputs found

    Image-to-Image Translation with Conditional Adversarial Networks

    Full text link
    We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.Comment: Website: https://phillipi.github.io/pix2pix/, CVPR 201

    Étude de contraintes spatiales bas niveau appliquées à la vision par ordinateur

    Full text link
    Thèse numérisée par la Direction des bibliothèques de l'Université de Montréal

    CUDA based Level Set Method for 3D Reconstruction of Fishes from Large Acoustic Data

    Get PDF
    Acoustic images present views of underwater dynamics, even in high depths. With multi-beam echo sounders (SONARs), it is possible to capture series of 2D high resolution acoustic images. 3D reconstruction of the water column and subsequent estimation of fish abundance and fish species identification is highly desirable for planning sustainable fisheries. Main hurdles in analysing acoustic images are the presence of speckle noise and the vast amount of acoustic data. This paper presents a level set formulation for simultaneous fish reconstruction and noise suppression from raw acoustic images. Despite the presence of speckle noise blobs, actual fish intensity values can be distinguished by extremely high values, varying exponentially from the background. Edge detection generally gives excessive false edges that are not reliable. Our approach to reconstruction is based on level set evolution using Mumford-Shah segmentation functional that does not depend on edges in an image. We use the implicit function in conjunction with the image to robustly estimate a threshold for suppressing noise in the image by solving a second differential equation. We provide details of our estimation of suppressing threshold and show its convergence as the evolution proceeds. We also present a GPU based streaming computation of the method using NVIDIA’s CUDA framework to handle large volume data-sets. Our implementation is optimised for memory usage to handle large volumes

    A comprehensive classification of deep learning libraries

    Get PDF
    Deep Learning (DL) networks are composed of multiple processing layers that learn data representations with multiple levels of abstraction. In recent years, DL networks have significantly improved the state-of-the-art across different domains, including speech processing, text mining, pattern recognition, object detection, robotics and big data analytics. Generally, a researcher or practitioner who is planning to use DL networks for the first time faces difficulties in selecting suitable software tools. The present article provides a comprehensive list and taxonomy of current programming languages and software tools that can be utilized for implementation of DL networks. The motivation of this article is hence to create awareness among researchers, especially beginners, regarding the various languages and interfaces that are available to implement deep learning, and to provide a simplified ontological basis for selecting between them

    Matalaulotteisen affordanssiesityksen oppiminen ja tämän hyödyntäminen robottijärjestelmän koulutuksessa

    Get PDF
    The development of data-driven approaches, such as deep learning, has led to the emergence of systems that have achieved human-like performance in wide variety of tasks. For robotic tasks, deep data-driven models are introduced to create adaptive systems without the need of explicitly programming them. These adaptive systems are needed in situations, where task and environment changes remain unforeseen. Convolutional neural networks (CNNs) have become the standard way to process visual data in robotics. End-to-end neural network models that operate the entire control task can perform various complex tasks with little feature engineering. However, the adaptivity of these systems goes hand in hand with the level of variation in the training data. Training end-to-end deep robotic systems requires a lot of domain-, task-, and hardware-specific data, which is often costly to provide. In this work, we propose to tackle this issue by employing a deep neural network with a modular architecture, consisting of separate perception, policy, and trajectory parts. Each part of the system is trained fully on synthetic data or in simulation. The data is exchanged between parts of the system as low-dimensional representations of affordances and trajectories. The performance is then evaluated in a zero-shot transfer scenario using the Franka Panda robotic arm. Results demonstrate that a low-dimensional representation of scene affordances extracted from an RGB image is sufficient to successfully train manipulator policies.Tietopohjaisten oppimismenetelmien etenkin syväoppimisen viimeaikainen kehitys on synnyttänyt järjestelmiä, jotka ovat saavuttaneet ihmistasoisen suorituskyvyn ihmisälyä vaativissa tehtävissä. Syväoppimiseen pohjautuvia robottijärjestelmiä ollaan kehitetty, jotta ympäristön ja tehtävän muutoksiin mukautuvaisempia robotteja voitaisiin ottaa käyttöön. Konvoluutioneuroverkkojen käyttö kuvatiedon käsittelyssä robotiikassa on yleistä. Neuroverkkomallit, jotka käsittelevät anturitietoa ja suorittavat päätöksenteon ja säädön, voivat oppia monimutkaisia tehtäviä ilman käsin tehtyä kehitystyötä. Näiden järjestelmien kyky mukautua ympäristön muutoksiin on kuitenkin suoraan verrannollinen koulutustiedon monimuotoisuuteen. Syväoppimiseen pohjautuva robottijärjestelmä vaatii oppiakseen suuren määrän ympäristö-, tehtävä-, ja laitteisto-ominaista koulutustietoa, mikä joudutaan yleensä kerätä tehottomasti käsin. Tämän työn tarkoitus on esittää ratkaisu yllämainittuun tehottomuuteen. Esittelemme neuroverkkoarkkitehtuurin, joka koostuu kolmesta erillisestä komponentista. Nämä komponentit koulutetaan erikseen ja koulutus ollaan ainoastaan toteutettu simulaatiossa tai synteettisellä tiedolla ilman fyysisen maailman lisäkouluttautumista Ensimmäinen komponentti tuottaa RGB-kuvasta matalaulotteisen affordanssiesityksen. Tämän esityksen pohjalta toinen komponentti tuottaa matalaulotteisten liikerataesityksen. Kolmas komponentti luo tämän esityksen pohjalta täysimittaisen liikeradan teollisuusrobotille. Järjestelmän suorituskykyä arvioidaan fyysisessä ympäristössä ilman lisäkoulutusta Franka Panda -teollisuusrobotilla. Tulokset osoittavat, että kuvatieto voidaan esittää matalaulotteisena affordanssiesityksenä ja tätä esitystä voidaan käyttää säätötehtävän oppimiseen

    Unsupervised color image segmentation using Markov Random Fields Model

    Get PDF
    We propose a novel approach to investigate and implement unsupervised segmentation of color images particularly natural color images. The aim is to devise a robust unsu- pervised segmentation approach that can segment a color textured image accurately. Here, the color and texture information of each individual pixel along with the pixel's spatial relationship within its neighborhood have been considered for producing precise segmentation of color images. Precise segmentation of images has tremendous potential in various application domains like bioinformatics, forensics, security and surveillance, the mining and material industry and medical imaging where subtle information related to color and texture is required to analyze an image accurately. We intend to implement a robust unsupervised segmentation approach for color im- ages using a newly developed multidimensional spatially variant ¯nite mixture model (MSVFMM) using a Markov Random Fields (MRF) model for improving the over- all accuracy in segmentation and Haar wavelet transform for increasing the texture sensitivity of the proposed approach. [...]Master of Computin
    corecore