
    Adversarial Self-Supervised Scene Flow Estimation

    This work proposes a metric learning approach for self-supervised scene flow estimation. Scene flow estimation is the task of estimating 3D flow vectors for consecutive 3D point clouds. Such flow vectors are fruitful, e.g., for recognizing actions or avoiding collisions. Training a neural network via supervised learning for scene flow is impractical, as this requires manual annotations for each 3D point at each new timestamp in each scene. To that end, we seek a self-supervised approach, where a network learns a latent metric to distinguish between points translated by flow estimations and the target point cloud. Our adversarial metric learning includes a multi-scale triplet loss on sequences of two point clouds as well as a cycle consistency loss. Furthermore, we outline a benchmark for self-supervised scene flow estimation: the Scene Flow Sandbox. The benchmark consists of five datasets designed to study individual aspects of flow estimation in progressive order of complexity, from a moving object to real-world scenes. Experimental evaluation on the benchmark shows that our approach obtains state-of-the-art self-supervised scene flow results, outperforming recent neighbor-based approaches. We use our proposed benchmark to expose shortcomings and draw insights on various training setups. We find that our setup captures motion coherence and preserves local geometries. Dealing with occlusions, on the other hand, is still an open challenge.
    Comment: Published at 3DV 2020
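
    As a rough illustration of the training objective described above, the sketch below combines a triplet loss on latent embeddings of flow-warped points with a cycle consistency term. It is a minimal PyTorch sketch, not the paper's implementation: the encoder and flow network are assumed modules, the anchor/positive/negative choice is one plausible reading, and the multi-scale aspect of the loss is omitted.

    import torch
    import torch.nn.functional as F

    def scene_flow_losses(encoder, flow_net, src, tgt, margin=0.5):
        """src, tgt: (B, N, 3) point clouds; encoder and flow_net are
        hypothetical modules standing in for the paper's networks."""
        flow_fwd = flow_net(src, tgt)          # (B, N, 3) estimated flow
        warped = src + flow_fwd                # translate source by the flow

        # Latent metric: warped points should embed close to the target
        # cloud and far from the un-warped source.
        anchor = encoder(warped)               # (B, D) latent codes
        positive = encoder(tgt)
        negative = encoder(src)
        triplet = F.triplet_margin_loss(anchor, positive, negative,
                                        margin=margin)

        # Cycle consistency: flowing forward and then backward should
        # return each point (approximately) to where it started.
        flow_bwd = flow_net(warped, src)
        cycle = (flow_fwd + flow_bwd).norm(dim=-1).mean()
        return triplet + cycle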

    I Bet You Are Wrong: Gambling Adversarial Networks for Structured Semantic Segmentation

    Adversarial training has recently been employed for realizing structured semantic segmentation, in which the aim is to preserve higher-level scene structural consistencies in dense predictions. However, as we show, value-based discrimination between the predictions from the segmentation network and the ground-truth annotations can hinder the training process from learning to improve structural qualities, as well as prevent the network from properly expressing uncertainties. In this paper, we rethink adversarial training for semantic segmentation and propose to reformulate the fake/real discrimination framework into a correct/incorrect training objective. More specifically, we replace the discriminator with a "gambler" network that learns to spot and distribute its budget in areas where the predictions are clearly wrong, while the segmenter network tries to leave no clear clues for the gambler where to bet. Empirical evaluation on two road-scene semantic segmentation tasks shows that not only does the proposed method re-enable expressing uncertainties, it also improves pixel-wise and structure-based metrics.
    Comment: 13 pages, 8 figures
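
    The betting objective can be sketched as follows: the gambler spreads a fixed budget over the pixels of each image and is rewarded for betting where the segmenter is wrong, while the segmenter is trained to minimize the gambler's winnings. This is a minimal PyTorch sketch under assumed tensor shapes, not the paper's code; the networks producing seg_logits and bet_logits are left out.

    import torch
    import torch.nn.functional as F

    def gambler_winnings(seg_logits, bet_logits, labels):
        """seg_logits: (B, C, H, W); bet_logits: (B, 1, H, W);
        labels: (B, H, W) integer class maps."""
        B = seg_logits.shape[0]
        # Normalize bets so each image distributes a unit budget over pixels.
        bets = F.softmax(bet_logits.view(B, -1), dim=1).view_as(bet_logits)

        # Per-pixel cross-entropy marks where the prediction is clearly wrong.
        ce = F.cross_entropy(seg_logits, labels, reduction="none")  # (B, H, W)

        # Expected winnings: bets weighted by per-pixel error. The gambler
        # maximizes this; the segmenter minimizes it (alternating updates,
        # as in GAN training).
        return (bets.squeeze(1) * ce).sum(dim=(1, 2)).mean()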

    The planar two point algorithm

    Vision-based localization, mapping and navigation are often performed by searching for corresponding image points and estimating the epipolar geometry. It is known that the relative pose of a camera mounted on a mobile robot that moves over a planar ground floor has two degrees of freedom. This report provides insight into the problem of estimating the exact planar robot pose difference using only two image point correspondences. We describe an algorithm, termed the Two-point algorithm, which uses this minimal set of correspondences. It is shown that sometimes two non-degenerate correspondences do not define a unique relative robot pose, but lead to two possible real solutions. The algorithm is especially useful as a hypothesis generator for the well-known RANSAC (RANdom SAmple Consensus) method. The algorithm is evaluated using both simulated data and data acquired by a mobile robot equipped with an omnidirectional camera. The improvement over existing methods is analogous to the improvement of the well-known Five-point algorithm over other algorithms for general non-planar camera motion.
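
    To make the 2-DOF structure concrete: under planar motion the relative pose reduces to a yaw angle theta and a translation direction phi (scale is unobservable from the epipolar constraint). The NumPy sketch below builds the corresponding essential matrix and recovers a hypothesis from two correspondences by a coarse grid search; it is a crude numerical stand-in for the report's closed-form solver, meant only to illustrate the RANSAC hypothesis-generator role.

    import numpy as np

    def essential_planar(theta, phi):
        """E = [t]_x R for motion in the ground plane (y is the plane normal)."""
        t = np.array([np.sin(phi), 0.0, np.cos(phi)])      # unit translation
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])   # rotation about y
        tx = np.array([[0, -t[2], t[1]],
                       [t[2], 0, -t[0]],
                       [-t[1], t[0], 0]])                  # cross-product matrix
        return tx @ R

    def two_point_hypothesis(x1, x2, steps=180):
        """x1, x2: (2, 3) arrays of bearing vectors for two correspondences."""
        best, best_err = None, np.inf
        for theta in np.linspace(-np.pi, np.pi, steps, endpoint=False):
            for phi in np.linspace(-np.pi, np.pi, steps, endpoint=False):
                E = essential_planar(theta, phi)
                err = sum(float(x2[i] @ E @ x1[i]) ** 2 for i in range(2))
                if err < best_err:
                    best, best_err = (theta, phi), err
        return best  # score |x2' E x1| over all matches to count RANSAC inliers

    Note that a grid search cannot expose the two-real-solution ambiguity the report analyzes; a closed-form solver returns both candidates explicitly.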

    IAS

    Intelligent autonomous systems

    From sensors to human spatial concepts: an annotated data set

    A gradient descent rule for spiking neurons emitting multiple spikes

    A supervised learning rule for Spiking Neural Networks (SNNs) is presented that can cope with neurons that spike multiple times. The rule is developed by extending the existing SpikeProp algorithm, which could only be used for one spike per neuron. The problem caused by the discontinuity in the spike process is counteracted with a simple but effective rule, which makes the learning process more efficient. Our learning rule is successfully tested on a classification task of Poisson spike trains. We also apply the algorithm to a temporal version of the XOR problem and show that it is possible to learn this classical problem using only one spiking neuron, making use of a hair-trigger situation.
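
    A minimal sketch of such a rule, under assumed names and a single output neuron: the standard spike-response kernel drives the membrane potential, and each weight is updated through the SpikeProp-style derivative of the output spike time, where contributions from multiple presynaptic spikes simply sum. The denominator floor below is one common guard against the discontinuity; the paper's own rule may differ in detail.

    import numpy as np

    TAU = 7.0  # membrane time constant in ms (an assumed value)

    def eps(s):
        """Spike-response kernel: (s/tau) * exp(1 - s/tau) for s > 0, else 0."""
        s = np.asarray(s, dtype=float)
        return np.where(s > 0, (s / TAU) * np.exp(1.0 - s / TAU), 0.0)

    def d_eps(s):
        """Time derivative of the kernel, needed for du/dt at the spike."""
        s = np.asarray(s, dtype=float)
        return np.where(s > 0, (1.0 / TAU - s / TAU**2) * np.exp(1.0 - s / TAU), 0.0)

    def spikeprop_step(w, pre_spikes, t_out, t_target, lr=0.01):
        """One update for a single output spike time t_out.
        w: (n_pre,) weights; pre_spikes: list of spike-time arrays per input."""
        dudt = sum(w[i] * d_eps(t_out - s).sum() for i, s in enumerate(pre_spikes))
        if abs(dudt) < 0.1:
            dudt = 0.1  # floor the denominator near the discontinuity
        err = t_out - t_target  # gradient of the squared timing error
        for i, s in enumerate(pre_spikes):
            # dt_out/dw_i = -sum_f eps(t_out - t_i^f) / (du/dt); multiple
            # presynaptic spikes contribute additively through the kernel.
            w[i] -= lr * err * (-eps(t_out - s).sum() / dudt)
        return w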

    From images to rooms

    In this paper we start from a set of images obtained by a robot moving around in an environment. We present a method to automatically group the images into groups that correspond to convex subspaces in the environment, which are related to the human concept of rooms. Pairwise similarities between the images are computed using local features extracted from the images and geometric constraints. Together with the proposed similarity measure, the images can be seen as a graph, or in a way, a base-level dense topological map. From this low-level representation the images are grouped using a graph-clustering technique which effectively finds convex spaces in the environment. The method is tested and evaluated on challenging data sets acquired in real home environments. The resulting higher-level maps are compared with the maps humans made based on the same data.
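
    The clustering step admits a compact sketch: given a symmetric matrix of pairwise image similarities, a normalized spectral cut splits the image graph along its weakest connections, which for a tour through a home tends to separate convex spaces. This NumPy sketch uses a Fiedler-vector bipartition as a stand-in for the paper's graph-clustering technique and abstracts away the similarity computation from local features and geometric constraints.

    import numpy as np

    def spectral_bipartition(S):
        """Split an image graph in two via the Fiedler vector of the
        normalized graph Laplacian. S: (n, n) symmetric similarities."""
        d = S.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
        L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
        _, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
        fiedler = eigvecs[:, 1]            # second-smallest eigenvector
        return fiedler >= 0                # boolean room assignment per image

    Applying the cut recursively until the remaining clusters are cohesive yields a room-level map on top of the dense topological one.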