
    Robust Estimation of Motion Parameters and Scene Geometry : Minimal Solvers and Convexification of Regularisers for Low-Rank Approximation

    In the dawning age of autonomous driving, accurate and robust tracking of vehicles is a quintessential part. This is inextricably linked with the problem of Simultaneous Localisation and Mapping (SLAM), in which one tries to determine the position of a vehicle relative to its surroundings without prior knowledge of them. The more you know about the object you wish to track, whether through sensors or mechanical construction, the more likely you are to get good positioning estimates. In the first part of this thesis, we explore new ways of improving positioning for vehicles travelling on a planar surface. This is done in several ways: first, we generalise work done for monocular vision to include two cameras; second, we propose ways of speeding up estimation with polynomial solvers; and third, we develop an auto-calibration method that copes with radially distorted images without requiring pre-calibration procedures.
    We continue by investigating the case of constrained motion, this time using auxiliary data from inertial measurement units (IMUs) to improve positioning of unmanned aerial vehicles (UAVs). The proposed methods improve the state of the art for partially calibrated cases (with unknown focal length) in indoor navigation. Furthermore, we propose the first real-time-compatible minimal solver for simultaneous estimation of the radial distortion profile, focal length, and motion parameters while utilising the IMU data.
    In the third and final part of this thesis, we develop a bilinear framework for low-rank regularisation, with global optimality guarantees under certain conditions. We also show equivalence between the linear and the bilinear frameworks, in the sense that the objectives are equal. This enables users of the alternating direction method of multipliers (ADMM), or other subgradient or splitting methods, to transition to the new framework while enjoying the benefits of second-order methods. Furthermore, we propose a novel regulariser that fuses two popular methods, combining the best of both worlds by encouraging bias reduction while enforcing low-rank solutions.
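The convexification of low-rank regularisers referred to above is classically built on the nuclear norm, whose proximal operator is singular value thresholding. Below is a minimal sketch of that operator in plain NumPy; it is an illustration of the standard convex surrogate, not the thesis's bilinear framework, and the names `svt` and `tau` are ours:

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: the proximal operator of
    tau * (nuclear norm), the standard convex surrogate for rank."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Shrinking a diagonal matrix shows the effect on singular values:
# [3, 1, 0.1] -> [2.5, 0.5, 0] with tau = 0.5, so the rank drops to 2.
Y = svt(np.diag([3.0, 1.0, 0.1]), tau=0.5)
print(np.round(np.linalg.svd(Y, compute_uv=False), 3))
```

Splitting methods such as ADMM apply this operator once per iteration; the bilinear formulation in the thesis targets the same objectives while admitting second-order solvers.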

    A data-fusion approach to motion-stereo

    This paper introduces a novel method for performing motion-stereo, based on dynamic integration of depth (or depth-proxy) measures obtained by pairwise stereo matching of video frames. The focus is on the data-fusion issue raised by the motion-stereo approach, which is solved within a Kalman filtering framework. Integration occurs along both the temporal and the spatial dimension, so that the final measure for a pixel results from combining measures of the same pixel over time with those of its neighbors. The method has been validated on both synthetic and natural images, using the simplest stereo matching strategy and a range of different confidence measures, and has been compared to baseline and optimal strategies.
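The temporal part of such per-pixel integration can be pictured as a scalar Kalman filter with an identity state model. The sketch below is a hedged reconstruction, not the paper's exact filter: spatial integration over neighbors is omitted, and the names `kalman_fuse`, `z`, `r` are ours:

```python
def kalman_fuse(depth, var, z, r):
    """Fuse a new depth measurement z (variance r) into a running
    per-pixel estimate (depth, var): scalar Kalman update with an
    identity state-transition model."""
    k = var / (var + r)              # Kalman gain
    depth_new = depth + k * (z - depth)
    var_new = (1.0 - k) * var
    return depth_new, var_new

# Fusing two equally uncertain measurements averages them and
# halves the variance.
d, v = kalman_fuse(2.0, 1.0, 4.0, 1.0)
print(d, v)  # 3.0 0.5
```

A low-confidence measurement (large r) barely moves the estimate, which is how per-pixel confidence measures naturally enter the fusion.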

    RCDN -- Robust X-Corner Detection Algorithm based on Advanced CNN Model

    Accurate detection and localization of X-corners on both planar and non-planar patterns is a core step in robotics and machine vision. However, previous works could not strike a good balance between accuracy and robustness, both of which are crucial criteria for evaluating a detector's performance. To address this problem, we present a novel detection algorithm that maintains high sub-pixel precision on inputs under multiple kinds of interference, such as lens distortion, extreme poses, and noise. The whole algorithm, adopting a coarse-to-fine strategy, contains an X-corner detection network and three post-processing techniques to distinguish correct corner candidates, as well as a mixed sub-pixel refinement technique and an improved region-growth strategy to automatically recover checkerboard patterns that are partially visible or occluded. Evaluations on real and synthetic images indicate that the presented algorithm achieves higher detection rate, sub-pixel accuracy, and robustness than other commonly used methods. Finally, experiments on camera calibration and pose estimation verify that it also obtains smaller re-projection error in quantitative comparisons to the state of the art.
    Comment: 15 pages, 8 figures, and 4 tables. Unpublished further research and experiments on the checkerboard corner detection network CCDN (arXiv:2302.05097) and application exploration for robust camera calibration (https://ieeexplore.ieee.org/abstract/document/9428389).
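One common ingredient of sub-pixel refinement, shown here purely as an illustration (the abstract does not specify the paper's "mixed" technique), is fitting a parabola to the corner-response surface around the integer detection and taking its peak:

```python
import numpy as np

def subpixel_peak(R, i, j):
    """Refine an integer peak (i, j) of a response map R to sub-pixel
    accuracy with a separable 1D parabola fit along each axis."""
    dx = 0.5 * (R[i, j + 1] - R[i, j - 1])
    dxx = R[i, j + 1] - 2.0 * R[i, j] + R[i, j - 1]
    dy = 0.5 * (R[i + 1, j] - R[i - 1, j])
    dyy = R[i + 1, j] - 2.0 * R[i, j] + R[i - 1, j]
    return float(i - dy / dyy), float(j - dx / dxx)

# A synthetic quadratic response peaked at row 2.0, column 2.25
# is recovered exactly from the integer maximum at (2, 2).
ys, xs = np.mgrid[0:5, 0:5]
R = -((xs - 2.25) ** 2) - (ys - 2.0) ** 2
print(subpixel_peak(R, 2, 2))  # (2.0, 2.25)
```

In a full pipeline this step would run after the detection network proposes integer candidates and before the checkerboard structure is grown.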

    Asymmetric Transfer of Task Dependent Perceptual Learning in Visual Motion Processing

    The effects of perceptual learning (PL) on the sensory representation are not fully understood, especially for higher-level visual mechanisms more directly relevant to behavior. The objective of this research is to elucidate the mechanisms that mediate task-dependent learning by determining where and how task-dependent learning occurs in the later stages of visual motion processing. Eighteen subjects were trained to perform a dual-2TAFC visual discrimination task in which they were required to simultaneously detect changes in the direction of moving dots (task 1) and the proportion of red dots (task 2) shown in two stimulus apertures presented in either the left or right visual field. Subjects trained on the direction discrimination task for one of two types of motion: global radial motions (expansion and contraction) presented across stimulus apertures (global task), or an equivalent (local) motion stimulus formed by rotating the direction of motion in one aperture by 180°. In task 1, subjects were required to indicate whether the directions of motion in the second stimulus interval were rotated clockwise or counter-clockwise relative to the first stimulus interval. In task 2, designed to control for the spatial allocation of attention, subjects were required to indicate which stimulus interval contained a larger proportion of red dots across stimulus apertures. Sixteen of the eighteen subjects showed significant improvement on the trained tasks across sessions (p < 0.05). In subjects trained with radial motions, performance improvements transferred to the radial motions presented in the untrained visual field, and to the equivalent local motion stimuli and untrained circular motions presented in the trained visual field. For subjects trained with local motion stimuli, learning was restricted to the trained local motion directions and their global motion equivalents presented in the trained visual field.
    These results suggest that perceptual learning of global and local motions is not symmetric, differentially impacting processing across multiple stages of visual processing whose activities are correlated. This pattern of learning is not fully consistent with a reverse-hierarchy theory or bottom-up model of learning, suggesting instead a mechanism whereby learning occurs at the stage of visual processing that is most discriminative for the given task.

    Statistical Approaches to Inferring Object Shape from Single Images

    Depth inference is a fundamental problem of computer vision with a broad range of potential applications. Monocular depth inference techniques, particularly shape from shading, date back to as early as the 1940s, when they were first used to study the shape of the lunar surface. Since then there has been ample research to develop depth inference algorithms using monocular cues. Most of these are based on physical models of image formation and rely on a number of simplifying assumptions that do not hold for real-world and natural imagery. Very few make use of the rich statistical information contained in real-world images and their 3D information, though there have been a few notable exceptions. The study of the statistics of natural scenes has concentrated on outdoor scenes, which are cluttered. The statistics of scenes of single objects have been less studied, yet such scenes are an essential part of daily human interaction with the environment. Inferring the shape of single objects is a very important computer vision problem, which has captured the interest of many researchers over the past few decades and has applications in object recognition, robotic grasping, fault detection, and Content-Based Image Retrieval (CBIR). This thesis focuses on studying the statistical properties of single objects and their range images, which can benefit shape inference techniques. I acquired two databases: the Single Object Range and HDR (SORH) database and the Eton Myers Database of single objects, including laser-acquired depth, binocular stereo, photometric stereo, and High Dynamic Range (HDR) photography. I took a data-driven approach, studied the statistics of color and range images of real scenes of single objects along with whole 3D objects, and uncovered some interesting trends in the data. The fractal structure of natural images was previously well known and thought to be a universal property.
    However, my research showed that the fractal structure of single objects and surfaces is governed by a wholly different set of rules. Classical computer vision problems such as binocular and multi-view stereo, photometric stereo, shape from shading, and structure from motion all rely on accurate and complete models of which 3D shapes and textures are plausible in nature, to avoid producing unlikely outputs. Bayesian approaches are common for these problems, and the findings on the statistics of the shape of single objects from this work and others will hopefully both inform new, more accurate Bayesian priors on shape and enable more efficient probabilistic inference procedures.
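The "fractal structure" of natural images is usually quantified by the power-law fall-off of the radially averaged power spectrum, P(f) proportional to 1/f^alpha with alpha near 2 for typical natural scenes. A hedged sketch of the standard slope estimate in plain NumPy (this is the textbook measurement, not the thesis's own analysis code):

```python
import numpy as np

def spectral_slope(img):
    """Estimate alpha in P(f) ~ 1/f**alpha from the radially
    averaged 2D power spectrum, via a log-log least-squares fit."""
    h, w = img.shape
    P = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2).astype(int)
    mask = (r > 0) & (r < min(h, w) // 2)      # drop DC and corner bins
    sums = np.bincount(r[mask], weights=P[mask])
    counts = np.bincount(r[mask])
    good = counts > 0
    radial = sums[good] / counts[good]          # mean power per radius
    freqs = np.arange(len(counts))[good]
    slope, _ = np.polyfit(np.log(freqs), np.log(radial + 1e-20), 1)
    return -slope
```

Applied to synthetic 1/f-amplitude noise (power proportional to 1/f^2) this returns a value near 2; the thesis's finding is precisely that single objects and surfaces deviate from such a universal slope.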

    M-FUSE: Multi-frame Fusion for Scene Flow Estimation

    Recently, neural networks for scene flow estimation have shown impressive results on automotive data such as the KITTI benchmark. However, despite using sophisticated rigidity assumptions and parametrizations, such networks are typically limited to two frame pairs, which does not allow them to exploit temporal information. In our paper we address this shortcoming by proposing a novel multi-frame approach that considers an additional preceding stereo pair. To this end, we proceed in two steps: first, building upon the recent RAFT-3D approach, we develop an advanced two-frame baseline by incorporating an improved stereo method. Second, and even more importantly, exploiting the specific modeling concepts of RAFT-3D, we propose a U-Net-like architecture that performs a fusion of forward and backward flow estimates and hence allows temporal information to be integrated on demand. Experiments on the KITTI benchmark not only show that the advantages of the improved baseline and the temporal fusion approach complement each other; they also demonstrate that the computed scene flow is highly accurate. More precisely, our approach ranks second overall and first for the even more challenging foreground objects, in total outperforming the original RAFT-3D method by more than 16%. Code is available at https://github.com/cv-stuttgart/M-FUSE.
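At its simplest, fusing a forward and a backward flow estimate is a per-pixel confidence-weighted average; the learned U-Net in M-FUSE replaces any such hand-set weighting, so the sketch below (with made-up names `fuse_flows`, `w_fwd`, `w_bwd`) is only a conceptual stand-in:

```python
import numpy as np

def fuse_flows(flow_fwd, flow_bwd, w_fwd, w_bwd):
    """Per-pixel convex combination of two (H, W, 2) flow fields,
    weighted by (H, W) non-negative confidence maps."""
    w = (w_fwd / (w_fwd + w_bwd + 1e-12))[..., None]
    return w * flow_fwd + (1.0 - w) * flow_bwd

# With equal confidences the fused flow is the plain average.
a = np.full((2, 2, 2), 1.0)
b = np.full((2, 2, 2), 3.0)
ones = np.ones((2, 2))
print(fuse_flows(a, b, ones, ones)[0, 0])  # [2. 2.]
```

The point of learning the weighting end-to-end is that the network can fall back to one estimate wherever the other is unreliable, e.g. at occlusions or out-of-frame motion.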

    The interplay between movement and perception: how interaction can influence sensorimotor performance and neuromotor recovery

    Movement and perception interact continuously in daily activities. Motor output changes the outside world and affects perceptual representations. Similarly, perception has consequences for movement. Nevertheless, how movement and perception influence each other and share information is still an open question. Mappings from movement to perceptual outcome, and vice versa, change continuously throughout life. For example, a cerebrovascular accident (stroke) elicits in the nervous system a complex series of reorganization processes at various levels and on different temporal scales. Functional recovery after a stroke seems to be mediated by use-dependent reorganization of the preserved neural circuitry. The goal of this thesis is to discuss how interaction with the environment can influence the progress of both sensorimotor performance and neuromotor recovery. I investigate how individuals develop implicit knowledge of the ways motor outputs regularly correlate with changes in sensory inputs, by interacting with the environment and experiencing the perceptual consequences of self-generated movements. Further, I apply this paradigm to model exercise-based neurorehabilitation in stroke survivors, which aims at gradually improving both perceptual and motor performance through repeated exercise. The scientific findings of this thesis indicate that motor learning resolves visual perceptual uncertainty and contributes to persistent changes in visual and somatosensory perception. Moreover, computational neurorehabilitation may help to identify the underlying mechanisms of both motor and perceptual recovery, and may lead to more personalized therapies.
    XXXII CICLO - BIOINGEGNERIA E ROBOTICA - BIOENGINEERING AND ROBOTICS - Bioengineering and bioelectronics. Sedda, Giuli

    From light rays to 3D models
