169 research outputs found

    Phase-differencing in stereo vision: solving the localisation problem

    Get PDF
    Complex Gabor filters with phases in quadrature are often used to model even- and odd-symmetric simple cells in the primary visual cortex. In stereo vision, the phase difference between the responses of the left and right views can be used to construct a disparity or depth map. Various constraints can be applied in order to construct smooth maps, but this leads to very imprecise depth transitions. In this theoretical paper we show, by using lines and edges as image primitives, the origin of the localisation problem. We also argue that disparity should be attributed to lines and edges, rather than trying to construct a 3D surface map in cortical area V1. We derive allowable translation ranges which yield correct disparity estimates, both for left-view centered vision and for cyclopean vision

    Optimal Image-Aided Inertial Navigation

    Get PDF
    The utilization of cameras in integrated navigation systems is among the most recent scientific research and high-tech industry development. The research is motivated by the requirement of calibrating off-the-shelf cameras and the fusion of imaging and inertial sensors in poor GNSS environments. The three major contributions of this dissertation are The development of a structureless camera auto-calibration and system calibration algorithm for a GNSS, IMU and stereo camera system. The auto-calibration bundle adjustment utilizes the scale restraint equation, which is free of object coordinates. The number of parameters to be estimated is significantly reduced in comparison with the ones in a self-calibrating bundle adjustment based on the collinearity equations. Therefore, the proposed method is computationally more efficient. The development of a loosely-coupled visual odometry aided inertial navigation algorithm. The fusion of the two sensors is usually performed using a Kalman filter. The pose changes are pairwise time-correlated, i.e. the measurement noise vector at the current epoch is only correlated with the one from the previous epoch. Time-correlated errors are usually modelled by a shaping filter. The shaping filter developed in this dissertation uses Cholesky factors as coefficients derived from the variance and covariance matrices of the measurement noise vectors. Test results with showed that the proposed algorithm performs better than the existing ones and provides more realistic covariance estimates. The development of a tightly-coupled stereo multi-frame aided inertial navigation algorithm for reducing position and orientation drifts. Usually, the image aiding based on the visual odometry uses the tracked features only from a pair of the consecutive image frames. The proposed method integrates the features tracked from multiple overlapped image frames for reducing the position and orientation drifts. The measurement equation is derived from SLAM measurement equation system where the landmark positions in SLAM are algebraically by time-differencing. However, the derived measurements are time-correlated. Through a sequential de-correlation, the Kalman filter measurement update can be performed sequentially and optimally. The main advantages of the proposed algorithm are the reduction of computational requirements when compared to SLAM and a seamless integration into an existing GNSS aided-IMU system

    Luminance, colour, viewpoint and border enhanced disparity energy model

    Get PDF
    The visual cortex is able to extract disparity information through the use of binocular cells. This process is reflected by the Disparity Energy Model, which describes the role and functioning of simple and complex binocular neuron populations, and how they are able to extract disparity. This model uses explicit cell parameters to mathematically determine preferred cell disparities, like spatial frequencies, orientations, binocular phases and receptive field positions. However, the brain cannot access such explicit cell parameters; it must rely on cell responses. In this article, we implemented a trained binocular neuronal population, which encodes disparity information implicitly. This allows the population to learn how to decode disparities, in a similar way to how our visual system could have developed this ability during evolution. At the same time, responses of monocular simple and complex cells can also encode line and edge information, which is useful for refining disparities at object borders. The brain should then be able, starting from a low-level disparity draft, to integrate all information, including colour and viewpoint perspective, in order to propagate better estimates to higher cortical areas.Portuguese Foundation for Science and Technology (FCT); LARSyS FCT [UID/EEA/50009/2013]; EU project NeuroDynamics [FP7-ICT-2009-6, PN: 270247]; FCT project SparseCoding [EXPL/EEI-SII/1982/2013]; FCT PhD grant [SFRH-BD-44941-2008

    Tracking interacting targets in multi-modal sensors

    Get PDF
    PhDObject tracking is one of the fundamental tasks in various applications such as surveillance, sports, video conferencing and activity recognition. Factors such as occlusions, illumination changes and limited field of observance of the sensor make tracking a challenging task. To overcome these challenges the focus of this thesis is on using multiple modalities such as audio and video for multi-target, multi-modal tracking. Particularly, this thesis presents contributions to four related research topics, namely, pre-processing of input signals to reduce noise, multi-modal tracking, simultaneous detection and tracking, and interaction recognition. To improve the performance of detection algorithms, especially in the presence of noise, this thesis investigate filtering of the input data through spatio-temporal feature analysis as well as through frequency band analysis. The pre-processed data from multiple modalities is then fused within Particle filtering (PF). To further minimise the discrepancy between the real and the estimated positions, we propose a strategy that associates the hypotheses and the measurements with a real target, using a Weighted Probabilistic Data Association (WPDA). Since the filtering involved in the detection process reduces the available information and is inapplicable on low signal-to-noise ratio data, we investigate simultaneous detection and tracking approaches and propose a multi-target track-beforedetect Particle filtering (MT-TBD-PF). The proposed MT-TBD-PF algorithm bypasses the detection step and performs tracking in the raw signal. Finally, we apply the proposed multi-modal tracking to recognise interactions between targets in regions within, as well as outside the cameras’ fields of view. The efficiency of the proposed approaches are demonstrated on large uni-modal, multi-modal and multi-sensor scenarios from real world detections, tracking and event recognition datasets and through participation in evaluation campaigns

    Enhancing 3D Autonomous Navigation Through Obstacle Fields: Homogeneous Localisation and Mapping, with Obstacle-Aware Trajectory Optimisation

    Get PDF
    Small flying robots have numerous potential applications, from quadrotors for search and rescue, infrastructure inspection and package delivery to free-flying satellites for assistance activities inside a space station. To enable these applications, a key challenge is autonomous navigation in 3D, near obstacles on a power, mass and computation constrained platform. This challenge requires a robot to perform localisation, mapping, dynamics-aware trajectory planning and control. The current state-of-the-art uses separate algorithms for each component. Here, the aim is for a more homogeneous approach in the search for improved efficiencies and capabilities. First, an algorithm is described to perform Simultaneous Localisation And Mapping (SLAM) with physical, 3D map representation that can also be used to represent obstacles for trajectory planning: Non-Uniform Rational B-Spline (NURBS) surfaces. Termed NURBSLAM, this algorithm is shown to combine the typically separate tasks of localisation and obstacle mapping. Second, a trajectory optimisation algorithm is presented that produces dynamically-optimal trajectories with direct consideration of obstacles, providing a middle ground between path planners and trajectory smoothers. Called the Admissible Subspace TRajectory Optimiser (ASTRO), the algorithm can produce trajectories that are easier to track than the state-of-the-art for flight near obstacles, as shown in flight tests with quadrotors. For quadrotors to track trajectories, a critical component is the differential flatness transformation that links position and attitude controllers. Existing singularities in this transformation are analysed, solutions are proposed and are then demonstrated in flight tests. Finally, a combined system of NURBSLAM and ASTRO are brought together and tested against the state-of-the-art in a novel simulation environment to prove the concept that a single 3D representation can be used for localisation, mapping, and planning

    Plenoptic Signal Processing for Robust Vision in Field Robotics

    Get PDF
    This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications

    Plenoptic Signal Processing for Robust Vision in Field Robotics

    Get PDF
    This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications

    Continuous Human Activity Tracking over a Large Area with Multiple Kinect Sensors

    Get PDF
    In recent years, researchers had been inquisitive about the use of technology to enhance the healthcare and wellness of patients with dementia. Dementia symptoms are associated with the decline in thinking skills and memory severe enough to reduce a person’s ability to pay attention and perform daily activities. Progression of dementia can be assessed by monitoring the daily activities of the patients. This thesis encompasses continuous localization and behavioral analysis of patient’s motion pattern over a wide area indoor living space using multiple calibrated Kinect sensors connected over the network. The skeleton data from all the sensor is transferred to the host computer via TCP sockets into Unity software where it is integrated into a single world coordinate system using calibration technique. Multiple cameras are placed with some overlap in the field of view for the successful calibration of the cameras and continuous tracking of the patients. Localization and behavioral data are stored in a CSV file for further analysis
    corecore