75 research outputs found

    A Multicamera System for Gesture Tracking With Three Dimensional Hand Pose Estimation

    The goal of any visual tracking system is to successfully detect and then follow an object of interest through a sequence of images. The difficulty of tracking an object depends on its dynamics, motion, and characteristics, as well as on the environment. For example, tracking an articulated, self-occluding object such as a signing hand has proven to be a very difficult problem. The focus of this work is on tracking and pose estimation with applications to hand gesture interpretation. An approach is presented that integrates the simplicity of a region tracker with single-hand 3D pose estimation methods. Additionally, this work delves into the pose estimation problem, both by analyzing hand templates composed of their morphological skeleton and by addressing the skeleton's inherent instability. Ligature points along the skeleton are flagged in order to determine their effect on skeletal instabilities. Tested on real data, the analysis finds that flagging ligature points increases the match strength of high-similarity image-template pairs by about 6%. The effectiveness of this approach is further demonstrated in a real-time multicamera hand tracking system that tracks hand gestures through three-dimensional space and estimates the three-dimensional pose of the hand.
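
    As a concrete illustration of the skeleton analysis described above, here is a minimal sketch using scikit-image's medial axis transform. The ligature test (skeleton pixels whose local radius changes sharply) is a heuristic assumed for this example, not the thesis's exact flagging criterion.

```python
# Sketch: skeletonize a binary hand template and flag candidate ligature points.
import numpy as np
from skimage.morphology import medial_axis

def skeleton_with_ligature_flags(mask, grad_thresh=1.5):
    """mask: 2D boolean array of the hand template."""
    # Medial axis plus the distance transform (local half-width at each pixel).
    skel, radius = medial_axis(mask, return_distance=True)
    r = np.where(skel, radius, 0.0)
    gy, gx = np.gradient(r)
    # Ligature candidates: skeleton pixels where the local width varies sharply,
    # i.e. the segments most prone to skeletal instability.
    ligature = skel & (np.hypot(gx, gy) > grad_thresh)
    return skel, ligature
```

    A template matcher could then down-weight the flagged pixels when scoring image-template pairs.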

    Hypertemporal Imaging Capability of UAS Improves Photogrammetric Tree Canopy Models

    Small uncrewed aerial systems (UASs) generate imagery that can provide detailed information regarding condition and change, provided the products are reproducible through time. Densified point clouds form the basic information for digital surface models and orthorectified mosaics, so variable dense point reconstruction will introduce uncertainty. Eucalyptus trees typically have sparse, discontinuous canopies with pendulous leaves that present a difficult target for photogrammetry software. We examine how spectral band, season, solar azimuth and elevation, and some processing settings impact the completeness and reproducibility of dense point clouds for shrub swamp and Eucalyptus forest canopy. At the study site near solar noon, selecting the near-infrared (NIR) camera increased the projected tree canopy fourfold, and dense point features more than 2 m above ground increased sixfold, compared to the red spectral band. NIR imagery improved projected and total dense features two- and threefold, respectively, compared to default green-band imagery. The lowest solar elevation captured (25°) consistently improved canopy feature reconstruction in all spectral bands. Although low solar elevations are typically avoided for radiometric reasons, we demonstrate that these conditions improve the detection and reconstruction of complex tree canopy features in natural Eucalyptus forests. Combining imagery sets captured at different solar elevations improved the reproducibility of dense point clouds between seasons. Combining solar-noon and low-solar-elevation NIR imagery increased the total number of dense point cloud features reconstructed by almost 10 million points (20%). It is possible to use agricultural multispectral camera rigs to reconstruct Eucalyptus tree canopy and shrub swamp by combining imagery and selecting appropriate spectral bands for processing.

    Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation

    How do computers and intelligent agents view the world around them? Feature extraction and representation constitutes one of the basic building blocks towards answering this question. Traditionally, this has been done with carefully engineered hand-crafted techniques such as HOG, SIFT or ORB. However, there is no "one size fits all" approach that satisfies all requirements. In recent years, the rising popularity of deep learning has resulted in a myriad of end-to-end solutions to many computer vision problems. These approaches, while successful, tend to lack scalability and can't easily exploit information learned by other systems. Instead, we propose SAND features, a dedicated deep learning solution to feature extraction capable of providing hierarchical context information. This is achieved by employing sparse relative labels indicating relationships of similarity/dissimilarity between image locations. The nature of these labels results in an almost infinite set of dissimilar examples to choose from. We demonstrate how the selection of negative examples during training can be used to modify the feature space and vary its properties. To demonstrate the generality of this approach, we apply the proposed features to a multitude of tasks, each requiring different properties. These include disparity estimation, semantic segmentation, self-localisation and SLAM. In all cases, we show that incorporating SAND features yields results better than or comparable to the baseline, whilst requiring little to no additional training. Code can be found at: https://github.com/jspenmar/SAND_features (CVPR 2019).
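
    To make the relative-label training concrete, here is a hedged PyTorch sketch of a pixel-wise contrastive loss over sparse similar/dissimilar location pairs; the function name, margin value, and pair interface are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch: contrastive loss over sparse relative labels between two feature maps.
import torch
import torch.nn.functional as F

def relative_label_loss(feat_a, feat_b, pos, neg, margin=0.5):
    """feat_a, feat_b: (C, H, W) dense feature maps from two views.
    pos / neg: (N, 4) long tensors of (ya, xa, yb, xb) pixel pairs labelled
    similar / dissimilar. Choosing which negatives to sample is what shapes
    the properties of the learned feature space."""
    def grab(feat, ys, xs):
        return F.normalize(feat[:, ys, xs], dim=0)  # (C, N) unit columns
    pos_d = (grab(feat_a, pos[:, 0], pos[:, 1]) -
             grab(feat_b, pos[:, 2], pos[:, 3])).pow(2).sum(0)
    neg_d = (grab(feat_a, neg[:, 0], neg[:, 1]) -
             grab(feat_b, neg[:, 2], neg[:, 3])).pow(2).sum(0)
    # Pull similar pairs together; push dissimilar pairs beyond the margin.
    return pos_d.mean() + F.relu(margin - neg_d).mean()
```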

    Relative Pose Estimation Using Non-overlapping Multicamera Clusters

    This thesis considers the Simultaneous Localization and Mapping (SLAM) problem using a set of perspective cameras arranged such that there is no overlap in their fields of view. With the known and fixed extrinsic calibration of each camera within the cluster, a novel real-time pose estimation system is presented that is able to accurately track the motion of a camera cluster relative to an unknown target object or environment and concurrently generate a model of the structure, using only image-space measurements. A new parameterization for point feature position using a spherical coordinate update is presented that isolates the system parameters dependent on global scale, allowing the shape parameters of the system to converge despite the scale parameters remaining uncertain. Furthermore, a flexible initialization scheme is proposed which allows the optimization to converge accurately using only the measurements from the cameras at the first time step. An analysis is presented identifying the configurations of cluster motions and target structure geometry for which the optimization solution becomes degenerate and the global scale is ambiguous. Results are presented that not only confirm the previously known critical motions for a two-camera cluster, but also provide a complete description of the degeneracies related to the point feature constellations. The proposed algorithms are implemented and verified in experiments with a camera cluster constructed from multiple perspective cameras mounted on a quadrotor vehicle and augmented with tracking markers to collect high-precision ground-truth motion measurements from an optical indoor positioning system. The accuracy and performance of the proposed pose estimation system are confirmed for various motion profiles in both indoor and challenging outdoor environments.
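
    The scale-isolating idea can be sketched as follows: each landmark is an anchor plus a bearing scaled by a depth, so the angular "shape" parameters are independent of global scale. This NumPy fragment only illustrates that separation under assumed conventions; it does not reproduce the thesis's exact update.

```python
import numpy as np

def landmark_from_spherical(anchor, theta, phi, inv_depth):
    """Landmark = anchor + bearing(theta, phi) / inv_depth.
    theta, phi are scale-free shape parameters that can converge even while
    inv_depth (which carries the global scale) remains uncertain."""
    bearing = np.array([np.cos(phi) * np.cos(theta),
                        np.cos(phi) * np.sin(theta),
                        np.sin(phi)])
    return np.asarray(anchor, dtype=float) + bearing / inv_depth
```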

    A Fast and Robust Extrinsic Calibration for RGB-D Camera Networks

    From object tracking to 3D reconstruction, RGB-Depth (RGB-D) camera networks play an increasingly important role in many vision and graphics applications. Practical applications often use sparsely placed cameras to maximize visibility, while using as few cameras as possible to minimize cost. In general, it is challenging to calibrate sparse camera networks due to the lack of shared scene features across different camera views. In this paper, we propose a novel algorithm that can accurately and rapidly calibrate the geometric relationships across an arbitrary number of RGB-D cameras on a network. Our work has a number of novel features. First, to cope with the wide separation between different cameras, we establish view correspondences by using a spherical calibration object. We show that this approach outperforms other techniques based on planar calibration objects. Second, instead of modeling camera extrinsic calibration using a rigid transformation, which is optimal only for pinhole cameras, we systematically test different view transformation functions, including rigid transformation, polynomial transformation and manifold regression, to determine the most robust mapping that generalizes well to unseen data. Third, we reformulate the celebrated bundle adjustment procedure to minimize the global 3D reprojection error so as to fine-tune the initial estimates. Finally, our scalable client-server architecture is computationally efficient: the calibration of a five-camera system, including data capture, can be done in minutes using only commodity PCs. Our proposed framework is compared with other state-of-the-art systems using both quantitative measurements and visual alignment results of the merged point clouds.
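
    For the rigid-transformation variant of the view mapping, the pairwise alignment can be sketched as a closed-form Kabsch fit over corresponding sphere-centre estimates; the polynomial and manifold-regression alternatives the paper evaluates would replace this step. A minimal sketch, assuming the centre correspondences are already extracted:

```python
import numpy as np

def rigid_from_sphere_centers(P, Q):
    """Least-squares R, t such that q_i ~ R @ p_i + t (Kabsch, no scale).
    P, Q: (N, 3) sphere-centre positions seen by cameras A and B."""
    mp, mq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - mp).T @ (Q - mq)                 # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mq - R @ mp
    return R, t
```

    The bundle-adjustment stage described above would then refine these pairwise estimates by minimizing the global 3D reprojection error.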

    3D Object Reconstruction from Multiple Views: A Compressive Sensing Framework

    Automated, lifelike representation of natural objects has been a cherished goal for humanity. Towards achieving this goal, we propose a novel framework for the reconstruction of 3D objects. We lay the foundation for the representation of the signal by developing a theory that deals with different sampling techniques (both uniform and non-uniform) for signals in Euclidean space, the finite element method (interpolative basis) for signals in the topological domain, and finally compressive sampling, with which we can capture and represent compressible signals by exploiting their sparsity.
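
    As a minimal illustration of the compressive sampling component, here is an Orthogonal Matching Pursuit sketch recovering a k-sparse signal from underdetermined measurements y = A x; it stands in for whatever recovery solver the thesis actually employs.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: estimate a k-sparse x from y = A @ x.
    A: (m, n) sensing matrix with m << n."""
    m, n = A.shape
    residual, support = y.astype(float).copy(), []
    x = np.zeros(n)
    for _ in range(k):
        # Greedily pick the atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        sub = A[:, support]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)  # refit on the support
        residual = y - sub @ coef
    x[support] = coef
    return x
```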

    Plenoptic Signal Processing for Robust Vision in Field Robotics

    This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter that focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications.
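
    The closed-form flavour of "plenoptic flow" can be illustrated in its 2D analogue: motion falls out of a single linear least-squares solve over first-order derivatives, with no iteration. The 4D light-field version in the thesis adds more derivative terms but keeps the same structure; this sketch assumes a pure-translation flow model for brevity.

```python
import numpy as np

def flow_from_derivatives(Ix, Iy, It):
    """Solve min over (u, v) of sum((Ix*u + Iy*v + It)**2) in closed form.
    Ix, Iy, It: first-order spatial and temporal image derivatives."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)  # one linear solve, no iteration
    return u, v
```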

    Mobile Robot Manipulator System Design for Localization and Mapping in Cluttered Environments

    In this thesis, a compact mobile robot has been developed to build real-time 3D maps of hazards and cluttered environments inside damaged buildings for rescue tasks, using visual Simultaneous Localization And Mapping (SLAM) algorithms. In order to maximize the survey area in such environments, this mobile robot is designed with four omni-wheels and equipped with a 6-Degree-of-Freedom (DOF) robotic arm carrying a stereo camera mounted on its end-effector. The aim of using this mobile articulated robotic system is to monitor different types of regions within the area of interest, ranging from wide open spaces to smaller and irregular regions behind narrow gaps. In the first part of the thesis, the robot system design is presented in detail, including the kinematic systems of the omni-wheeled mobile platform and the 6-DOF robotic arm, estimation of the biases in the parameters of these kinematic systems, and the sensors and the calibration of their parameters. These parameters are important for the sensor fusion utilized in the second part of the thesis, where two operation modes are proposed to retain the camera pose when the visual SLAM algorithms fail due to the variety of region types. In the second part, an integrated sensor data fusion, odometry and SLAM scheme is developed, where the camera poses are estimated using the forward kinematic equations of the robotic arm and fused with the visual SLAM and odometry algorithms. A modified wavefront algorithm with reduced computational complexity is used to find the shortest path to reach the identified goal points. Finally, a dynamic control scheme is developed for path tracking and motion control of the mobile platform and the robot arm, with sub-systems in the form of PD controllers and extended Kalman filters. The overall system design is physically implemented on a prototype integrated mobile robot platform and successfully tested in real time.
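
    The fallback camera-pose estimate from the arm's forward kinematics can be sketched with standard Denavit-Hartenberg chaining; the DH table, base pose, and hand-eye transform below are assumed inputs for a generic 6-DOF arm, not values from the thesis.

```python
import numpy as np

def dh_transform(a, alpha, d, theta):
    """Standard Denavit-Hartenberg link transform (4x4 homogeneous matrix)."""
    ct, st, ca, sa = np.cos(theta), np.sin(theta), np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def camera_pose_from_joints(dh_table, joint_angles, T_base, T_tool):
    """Predict the stereo camera pose from joint angles alone, as a stand-in
    when visual SLAM loses track.
    dh_table: per-joint (a, alpha, d) constants (robot-specific, assumed known);
    T_base: platform pose from odometry; T_tool: fixed hand-eye calibration."""
    T = np.asarray(T_base, dtype=float)
    for (a, alpha, d), theta in zip(dh_table, joint_angles):
        T = T @ dh_transform(a, alpha, d, theta)
    return T @ T_tool
```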