234 research outputs found

    Calibration of a Stereo Radiation Detection Camera Using Planar Homography

    Get PDF
    This paper proposes a calibration technique of a stereo gamma detection camera. Calibration of the internal and external parameters of a stereo vision camera is a well-known research problem in the computer vision society. However, few or no stereo calibration has been investigated in the radiation measurement research. Since no visual information can be obtained from a stereo radiation camera, it is impossible to use a general stereo calibration algorithm directly. In this paper, we develop a hybrid-type stereo system which is equipped with both radiation and vision cameras. To calibrate the stereo radiation cameras, stereo images of a calibration pattern captured from the vision cameras are transformed in the view of the radiation cameras. The homography transformation is calibrated based on the geometric relationship between visual and radiation camera coordinates. The accuracy of the stereo parameters of the radiation camera is analyzed by distance measurements to both visual light and gamma sources. The experimental results show that the measurement error is about 3%

    Stereo vision augmented Medipix3 for real-time material discrimination

    Get PDF
    Abstract. A secure and sustainable supply of raw materials, such as minerals and metals, is a major challenge for the European Union. Domestic consumption of minerals and metals substantially exceeds production, leading to a high reliance on imports to meet demand, which is driven by the various sectors of the EU’s economy. This thesis was conducted as a part of the Horizon 2020 funded X-MINE project, which aims to address the issue by combining novel sensing technologies to improve resource characterization and economic feasibility of domestic mining operations. The focus of the thesis is within early stage ore extraction, where a real-time sensing platform can be employed in a way to reduce waste at an early stage, decreasing the environmental footprint generated by downstream processing. The advances in CMOS technology have enabled the development of photon counting hybrid pixel detectors, such as Medipix3, for noise-free, high resolution X-ray imaging. Combined with high-speed readout electronics, Medipix3 can be used for imaging and identifying higher density intrusions within ore samples in real-time. In addition, developments in GPU-accelerated stereo vision and related hardware have resulted in high-speed stereo vision camera systems. The combination of stereo vision and Medipix3 is explored in this thesis for real-time material discrimination purposes. In collaboration with multiple European mines, ore samples with reference elemental data were obtained and measured using the combination of stereo vision and Medipix3. In addition, Medipix3 supports a simultaneous dual channel measurement mode, which is used to form a comparison to the performance of the presented system. Measurement principles and physics for both X-ray imaging and stereo vision are discussed. Additionally, calibration methods and algorithms for both measurement modes are presented. Furthermore, methods for data fusion and algorithm performance evaluation are outlined. Results for both measurement modes are presented, along with the relevant measurement physics. Superior performance is obtained with the augmentation of stereo vision, in part due to adverse effects of high-speed imaging on image quality with X-ray imaging. Dual channel approach requires higher data throughput, which results in a reduced integration time in comparison. Additionally, charge sharing effects due to high resolution reduce the spectral measurement capabilities

    Método para el registro automático de imágenes basado en transformaciones proyectivas planas dependientes de las distancias y orientado a imágenes sin características comunes

    Get PDF
    Tesis inédita de la Universidad Complutense de Madrid, Facultad de Ciencias Físicas, Departamento de Arquitectura de Computadores y Automática, leída el 18-12-2015Multisensory data fusion oriented to image-based application improves the accuracy, quality and availability of the data, and consequently, the performance of robotic systems, by means of combining the information of a scene acquired from multiple and different sources into a unified representation of the 3D world scene, which is more enlightening and enriching for the subsequent image processing, improving either the reliability by using the redundant information, or the capability by taking advantage of complementary information. Image registration is one of the most relevant steps in image fusion techniques. This procedure aims the geometrical alignment of two or more images. Normally, this process relies on feature-matching techniques, which is a drawback for combining sensors that are not able to deliver common features. For instance, in the combination of ToF and RGB cameras, the robust feature-matching is not reliable. Typically, the fusion of these two sensors has been addressed from the computation of the cameras calibration parameters for coordinate transformation between them. As a result, a low resolution colour depth map is provided. For improving the resolution of these maps and reducing the loss of colour information, extrapolation techniques are adopted. A crucial issue for computing high quality and accurate dense maps is the presence of noise in the depth measurement from the ToF camera, which is normally reduced by means of sensor calibration and filtering techniques. However, the filtering methods, implemented for the data extrapolation and denoising, usually over-smooth the data, reducing consequently the accuracy of the registration procedure...La fusión multisensorial orientada a aplicaciones de procesamiento de imágenes, conocida como fusión de imágenes, es una técnica que permite mejorar la exactitud, la calidad y la disponibilidad de datos de un entorno tridimensional, que a su vez permite mejorar el rendimiento y la operatividad de sistemas robóticos. Dicha fusión, se consigue mediante la combinación de la información adquirida por múltiples y diversas fuentes de captura de datos, la cual se agrupa del tal forma que se obtiene una mejor representación del entorno 3D, que es mucho más ilustrativa y enriquecedora para la implementación de métodos de procesamiento de imágenes. Con ello se consigue una mejora en la fiabilidad y capacidad del sistema, empleando la información redundante que ha sido adquirida por múltiples sensores. El registro de imágenes es uno de los procedimientos más importantes que componen la fusión de imágenes. El objetivo principal del registro de imágenes es la consecución de la alineación geométrica entre dos o más imágenes. Normalmente, este proceso depende de técnicas de búsqueda de patrones comunes entre imágenes, lo cual puede ser un inconveniente cuando se combinan sensores que no proporcionan datos con características similares. Un ejemplo de ello, es la fusión de cámaras de color de alta resolución (RGB) con cámaras de Tiempo de Vuelo de baja resolución (Time-of-Flight (ToF)), con las cuales no es posible conseguir una detección robusta de patrones comunes entre las imágenes capturadas por ambos sensores. Por lo general, la fusión entre estas cámaras se realiza mediante el cálculo de los parámetros de calibración de las mismas, que permiten realizar la trasformación homogénea entre ellas. Y como resultado de este xii Abstract procedimiento, se obtienen mapas de profundad y de color de baja resolución. Con el objetivo de mejorar la resolución de estos mapas y de evitar la pérdida de información de color, se utilizan diversas técnicas de extrapolación de datos. Un factor crucial a tomar en cuenta para la obtención de mapas de alta calidad y alta exactitud, es la presencia de ruido en las medidas de profundidad obtenidas por las cámaras ToF. Este problema, normalmente se reduce mediante la calibración de estos sensores y con técnicas de filtrado de datos. Sin embargo, las técnicas de filtrado utilizadas, tanto para la interpolación de datos, como para la reducción del ruido, suelen producir el sobre-alisamiento de los datos originales, lo cual reduce la exactitud del registro de imágenes...Sección Deptal. de Arquitectura de Computadores y Automática (Físicas)Fac. de Ciencias FísicasTRUEunpu

    Thermal-Kinect Fusion Scanning System for Bodyshape Inpainting and Estimation under Clothing

    Get PDF
    In today\u27s interactive world 3D body scanning is necessary in the field of making virtual avatar, apparel industry, physical health assessment and so on. 3D scanners that are used in this process are very costly and also requires subject to be nearly naked or wear a special tight fitting cloths. A cost effective 3D body scanning system which can estimate body parameters under clothing will be the best solution in this regard. In our experiment we build such a body scanning system by fusing Kinect depth sensor and a Thermal camera. Kinect can sense the depth of the subject and create a 3D point cloud out of it. Thermal camera can sense the body heat of a person under clothing. Fusing these two sensors\u27 images could produce a thermal mapped 3D point cloud of the subject and from that body parameters could be estimated even under various cloths. Moreover, this fusion system is also a cost effective one. In our experiment, we introduce a new pipeline for working with our fusion scanning system, and estimate and recover body shape under clothing. We capture Thermal-Kinect fusion images of the subjects with different clothing and produce both full and partial 3D point clouds. To recover the missing parts from our low resolution scan we fit parametric human model on our images and perform boolean operations with our scan data. Further, we measure our final 3D point cloud scan to estimate the body parameters and compare it with the ground truth. We achieve a minimum average error rate of 0.75 cm comparing to other approaches

    Computer Vision algorithms performance in architectural heritage multi-image based projects. General overview and operative evaluation: the North Tower of Buñol's Castle (Spain)

    Full text link
    [EN] Multi-image based modeling has proven to be effective providing solutions for surveying and documenting cultural heritage, and in particular architectural heritage. In addition to the issues related with instruments and captation strategy, the operativity of these projects is supported by three bases: Computer Vision (C.V.) algorithms, analytical close-range photogrammetry, and theory of errors. In this work we propose an approach that examines the importance of the first, from two points of view. On one hand, we present a brief overview of its intervention in the different processing stages, both in photomodeling as in photograms stitching projects, thus reviewing the fundaments regarding the two classic branches of architectural photogrammetry. On the other, we present a review of the operational strategy with these algorithms, through a case study that evaluates the results of two software applications, advancing some methodological improvements.Cabanes Ginés, JL.; Bonafé, C. (2021). Computer Vision algorithms performance in architectural heritage multi-image based projects. General overview and operative evaluation: the North Tower of Buñol's Castle (Spain). SCIRES-IT. 11(2):125-138. https://doi.org/10.2423/i22394303v11n2p12512513811

    Structureless Camera Motion Estimation of Unordered Omnidirectional Images

    Get PDF
    This work aims at providing a novel camera motion estimation pipeline from large collections of unordered omnidirectional images. In oder to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, allowing to incorporate any central camera type. For each camera an unprojection lookup is generated from intrinsics, which is called P2S-map (Pixel-to-Sphere-map), mapping pixels to their corresponding positions on the unit sphere. Consequently the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections with less distortion effects as they are known from cartography. Using P2S-maps from camera calibration and world map projection allows to convert omnidirectional camera images to an appropriate world map projection in oder to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion) - which handles unordered image collections - with the efficiency of PGO (Pose Graph Optimization), which is used as back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3d feature locations (structure), which becomes computationally expensive for large-scale scenarios. On the contrary PGO solves for camera poses (motion) from measured transformations between cameras, maintaining optimization managable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from scaled two-view transformations and solved by PGO to obtain camera motion efficiently even for large image collections. Obtained results can be used as input data to provide initial pose estimates for further 3d reconstruction purposes e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors. The entire camera motion estimation pipeline does not need to generate a sparse 3d structure of the captured environment and thus is called SCME (Structureless Camera Motion Estimation).:1 Introduction 1.1 Motivation 1.1.1 Increasing Interest of Image-Based 3D Reconstruction 1.1.2 Underground Environments as Challenging Scenario 1.1.3 Improved Mobile Camera Systems for Full Omnidirectional Imaging 1.2 Issues 1.2.1 Directional versus Omnidirectional Image Acquisition 1.2.2 Structure from Motion versus Visual Simultaneous Localization and Mapping 1.3 Contribution 1.4 Structure of this Work 2 Related Work 2.1 Visual Simultaneous Localization and Mapping 2.1.1 Visual Odometry 2.1.2 Pose Graph Optimization 2.2 Structure from Motion 2.2.1 Bundle Adjustment 2.2.2 Structureless Bundle Adjustment 2.3 Corresponding Issues 2.4 Proposed Reconstruction Pipeline 3 Cameras and Pixel-to-Sphere Mappings with P2S-Maps 3.1 Types 3.2 Models 3.2.1 Unified Camera Model 3.2.2 Polynomal Camera Model 3.2.3 Spherical Camera Model 3.3 P2S-Maps - Mapping onto Unit Sphere via Lookup Table 3.3.1 Lookup Table as Color Image 3.3.2 Lookup Interpolation 3.3.3 Depth Data Conversion 4 Calibration 4.1 Overview of Proposed Calibration Pipeline 4.2 Target Detection 4.3 Intrinsic Calibration 4.3.1 Selected Examples 4.4 Extrinsic Calibration 4.4.1 3D-2D Pose Estimation 4.4.2 2D-2D Pose Estimation 4.4.3 Pose Optimization 4.4.4 Uncertainty Estimation 4.4.5 PoseGraph Representation 4.4.6 Bundle Adjustment 4.4.7 Selected Examples 5 Full Omnidirectional Image Projections 5.1 Panoramic Image Stitching 5.2 World Map Projections 5.3 World Map Projection Generator for P2S-Maps 5.4 Conversion between Projections based on P2S-Maps 5.4.1 Proposed Workflow 5.4.2 Data Storage Format 5.4.3 Real World Example 6 Relations between Two Camera Spheres 6.1 Forward and Backward Projection 6.2 Triangulation 6.2.1 Linear Least Squares Method 6.2.2 Alternative Midpoint Method 6.3 Epipolar Geometry 6.4 Transformation Recovery from Essential Matrix 6.4.1 Cheirality 6.4.2 Standard Procedure 6.4.3 Simplified Procedure 6.4.4 Improved Procedure 6.5 Two-View Estimation 6.5.1 Evaluation Strategy 6.5.2 Error Metric 6.5.3 Evaluation of Estimation Algorithms 6.5.4 Concluding Remarks 6.6 Two-View Optimization 6.6.1 Epipolar-Based Error Distances 6.6.2 Projection-Based Error Distances 6.6.3 Comparison between Error Distances 6.7 Two-View Translation Scaling 6.7.1 Linear Least Squares Estimation 6.7.2 Non-Linear Least Squares Optimization 6.7.3 Comparison between Initial and Optimized Scaling Factor 6.8 Homography to Identify Degeneracies 6.8.1 Homography for Spherical Cameras 6.8.2 Homography Estimation 6.8.3 Homography Optimization 6.8.4 Homography and Pure Rotation 6.8.5 Homography in Epipolar Geometry 7 Relations between Three Camera Spheres 7.1 Three View Geometry 7.2 Crossing Epipolar Planes Geometry 7.3 Trifocal Geometry 7.4 Relation between Trifocal, Three-View and Crossing Epipolar Planes 7.5 Translation Ratio between Up-To-Scale Two-View Transformations 7.5.1 Structureless Determination Approaches 7.5.2 Structure-Based Determination Approaches 7.5.3 Comparison between Proposed Approaches 8 Pose Graphs 8.1 Optimization Principle 8.2 Solvers 8.2.1 Additional Graph Solvers 8.2.2 False Loop Closure Detection 8.3 Pose Graph Generation 8.3.1 Generation of Synthetic Pose Graph Data 8.3.2 Optimization of Synthetic Pose Graph Data 9 Structureless Camera Motion Estimation 9.1 SCME Pipeline 9.2 Determination of Two-View Translation Scale Factors 9.3 Integration of Depth Data 9.4 Integration of Extrinsic Camera Constraints 10 Camera Motion Estimation Results 10.1 Directional Camera Images 10.2 Omnidirectional Camera Images 11 Conclusion 11.1 Summary 11.2 Outlook and Future Work Appendices A.1 Additional Extrinsic Calibration Results A.2 Linear Least Squares Scaling A.3 Proof Rank Deficiency A.4 Alternative Derivation Midpoint Method A.5 Simplification of Depth Calculation A.6 Relation between Epipolar and Circumferential Constraint A.7 Covariance Estimation A.8 Uncertainty Estimation from Epipolar Geometry A.9 Two-View Scaling Factor Estimation: Uncertainty Estimation A.10 Two-View Scaling Factor Optimization: Uncertainty Estimation A.11 Depth from Adjoining Two-View Geometries A.12 Alternative Three-View Derivation A.12.1 Second Derivation Approach A.12.2 Third Derivation Approach A.13 Relation between Trifocal Geometry and Alternative Midpoint Method A.14 Additional Pose Graph Generation Examples A.15 Pose Graph Solver Settings A.16 Additional Pose Graph Optimization Examples Bibliograph

    An empirical assessment of real-time progressive stereo reconstruction

    Get PDF
    3D reconstruction from images, the problem of reconstructing depth from images, is one of the most well-studied problems within computer vision. In part because it is academically interesting, but also because of the significant growth in the use of 3D models. This growth can be attributed to the development of augmented reality, 3D printing and indoor mapping. Progressive stereo reconstruction is the sequential application of stereo reconstructions to reconstruct a scene. To achieve a reliable progressive stereo reconstruction a combination of best practice algorithms needs to be used. The purpose of this research is to determine the combinat ion of best practice algorithms that lead to the most accurate and efficient progressive stereo reconstruction i.e the best practice combination. In order to obtain a similarity reconstruction the in t rinsic parameters of the camera need to be known. If they are not known they are determined by capturing ten images of a checkerboard with a known calibration pattern from different angles and using the moving plane algori thm. Thereafter in order to perform a near real-time reconstruction frames are acquired and reconstructed simultaneously. For the first pair of frames keypoints are detected and matched using a best practice keypoint detection and matching algorithm. The motion of the camera between the frames is then determined by decomposing the essential matrix which is determined from the fundamental matrix, which is determined using a best practice ego-motion estimation algorithm. Finally the keypoints are reconstructed using a best practice reconstruction algorithm. For sequential frames each frame is paired with t he previous frame and keypoints are therefore only detected in the sequential frame. They are detected , matched and reconstructed in the same fashion as the first pair of frames, however to ensure that the reconstructed points are in the same scale as the points reconstructed from the first pair of frames the motion of the camera between t he frames is estimated from 3D-2D correspondences using a best practice algorithm. If the purpose of progressive reconstruction is for visualization the best practice combination algorithm for keypoint detection was found to be Speeded Up Robust Features (SURF) as it results in more reconstructed points than Scale-Invariant Feature Transform (SIFT). SIFT is however more computationally efficient and thus better suited if the number of reconstructed points does not matter, for example if the purpose of progressive reconstruction is for camera tracking. For all purposes the best practice combination algorithm for matching was found to be optical flow as it is the most efficient and for ego-motion estimation the best practice combination algorithm was found to be the 5-point algorithm as it is robust to points located on planes. This research is significant as the effects of the key steps of progressive reconstruction and the choices made at each step on the accuracy and efficiency of the reconstruction as a whole have never been studied. As a result progressive stereo reconstruction can now be performed in near real-time on a mobile device without compromising the accuracy of reconstruction
    corecore