46 research outputs found


    Abstract. 3D digital reconstruction techniques are extensively used for quality control purposes. Among them, photogrammetry and photometric stereo have long been used successfully in several application fields. However, generating highly detailed and reliable micro-measurements of non-collaborative surfaces is still an open issue. In these cases, photogrammetry can provide accurate low-frequency 3D information but struggles to extract reliable high-frequency details. Conversely, photometric stereo can recover a very detailed surface topography, although global surface deformation is often present. In this paper, we present the preliminary results of an ongoing project that aims to combine photogrammetry and photometric stereo in a synergetic fusion of the two techniques. In particular, we introduce the main design concept behind an image acquisition system we developed to capture images from different positions and under different lighting conditions, as required by photogrammetry and photometric stereo. We show the benefit of such a combination through experimental tests. The experiments show that the proposed method recovers the surface topography at the same high resolution achievable with photometric stereo while preserving photogrammetric accuracy. Furthermore, we exploit light directionality and multiple light sources to improve the quality of dense image matching on poorly textured surfaces.
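The photometric stereo step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Lambertian surface, known unit light directions, and no shadows or specular highlights; none of these assumptions, nor the function name `photometric_stereo_normals`, come from the paper, and this is the textbook least-squares formulation rather than the authors' pipeline.

```python
import numpy as np

def photometric_stereo_normals(intensities, light_dirs):
    """Estimate per-pixel normals and albedo under a Lambertian model.

    intensities: (k, n) array, k images of n pixels each.
    light_dirs:  (k, 3) array of unit light directions (assumed known).
    Returns (normals of shape (n, 3), albedo of shape (n,)).
    """
    # Lambertian model: I = L @ g with g = albedo * normal.
    # Solve for g at every pixel in one least-squares call.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, n)
    albedo = np.linalg.norm(g, axis=0)
    normals = (g / np.maximum(albedo, 1e-12)).T
    return normals, albedo

# Synthetic check: four pixels sharing one normal and one albedo.
light_dirs = np.array([[0.0, 0.0, 1.0],
                       [0.6, 0.0, 0.8],
                       [0.0, 0.6, 0.8]])
true_normal = np.array([0.1, 0.2, 1.0])
true_normal /= np.linalg.norm(true_normal)
true_albedo = 0.8
intensities = true_albedo * (light_dirs @ true_normal)[:, None] * np.ones((1, 4))

normals, albedo = photometric_stereo_normals(intensities, light_dirs)
```

Normals recovered this way are locally detailed, but integrating them into a surface tends to accumulate low-frequency drift, which is the global deformation that the abstract proposes to correct with photogrammetry.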

    Occlusion-Aware Multi-View Reconstruction of Articulated Objects for Manipulation

    The goal of this research is to develop algorithms that use multiple views to automatically recover complete 3D models of articulated objects in unstructured environments, thereby enabling a robotic system to manipulate those objects. First, an algorithm called Procrustes-Lo-RANSAC (PLR) is presented. Structure-from-motion techniques are used to capture 3D point cloud models of an articulated object in two different configurations. Procrustes analysis, combined with a locally optimized RANSAC sampling strategy, provides a straightforward geometric approach to recovering the joint axes and to classifying them automatically as either revolute or prismatic. The algorithm does not require prior knowledge of the object, nor does it make any assumptions about the planarity of the object or scene. Second, with the resulting articulated model, a robotic system is able either to manipulate the object along its joint axes at a specified grasp point in order to exercise its degrees of freedom, or to move its end effector to a particular position even if that point is not visible in the current view. This is one of the main advantages of the occlusion-aware approach: because the models capture all sides of the object, the robot has knowledge of parts of the object that are not visible in the current view. Experiments with a PUMA 500 robotic arm demonstrate the effectiveness of the approach on a variety of real-world objects containing both revolute and prismatic joints. Third, we improve the proposed approach by using an RGB-D sensor (Microsoft Kinect) that yields a depth value for each pixel directly, rather than requiring correspondences to establish depth. The KinectFusion algorithm is applied to produce a single high-quality, geometrically accurate 3D model from which the rigid links of the object are segmented and aligned, allowing the joint axes to be estimated using the geometric approach. The improved algorithm does not require artificial markers attached to objects, yields much denser 3D models, and reduces the computation time.
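The geometric idea behind the Procrustes step can be sketched as follows. This is a simplified illustration, not the PLR algorithm itself: it omits the locally optimized RANSAC sampling, assumes point correspondences are already known, and assumes the moving link's points are expressed in the frame of the fixed link. The function names and the 2-degree threshold are hypothetical.

```python
import numpy as np

def rigid_procrustes(src, dst):
    """Least-squares rotation R and translation t such that dst ≈ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T       # guard against reflections
    t = dst_c - R @ src_c
    return R, t

def classify_joint(R, angle_tol_deg=2.0):
    """Label the motion revolute if the rotation angle is non-negligible."""
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    angle = np.degrees(np.arccos(cos_angle))
    kind = "revolute" if angle > angle_tol_deg else "prismatic"
    return kind, angle

rng = np.random.default_rng(0)
link = rng.normal(size=(50, 3))                   # points on the moving link

# Configuration 2a: the link rotates 30 degrees about the z axis.
a = np.radians(30.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
R_rev, _ = rigid_procrustes(link, link @ Rz.T + np.array([0.1, 0.0, 0.0]))
kind_rev, angle_rev = classify_joint(R_rev)

# Configuration 2b: the link slides along x without rotating.
R_pri, t_pri = rigid_procrustes(link, link + np.array([0.2, 0.0, 0.0]))
kind_pri, angle_pri = classify_joint(R_pri)
```

For a revolute joint, the axis direction is the eigenvector of R with eigenvalue 1. For a prismatic joint, the recovered translation gives the sliding direction.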

    Computational Depth from Defocus via Active Quasi-random Pattern Projections

    Depth information is one of the most fundamental cues for interpreting the geometric relationships between objects. It enables machines and robots to perceive the world in 3D and allows them to understand the environment far beyond 2D images. Recovering the depth information of a scene plays a crucial role in computer vision and is closely connected to applications in fields such as robotics, autonomous driving, and computer-human interfacing. In this thesis, we proposed, designed, and built a comprehensive system for depth estimation from a single camera capture by leveraging the camera response to the defocus effect of a projected pattern. This approach is fundamentally driven by the concept of active depth from defocus (DfD), which recovers depth by analyzing the defocus effect of the projected pattern at different depth levels as it appears in the captured images. While current active DfD approaches are able to provide high accuracy, they rely on specialized setups to obtain images with different defocus levels, making them impractical for a simple and compact depth-sensing system with a small form factor. The main contribution of this thesis is the use of computational modelling techniques to characterize the camera defocus response to the projection pattern at different depth levels, a new approach in active DfD that enables rapid and accurate depth inference in the absence of complex hardware and extensive computing resources. Specifically, different statistical estimation methods are proposed to approximate the pixel intensity distribution of the projected pattern as measured by the camera sensor, a learning process that essentially summarizes the defocus effect in a handful of optimized, distinctive values. As a result, the blurred appearance of the projected pattern at each depth level is represented by depth features in a computational depth inference model. In the proposed framework, the scene is actively illuminated with a unique quasi-random projection pattern, and a conventional RGB camera is used to acquire an image of the scene. The depth map of the scene can then be recovered by studying the depth features in the captured image of the blurred projection pattern using the proposed computational depth inference model. To verify the efficacy of the proposed depth estimation approach, quantitative and qualitative experiments are performed on test scenes with different structural characteristics. The results demonstrate that the proposed method can produce accurate, high-fidelity depth reconstructions and has strong potential as a cost-effective and computationally efficient means of generating depth maps.

    Stereo vision based on compressed feature correlation and graph cut

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2005. Includes bibliographical references (p. 131-145). This dissertation develops a fast and robust algorithm that solves the dense correspondence problem with good performance in untextured regions by merging Sparse Array Correlation, from the computational fluids community, with graph cut, from the computer vision community. The proposed methodology consists of two independent modules. The first module, named Compressed Feature Correlation, originated from Particle Image Velocimetry (PIV). The algorithm uses an image compression scheme that retains pixel values in areas of high intensity gradient while eliminating pixels with little correlation information in smooth surface regions, resulting in a highly reduced image dataset. In addition, by utilizing an error correlation function, pixel comparisons are made through single integer calculations, eliminating time-consuming multiplication and floating-point arithmetic. Unlike the traditional fixed window sorting scheme, adaptive correlation window positioning is implemented by dynamically placing strong features at the center of each correlation window. A confidence measure is developed to validate correlation outputs. The sparse depth map generated by this ultra-fast Compressed Feature Correlation may either serve as input to global methods or be interpolated into a dense depth map when object boundaries are clearly defined. The second module is a modified graph cut algorithm with an improved energy model that accepts prior information by fixing data energy penalties. The image pixels with known disparity values stabilize and speed up global optimization. As a result, fewer iterations are necessary and sensitivity to parameters is reduced. An efficient hybrid approach is implemented based on these two modules. By coupling a simpler and much less expensive algorithm, Compressed Feature Correlation, with a more expensive algorithm, graph cut, the computational expense of the hybrid calculation is one third of that of performing the entire calculation with graph cut alone, while accuracy and robustness are improved at the same time. Qualitative and quantitative results on both simulated disparities and real stereo images are presented. By Sheng Sarah Tan. Ph.D.
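The appeal of comparing pixels with integer operations only can be illustrated with a small sketch. The thesis's error correlation function is not specified in this abstract, so a plain sum of absolute differences (SAD) is swapped in here as a stand-in; the function name, window size, and search range are hypothetical, and the scanlines are assumed to be rectified.

```python
import numpy as np

def sad_disparity(left_row, right_row, x, half_win, max_disp):
    """Disparity at column x of a rectified scanline using integer-only SAD.

    Returns (best disparity, its SAD cost). No multiplications or
    floating-point operations are used in the comparison itself.
    """
    # Widen from uint8 so that subtraction cannot wrap around.
    left = left_row.astype(np.int32)
    right = right_row.astype(np.int32)
    patch = left[x - half_win : x + half_win + 1]
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        lo = x - d - half_win
        if lo < 0:                      # window would leave the image
            break
        cand = right[lo : lo + 2 * half_win + 1]
        cost = int(np.abs(patch - cand).sum())
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost

# Synthetic scanlines: the left view is the right view shifted by 5 pixels.
rng = np.random.default_rng(0)
right_row = rng.integers(0, 256, size=64, dtype=np.uint8)
left_row = np.roll(right_row, 5)

disp, cost = sad_disparity(left_row, right_row, x=30, half_win=3, max_disp=10)
```

In the hybrid scheme described above, sparse matches of this kind would only be computed at strong features, and their disparities would then be fixed as priors for the global graph cut stage.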

    Garment Texturing Using Kinect V2.0

    This thesis describes three new garment retexturing methods for FitsMe virtual fitting room applications using data from the Microsoft Kinect II RGB-D camera. The first method is an automatic technique for garment retexturing using a single RGB-D image and infrared information obtained from Kinect II. First, the garment is segmented out of the image using GrabCut or depth segmentation. Then texture domain coordinates are computed for each pixel belonging to the garment using normalized 3D information. Afterwards, shading is applied to the new colors from the texture image. The second method addresses 2D to 3D garment retexturing, where a segmented garment on a manikin or person is matched to a new source garment and retextured, resulting in augmented images in which the new source garment is transferred to the manikin or person. The problem is divided into garment boundary matching, based on point set registration using Gaussian mixture models, followed by interpolation of the inner points using surface topology extracted through geodesic paths, which leads to a more realistic result than standard approaches. The final contribution of this thesis is a novel method for increasing the texture quality of a 3D model of a garment by using the same Kinect frame sequence that was used to create the model. First, a structured mesh must be created from the 3D model, so the 3D model is wrapped to a base model with defined seams and a texture map. Afterwards, frames are matched to the newly created model, and the color values of the Kinect frames are mapped to the UV map of the 3D model by ray casting.

    Analysis and Modeling of Dynamic Three-Dimensional Scenes Using a Time-of-Flight Camera

    Many applications in computer vision require the automatic analysis and reconstruction of static and dynamic scenes. The automatic analysis of three-dimensional scenes is therefore an area of intensive investigation. Most approaches focus on the reconstruction of rigid geometry, because the reconstruction of non-rigid geometry is far more challenging and requires three-dimensional data to be available at high frame rates. Rigid scene analysis is used, for example, in autonomous navigation, surveillance, and the conservation of cultural heritage. The analysis and reconstruction of non-rigid geometry, on the other hand, opens up many more possibilities, and not only for the above-mentioned applications. In the production of media content for television or cinema, the analysis, recording, and playback of full 3D content can be used to generate new views of real scenes or to replace real actors with animated artificial characters. The most important requirement for the analysis of dynamic content is the availability of reliable three-dimensional scene data. Stereo methods have mostly been used to compute the depth of scene points, but these methods are computationally expensive and do not provide sufficient quality in real time. In recent years, so-called Time-of-Flight cameras have left the prototype stage and are now capable of delivering dense depth information in real time at reasonable quality and price. This thesis investigates the suitability of these cameras for dynamic three-dimensional scene analysis. Before a Time-of-Flight camera can be used to analyze three-dimensional scenes, it has to be calibrated internally and externally. Moreover, Time-of-Flight cameras suffer from systematic depth measurement errors due to their operating principle. This thesis proposes an approach to estimate all necessary parameters in one calibration step.
Next, the reconstruction of rigid environments and objects is investigated, and solutions for these tasks are presented. The reconstruction of dynamic scenes and the generation of novel views of dynamic scenes are achieved by introducing a volumetric data structure that stores and fuses the depth measurements and their change over time. Finally, a Mixed Reality system is presented in which the contributions of this thesis are brought together. This system is able to combine real and artificial scene elements with correct mutual occlusion, mutual shadowing, and physical interaction. This thesis shows that Time-of-Flight cameras are a suitable choice for the analysis of rigid as well as non-rigid scenes under certain conditions. It contains important contributions to the necessary steps of calibration, preprocessing of depth data, and reconstruction and analysis of three-dimensional scenes.