16 research outputs found

    Hemispherical confocal imaging using turtleback reflector

    We propose a new imaging method, called hemispherical confocal imaging, to clearly visualize a particular depth in a 3-D scene. The key optical component is a turtleback reflector, a specially designed polyhedral mirror. By combining the turtleback reflector with a coaxial pair of a camera and a projector, many virtual cameras and projectors are produced on a hemisphere with uniform density, synthesizing a hemispherical aperture. With this optical device, high-frequency illumination can be focused at a particular depth in the scene so that only that depth is visualized, with descattering. The observed views are then factorized into masking, attenuation, and texture terms to enhance visualization when obstacles are present. Experiments with a prototype system show that only the target depth is effectively illuminated, and that degradation due to scattering and attenuation can be recovered even when obstacles exist. Supported by Microsoft Research and the Japan Society for the Promotion of Science (Grants-in-Aid for Scientific Research 21680017 and 21650038).
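    The descattering step relies on the classic property of high-frequency illumination: directly lit pixels see the full direct component plus roughly half the global (scattered) light, while unlit pixels see only the scattered half. A minimal sketch of this separation, assuming a stack of images taken under shifted half-on/half-off patterns (illustrative only, not the paper's exact pipeline):

```python
import numpy as np

def separate_direct_global(images):
    """Separate direct and global (scattered) light from images captured
    under shifted high-frequency illumination patterns.

    images: array-like of shape (n, H, W), one image per pattern shift,
            with roughly half the pixels lit in each pattern.
    Returns (direct, global_) component images.
    """
    stack = np.asarray(images, dtype=np.float64)
    i_max = stack.max(axis=0)  # lit at some shift: direct + half the global light
    i_min = stack.min(axis=0)  # never directly lit: half the global light only
    return i_max - i_min, 2.0 * i_min
```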

    Camera and light placement for automated assembly inspection

    Includes bibliographical references. Visual assembly inspection can provide a low-cost, accurate, and efficient solution to the automated assembly inspection problem, which is a crucial component of any automated assembly manufacturing process. The performance of such an inspection system depends heavily on the placement of the camera and the light source. This article presents new algorithms that use the CAD model of a finished assembly to place the camera and light source so as to optimize the performance of an automated assembly inspection algorithm. These general-purpose algorithms use the component material properties and the contact information from the CAD model of the assembly, along with standard computer graphics hardware and physically accurate lighting models, to determine the effects of camera and light source placement on the performance of an inspection algorithm. Their effectiveness is illustrated on a typical mechanical assembly. This work was supported by National Science Foundation grant number CDR 8803017 to the Engineering Research Center for Intelligent Manufacturing Systems, National Science Foundation grant number MIP93-00560, an AT&T Bell Laboratories PhD Scholarship, and the NEC Corporation.
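    The underlying search can be pictured as render-and-score: synthesize an image of the CAD assembly for each candidate camera/light pair and keep the pair on which the inspection algorithm is predicted to perform best. A hedged sketch, where render and score are hypothetical hooks into a renderer and an inspection-performance metric (both are assumptions, not the article's actual interfaces):

```python
import itertools

def best_sensor_placement(camera_poses, light_poses, render, score):
    """Exhaustively evaluate candidate camera/light placements.

    render(cam, light) -> synthetic image of the CAD assembly (hypothetical)
    score(image)       -> predicted inspection performance, higher is better
    Returns the best (camera_pose, light_pose, score) triple.
    """
    best_cam, best_light, best_score = None, None, float("-inf")
    for cam, light in itertools.product(camera_poses, light_poses):
        s = score(render(cam, light))  # simulate this placement
        if s > best_score:
            best_cam, best_light, best_score = cam, light, s
    return best_cam, best_light, best_score
```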

    Automated visual assembly inspection

    Includes bibliographical references (pages 699-700). This chapter discusses an intelligent assembly inspection system that uses a multiscale algorithm to detect errors in assemblies after the algorithm is trained on synthetic CAD images of correctly assembled products. It shows how the CAD information of an assembly, together with fast rendering techniques on specialized graphics machines, can be used to automate the placement of the work-cell camera and lights. The current emphasis in the manufacturing industry on concurrent engineering will only make this integration between the CAD models of products and their manufacturing inspection grow in value.
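    As a toy illustration of the multiscale idea, one can compare a camera image of the assembly against a rendered CAD reference at several resolutions and flag the largest discrepancy; the 2x box-filter pyramid below is an illustrative simplification, not the chapter's trained detector:

```python
import numpy as np

def multiscale_defect_score(test, reference, levels=4):
    """Compare test and reference images (same-shape float arrays) at
    several scales; return the largest mean absolute difference."""
    def downsample(img):  # 2x2 box filter, cropping odd edges
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w]
        return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                       + img[0::2, 1::2] + img[1::2, 1::2])

    scores = []
    for _ in range(levels):
        scores.append(float(np.abs(test - reference).mean()))
        test, reference = downsample(test), downsample(reference)
    return max(scores)
```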

    Efficient, image-based appearance acquisition of real-world objects

    Two ingredients are necessary to synthesize realistic images: an accurate rendering algorithm and, equally important, high-quality models in terms of geometry and reflection properties. In this dissertation we focus on capturing the appearance of real-world objects. The acquired model must represent both the geometry and the reflection properties of the object in order to create new views of the object under novel illumination. Starting from scanned 3D geometry, we measure the reflection properties (BRDF) of the object from images taken under known viewing and lighting conditions. The BRDF measurement requires only a small number of input images and is made even more efficient by a view planning algorithm. In particular, we propose algorithms for efficient image-to-geometry registration, and an image-based measurement technique to reconstruct spatially varying materials from a sparse set of images using a point light source. Moreover, we present a view planning algorithm that calculates camera and light source positions for optimal quality and efficiency of the measurement process. Relightable models of real-world objects are in demand in various fields such as movie production, e-commerce, digital libraries, and virtual heritage.
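    At the core of such a measurement is a per-point fit of a reflectance model to observations taken under known lights. A minimal sketch, assuming a simple Lambertian stand-in for the spatially varying BRDF models measured in the dissertation (the real models are more expressive):

```python
import numpy as np

def fit_lambertian_albedo(intensities, light_dirs, normal):
    """Least-squares albedo for one surface point.

    intensities: (n,) observed pixel values under n known point lights
    light_dirs:  (n, 3) unit vectors from the point toward each light
    normal:      (3,) unit surface normal from the scanned geometry
    Fits I_k = albedo * max(0, n . l_k) in the least-squares sense.
    """
    intensities = np.asarray(intensities, dtype=np.float64)
    light_dirs = np.asarray(light_dirs, dtype=np.float64)
    shading = np.clip(light_dirs @ np.asarray(normal), 0.0, None)
    mask = shading > 1e-6  # ignore shadowed / unlit observations
    num = float(intensities[mask] @ shading[mask])
    den = float(shading[mask] @ shading[mask])
    return num / den if den > 0.0 else 0.0
```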

    View Synthesis from Image and Video for Object Recognition Applications

    Object recognition is one of the most important and successful applications in the computer vision community. The varying appearance of a test object under different poses or illumination conditions can make the object recognition problem very challenging. Using view synthesis techniques to generate pose-invariant or illumination-invariant images or videos of the test object is an appealing approach to counter the degradation in recognition performance caused by non-canonical views or lighting conditions. In this thesis, we first present a complete framework for better synthesis and understanding of the human pose from a limited number of available silhouette images. Pose-normalized silhouette images are generated using an active virtual camera and an image-based visual hull technique, with the silhouette turning-function distance used as the pose similarity measurement. To overcome the inability of the shape-from-silhouettes method to reconstruct concave regions of human postures, a view synthesis algorithm is proposed for articulated humans using the visual hull and contour-based body part segmentation. These two components improve each other through the correspondence across viewpoints built via the inner-distance shape context measurement. Face recognition under varying pose is a challenging problem, especially when illumination variations are also present. We propose two algorithms to address this scenario. For a single light source, we demonstrate a pose-normalized face synthesis approach on a pixel-by-pixel basis from a single view by exploiting the bilateral symmetry of the human face. For more complicated illumination conditions, the spherical harmonic representation is extended to encode pose information, and an efficient method is proposed for robust face synthesis and recognition with a very compact training set. Finally, we present an end-to-end moving object verification system for airborne video, wherein a homography-based view synthesis algorithm is used to simultaneously handle the object's changes in aspect angle, depression angle, and resolution. Efficient integration of spatial and temporal model matching assures the robustness of the verification step. As a byproduct, a robust two-camera tracking method using homography is also proposed and demonstrated on challenging surveillance video sequences.
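    Homography-based view synthesis boils down to warping an image with a 3x3 projective map. A self-contained sketch using nearest-neighbour inverse warping (a toy stand-in for the system's synthesis step; real pipelines typically interpolate):

```python
import numpy as np

def warp_homography(image, H, out_shape):
    """Warp a grayscale image with homography H, where H maps output
    pixel coordinates (x, y, 1) to source image coordinates."""
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = H @ pts
    src = src[:2] / src[2]  # perspective divide
    u = np.rint(src[0]).astype(int)
    v = np.rint(src[1]).astype(int)
    valid = (u >= 0) & (u < image.shape[1]) & (v >= 0) & (v < image.shape[0])
    out = np.zeros(out_shape, dtype=image.dtype)
    out.reshape(-1)[valid] = image[v[valid], u[valid]]
    return out
```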

    Analysis of the inspection of mechanical parts using dense range data

    More than ever, efficiency and quality are key words in modern industry. This situation enhances the importance of quality control and creates a great demand for cheap and reliable automatic inspection systems. Taking these facts into account, along with the demand for systems able to inspect the final shape of machined parts, we decided to investigate the viability of automatic model-based inspection of mechanical parts using the dense range data produced by laser stripers. Given a part to be inspected and a corresponding model of the part stored in the model database, the first step of inspecting the part is the acquisition of data corresponding to the part; in our case this means the acquisition of a range image of it. In order to compare the part image with its stored model, it is necessary to align the model with the range image of the part. This process, called registration, corresponds to finding the rigid transformation that superposes model and image. After the image and model are registered, the actual inspection uses the range image to verify that all the features predicted in the model are present and have the right pose and dimensions. Therefore, besides the acquisition of range images, the inspection of machined parts involves three main issues: modelling, registration and inspection diagnosis. The application, for inspection purposes, of the main representational schemes for modelling solid objects is discussed, and the use of EDT models is suggested (see [Zeid 91]). A particular implementation of EDT models is presented. A novel approach for the verification of tolerances during the inspection is proposed. The approach allows not only the inspection of the most common tolerances described in the tolerancing standards, but also the inspection of tolerances defined according to Requicha's theory of tolerancing (see [Requicha 83]). A model of the sensitivity and reliability of the inspection process, based on modelling the errors arising during inspection, is also proposed. The importance of registration accuracy in different inspection tasks is discussed. A modified version of the ICP algorithm (see [Besl & McKay 92]) for the registration of sculptured surfaces is proposed. The maximum accuracy of the ICP algorithm, as a function of the sensor errors and the number of matched points, is determined. A novel method for the measurement and reconstruction of waviness errors on sculptured surfaces is proposed. The method makes use of the 2D Discrete Fourier Transform for the detection and reconstruction of the waviness error. A model of the sensitivity and reliability of the method is proposed. The application of the proposed methods is illustrated using synthetic and real range images.
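    The registration step is the classic ICP alternation: match each transformed model point to its closest image point, then update the rigid transform in closed form. A minimal point-to-point sketch in the spirit of [Besl & McKay 92] (brute-force matching; the thesis uses a modified variant tuned for sculptured surfaces):

```python
import numpy as np

def icp(model, scene, iters=20):
    """Align model points (m, 3) to scene points (n, 3), both numpy arrays.
    Returns rotation R and translation t with R @ x + t ~ scene."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = model @ R.T + t
        d2 = ((moved[:, None, :] - scene[None, :, :]) ** 2).sum(-1)
        matched = scene[d2.argmin(axis=1)]      # closest scene point each
        mu_m, mu_s = moved.mean(0), matched.mean(0)
        U, _, Vt = np.linalg.svd((moved - mu_m).T @ (matched - mu_s))
        if np.linalg.det(Vt.T @ U.T) < 0:       # avoid reflections
            Vt[-1] *= -1
        dR = Vt.T @ U.T
        R, t = dR @ R, dR @ (t - mu_m) + mu_s   # compose the update
    return R, t
```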

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to reply to these questions properly and efficiently, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by means of sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD thesis is to establish this sensor-model coupling.

    Task-oriented viewpoint planning for free-form objects

    A thesis submitted to the Universitat Politècnica de Catalunya to obtain the degree of Doctor of Philosophy. Doctoral programme: Automatic Control, Robotics and Computer Vision. This thesis was completed at: Institut de Robòtica i Informàtica Industrial, CSIC-UPC. This thesis deals with active sensing and its use in real exploration tasks under both scene ambiguities and measurement uncertainties. While object modeling is the implicit objective of most active sensing algorithms, in this work we have explored new strategies to deal with more generic and more complex tasks. Active sensing requires the ability to move the perceptual system to gather new information. Our approach uses a robot manipulator with a 3D Time-of-Flight (ToF) camera attached to the end-effector. As a complex task, we have focused our attention on plant phenotyping. Plants are complex objects, with leaves that change their position and size over time. Valid viewpoints for a certain plant are hardly valid for a different one, even one belonging to the same species. Some instruments, such as chlorophyll meters or disk sampling tools, require being precisely positioned over a particular location of the leaf. Therefore, their use requires the modeling of specific regions of interest of the plant, including also the free space needed for avoiding obstacles and approaching the leaf with the tool. It is easy to see that predefined camera trajectories are not valid here, and that with one single view it is usually very difficult to acquire all the required information. The overall objective of this thesis is to solve complex active sensing tasks by embedding their exploratory goal into a pre-estimated geometrical model, using information gain as the fundamental guideline for the reward function. The main contributions can be divided into two groups: first, the evaluation of ToF cameras and their calibration to assess the uncertainty of the measurements (presented in Part I); and second, the proposal of a framework capable of embedding the task, modeled as free and occupied space, that takes the modeled sensor uncertainty into account to improve the action selection algorithm (presented in Part II). This thesis has given rise to 14 publications, including 5 indexed journals, and its results have been used in the GARNICS European project. The complete framework is based on the Next-Best-View methodology and can be summarized in the following main steps. First, an initial view of the object (e.g., a plant) is acquired. From this initial view and given a set of candidate viewpoints, the expected gain obtained by moving the robot and acquiring the next image is computed. This computation takes into account the uncertainty of all the different pixels of the sensor, the expected information based on a predefined task model, and the possible occlusions. Once the most promising view is selected, the robot moves, takes a new image, integrates this information into the model, and evaluates the set of remaining views again. Finally, the task terminates when enough information is gathered. In our examples, this process enables the robot to perform a measurement on top of a leaf. The key ingredient is to model the complexity of the task in a layered representation of free-occupied occupancy grid maps.
This allows one to naturally encode the requirements of the task, to maintain and update the belief state with the measurements performed, to simulate and compute the expected gains of all potential viewpoints, and to encode the termination condition. During this work the technology of ToF cameras has evolved remarkably: it is now very popular, and ToF cameras are already embedded in some consumer devices. Although the quality of the measurements has improved considerably, it is still not uniform across the sensor. We believe, as demonstrated in various experiments in this work, that careful modeling of the sensor's uncertainty is highly beneficial and helps to design better decision systems. In our case, it enables a more realistic computation of the information-gain measure and, consequently, a better selection criterion. This work has been partially supported by a JAE fellowship of the Spanish Scientific Research Council (CSIC), the Spanish Ministry of Science and Innovation, the Catalan Research Commission and the European Commission under the research projects: DPI2008-06022: PAU: Percepción y acción ante incertidumbre; DPI2011-27510: PAU+: Perception and Action in Robotics Problems with Large State Spaces; 201350E102: MANIPlus: Manipulación robotizada de objetos deformables; 2009-SGR-155: SGR ROBÒTICA: Grup de recerca consolidat - Grup de Robòtica; FP6-2004-IST-4-27657: EU PACO PLUS project; FP7-ICT-2009-4-247947: GARNICS: Gardening with a cognitive system; FP7-ICT-2009-6-269959: IntellAct: Intelligent observation and execution of Actions and manipulations. Peer reviewed.
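    The main loop described above maps directly onto a Next-Best-View skeleton. In the sketch below, all callables are placeholders standing in for the thesis's components: expected_gain simulates the sensor over the layered occupancy-grid belief (uncertainty and occlusions included), move_and_sense drives the manipulator and acquires an image, update fuses the measurement into the belief, and enough_information encodes the termination condition:

```python
import numpy as np

def next_best_view(candidates, belief, expected_gain, move_and_sense,
                   update, enough_information):
    """Greedy NBV loop: repeatedly pick the view with the highest
    expected information gain until the task model is satisfied."""
    views = list(candidates)
    while views and not enough_information(belief):
        gains = [expected_gain(v, belief) for v in views]
        best = views.pop(int(np.argmax(gains)))  # most promising viewpoint
        belief = update(belief, best, move_and_sense(best))
    return belief
```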

    Autonomous Optical Inspection of Large Scale Freeform Surfaces


    Griff-in-die-Kiste - Neue Ansätze für ein klassisches Problem (Bin Picking - New Approaches to a Classic Problem)

    The automation of handling tasks has been an important scientific topic since the development of the first industrial robots. The first step in the chain of scientific challenges to be solved is the automatic grasping of objects. One of the most famous examples in this context is the well-known "bin-picking" problem. Picking up objects scrambled in a box is an easy task for humans, but its automation is very complex. Besides the localization of the object, meaning the estimation of the object's pose (orientation and position), it has to be ensured that a collision-free path can be found to safely grasp the objects. For over 50 years, researchers have published approaches towards generic solutions to this problem, but unfortunately no industry-applicable, generic system has been developed yet. In this thesis, three different approaches to solve the bin-picking problem are described. More precisely, different solutions to the pose estimation problem are introduced, each paired with additional functionalities to complete it for application in a bin-picking station. It is described how modern sensors can be used for efficient bin-picking, as well as how classic sensor concepts can be applied for novel bin-picking techniques. Three complete systems are described and compared. First, 3D point clouds, generated using a laser scanner, are used as the basis. Employing the known Random Sample Matching algorithm and modifications of it, paired with a very efficient depth-map-based collision avoidance mechanism, results in a very robust bin-picking approach. In the second approach, all computations are done on depth maps. This allows the use of 2D image analysis techniques and results in real-time data analysis. Combined with force/torque and acceleration sensors, a near time-optimal bin-picking system emerges. As a third option, surface normal maps are employed as the basis for pose estimation. In contrast to known approaches, the normal maps are not used for 3D data computation but directly for the object localization problem. This enables the application of a new class of sensors for bin-picking. All three methods are compared, and the advantages and disadvantages of each approach are discussed.
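    The depth-map-based collision check that makes the first system efficient can be pictured as a footprint test: a grasp is rejected if any scene point inside the gripper's 2-D footprint lies closer to the camera than the planned grasp depth plus a safety margin. A hedged sketch under those assumptions (the footprint mask, depth convention, and clearance margin are illustrative, not the thesis's exact formulation):

```python
import numpy as np

def grasp_is_collision_free(depth_map, gripper_mask, grasp_depth, clearance=2.0):
    """depth_map:    (H, W) depths, larger value = farther from the camera
    gripper_mask: (H, W) bool, pixels swept by the descending gripper fingers
    grasp_depth:  planned depth of the gripper tips for this grasp
    Returns True if nothing in the footprint intrudes into the swept volume."""
    scene = depth_map[gripper_mask]
    return bool(np.all(scene > grasp_depth + clearance))
```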