
    Vergence control system for stereo depth recovery

    This paper describes a vergence control algorithm for a 3D stereo recovery system. The work was developed within the framework of the ROBTET project, whose purpose is to design a teleoperated robotic system for live power-line maintenance. The tasks involved require the automatic calculation of paths for standard tasks, collision detection to avoid electrical shocks, force feedback, accurate visual data, and the generation of collision-free real paths. To accomplish these tasks the system needs an exact model of the environment, which is acquired through an active stereoscopic head. A cooperative algorithm combining vergence and stereo correlation is presented. The proposed system is implemented with an algorithm based on phase correlation that tries to keep the vergence on the object of interest. Sharp vergence changes produced by changes of the object of interest are controlled through an estimate of the depth generated by a stereo correspondence system. For some elements of the scene, those aligned with the epipolar plane, large errors arise in both the depth estimate and the phase correlation. To minimize these errors a laser lighting system is used to aid fixation, assuring adequate vergence and depth extraction. The work presented in this paper has been supported by the electric utility IBERDROLA, S.A. under project PIE No. 132.198.
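
A vergence cue of the kind described above can be sketched with standard FFT phase correlation, which recovers the translation between patches from the left and right cameras. This is a generic illustration, not the paper's implementation; the patch layout and names are assumptions.

```python
import numpy as np

def phase_correlation_shift(left_patch, right_patch):
    """Estimate the (dx, dy) shift of right_patch relative to left_patch
    by FFT phase correlation -- the kind of cue a vergence controller
    can use to keep both cameras fixated on the same target."""
    F_l = np.fft.fft2(left_patch)
    F_r = np.fft.fft2(right_patch)
    cross = F_r * np.conj(F_l)
    cross /= np.abs(cross) + 1e-12            # keep only the phase
    corr = np.fft.ifft2(cross).real           # impulse at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    if dy > h // 2:                           # map wrap-around to negative shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return dx, dy
```

For a fixating head, dx drives the vergence angle: a nonzero horizontal shift means the two optical axes do not intersect on the target.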

    From Image-based Motion Analysis to Free-Viewpoint Video

    The problems of capturing real-world scenes with cameras and automatically analyzing the visible motion have traditionally been the focus of computer vision research. The photo-realistic rendition of dynamic real-world scenes, on the other hand, is a problem that has been investigated in the field of computer graphics. In this thesis, we demonstrate that the joint solution of all three of these problems enables the creation of powerful new tools that are beneficial for both research disciplines. Analysis and rendition of real-world scenes with human actors are amongst the most challenging problems, and in this thesis we present new algorithmic recipes to attack them. The dissertation consists of three parts: In part I, we present novel solutions to two fundamental problems of human motion analysis. Firstly, we demonstrate a novel hybrid approach for marker-free human motion capture from multiple video streams. Thereafter, a new algorithm for the automatic, non-intrusive estimation of kinematic body models of arbitrary moving subjects from video is detailed. In part II of the thesis, we demonstrate that a marker-free motion capture approach makes possible the model-based reconstruction of free-viewpoint videos of human actors from only a handful of video streams. The estimated 3D videos enable the photo-realistic real-time rendition of a dynamic scene from arbitrary novel viewpoints. Texture information from video is not only applied to generate a realistic surface appearance, but also to improve the precision of the motion estimation scheme. The commitment to a generic body model also allows us to reconstruct a time-varying reflectance description of an actor's body surface, which in turn allows us to realistically render the free-viewpoint videos under arbitrary lighting conditions. A novel method to capture high-speed, large-scale motion using regular still cameras and the principle of multi-exposure photography is described in part III.
The fundamental principles underlying the methods in this thesis are not only applicable to humans but to a much larger class of subjects. It is demonstrated that, in conjunction, our proposed algorithmic recipes serve as building blocks for the next generation of immersive 3D visual media.

The development of new algorithms for the optical capture and analysis of motion in dynamic scenes is one of the central research topics in computer-based image processing. While computer vision focuses on the extraction of information, computer graphics concentrates on the inverse problem, the photo-realistic rendition of moving scenes. Recently the two disciplines have steadily converged, since there are many challenging scientific questions that demand a joint solution of the image acquisition, image analysis, and image synthesis problems. Two of the hardest problems, of great relevance to researchers from both disciplines, are the analysis and synthesis of dynamic scenes with humans at their center. This dissertation presents methods that allow such scenes to be optically captured, their motion to be automatically analyzed, and the result to be realistically re-rendered on a computer. It will become clear that integrating algorithms for these three problems into one overall system enables entirely new three-dimensional renditions of people in motion. The dissertation is organized in three parts: Part I begins with the design and construction of a studio for the time-synchronized acquisition of multiple video streams. The multi-video sequences recorded in the studio serve as input data for the video-based motion analysis methods and the 3D video algorithms developed in this dissertation. Two newly developed methods are then presented that answer two fundamental questions in the optical capture of human motion: the measurement of motion parameters and the generation of kinematic skeleton models. The first method is a hybrid algorithm for the marker-free optical measurement of motion parameters from multi-video data; dispensing with optical markers is made possible by using both volume models reconstructed from the image data and easily detectable body features for the motion analysis. The second method automatically reconstructs a kinematic skeleton model from multi-video data; the algorithm requires neither optical markers in the scene nor a priori information about the body structure, and it applies in the same form to humans, animals, and objects. The topic of part II of this work is a model-based method for reconstructing three-dimensional videos of people in motion from only a few time-synchronized video streams. The viewer can play back the computed 3D videos on a computer in real time and interactively take an arbitrary virtual viewpoint on the scene. At the core of our approach is a silhouette-based analysis-through-synthesis algorithm that captures both the shape and the motion of a person without optical markers. Computing time-varying surface textures from the video data guarantees that a person has a photo-realistic appearance from every viewpoint. A first algorithmic extension shows that the texture information can also be used to improve the accuracy of the motion estimation. Moreover, the use of a generic body model makes it possible to measure not only dynamic textures but even dynamic reflectance properties of the body surface. Our reflectance model consists of a parametric BRDF for each texel and a dynamic normal map for the entire body surface. In this way, 3D videos can be rendered realistically even under entirely new, simulated lighting conditions. Part III of this work describes a novel method for the optical measurement of very fast motion. Until now, optical recordings of high-speed motion have required very expensive special cameras with high frame rates. In contrast, the method described here uses ordinary digital still cameras and the principle of multi-exposure flash photography. It is shown that this approach can measure both the very fast articulated hand motion of the pitcher and the flight parameters of the ball during a baseball pitch. The precisely captured parameters make it possible to visualize the measured motion on a computer in entirely new ways. Although the methods presented in this dissertation primarily serve the analysis and rendition of human motion, the underlying principles are applicable to many other scenes. Each of the described algorithms primarily solves one particular subproblem, but taken together the methods can be understood as building blocks that will enable the next generation of interactive three-dimensional media.
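
Model-based pose estimation of the kind used for the free-viewpoint videos is commonly driven by a silhouette-overlap error between the rendered model and the observed images; a minimal sketch of such an error term, with boolean masks as an assumed input format rather than the thesis's actual pipeline:

```python
import numpy as np

def silhouette_error(rendered, observed):
    """Pixel-wise XOR count between a rendered model silhouette and an
    observed image silhouette. Analysis-through-synthesis pose
    estimation minimises this count over the model's pose parameters.
    Both inputs are boolean masks of equal shape."""
    return int(np.logical_xor(rendered, observed).sum())
```

In a full system, this error would be summed over all camera views and minimised per frame with respect to the body model's joint parameters.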

    Local Feature Selection and Global Energy Optimization in Stereo

    The human brain can fuse two slightly different views from the left and right eyes and perceive depth. This process of stereopsis entails identifying matching locations in the two images and recovering the depth from their disparity. This can be done only approximately: ambiguity arising from such factors as noise, periodicity, and large regions of constant
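
Once matching locations are found, recovering depth from their disparity reduces, for a rectified stereo pair, to the standard triangulation relation Z = f * B / d. A minimal sketch; the parameter names and units are illustrative:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulate depth for a rectified stereo pair: Z = f * B / d,
    with focal length in pixels, baseline in metres, disparity in
    pixels. Zero disparity corresponds to a point at infinity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

The relation also shows why matching ambiguity matters: any error in the estimated disparity propagates directly (and, for small disparities, greatly amplified) into the recovered depth.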

    One-shot multispectral color imaging with a stereo camera


    Flying Animal Inspired Behavior-Based Gap-Aiming Autonomous Flight with a Small Unmanned Rotorcraft in a Restricted Maneuverability Environment

    This dissertation research shows that a small unmanned rotorcraft with onboard processing and a vision sensor can produce autonomous, collision-free flight in a restricted maneuverability environment with no a priori knowledge by using a gap-aiming behavior inspired by flying animals. Current approaches to autonomous flight with small unmanned aerial systems (SUAS) concentrate on detecting and explicitly avoiding obstacles. In contrast, biology indicates that birds, bats, and insects do the opposite: they react to open spaces, or gaps in the environment, with a gap-aiming behavior. Using flying animals as inspiration, a behavior-based robotics approach is taken to implement and test their observed gap-aiming behavior in three dimensions. Because biological studies were unclear whether the flying animals were reacting to the largest gap perceived, the closest gap perceived, or all of the gaps, three approaches for the perceptual schema were explored in simulation: detect_closest_gap, detect_largest_gap, and detect_all_gaps. The result of these simulations was used in a proof-of-concept implementation on a 3DRobotics Solo quadrotor platform in an environment designed to represent the navigational difficulties found inside a restricted maneuverability environment. The motor schema is implemented with an artificial potential field to produce the action of aiming to the center of the gap. Through two sets of field trials totaling fifteen flights conducted with a small unmanned quadrotor, the gap-aiming behavior observed in flying animals is shown to produce repeatable autonomous, collision-free flight in a restricted maneuverability environment.
Additionally, using the distance from the starting location to perceived gaps, the horizontal and vertical distance traveled, and the distance from the center of the gap during traversal, it is shown respectively that the gap-selection approach performs as intended, that the motor schema produces the intended three-dimensional movement, and that the motor schema is accurate. This gap-aiming behavior provides the robotics community with the first known implementation of autonomous, collision-free flight on a small unmanned quadrotor without the explicit obstacle detection and avoidance seen in current implementations. Additionally, the testing environment, described by quantitative metrics, provides a benchmark for autonomous SUAS flight testing in confined environments. Finally, the success of the autonomous collision-free flight implementation on a small unmanned rotorcraft, field-tested in a restricted maneuverability environment, could have important societal impact in both the public and private sectors.
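
The motor schema described above, an attractive artificial potential field that aims the vehicle at the gap center, can be sketched as follows. The gain and speed cap are illustrative choices, not the dissertation's tuning:

```python
import numpy as np

def gap_aiming_velocity(position, gap_center, gain=1.0, max_speed=2.0):
    """Attractive potential-field motor schema: command a velocity
    toward the centre of the selected gap, saturated at max_speed.
    Positions are 3-D points in a common world frame."""
    position = np.asarray(position, dtype=float)
    gap_center = np.asarray(gap_center, dtype=float)
    v = gain * (gap_center - position)        # attract toward the gap centre
    speed = np.linalg.norm(v)
    if speed > max_speed:
        v *= max_speed / speed                # saturate the command
    return v
```

Because the field attracts toward free space rather than repelling from obstacles, no explicit obstacle map is needed; the perceptual schema only has to supply the selected gap's center.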

    Three dimensional reconstruction of the cell cytoskeleton from stereo images

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 1998. Includes bibliographical references (leaves 80-83). Besides its primary application to robot vision, stereo vision also appears promising in the biomedical field. This study examines 3D reconstruction of the cell cytoskeleton. This application of stereo vision to electron micrographs extracts information about the interior structure of cells at the nanometer scale. We propose two different types of stereo vision approaches: the line-segment and wavelet multiresolution methods. The former is a primitive-based and the latter a point-based approach. Structural information is stressed in both methods. Directional representation is employed to provide an ideal description for filament-type structures. In the line-segment method, line segments are first extracted from the directional representation, and matching is then conducted between the two line-segment sets of the stereo images. A new search algorithm, matrix matching, is proposed to determine the matching globally. In the wavelet multiresolution method, a pyramidal architecture is presented. Bottom-up analysis is first performed to form two pyramids, containing wavelet decompositions and directional representations. Subsequently, top-down matching is carried out: matching at a high level provides guidance and constraints for the matching at a lower level. Our reconstructed results reveal the 3D structure and the relationships of filaments which are otherwise hard to see in the original stereo images. The method is sufficiently robust and accurate to allow the automated analysis of cell structural characteristics from electron microscopy pairs, and it may also have application to a general class of stereo images. By Yuan Cheng, S.M.
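
The top-down idea in the pyramid method, where a match found at a coarse level constrains the search at the next finer level, can be illustrated with a toy 1-D coarse-to-fine shift estimator. This is a simplification under assumed inputs (circularly shifted signals, sum-of-squared-differences matching), not the thesis's wavelet method:

```python
import numpy as np

def coarse_to_fine_shift(a, b, levels=3, radius=2):
    """Estimate the integer circular shift of b relative to a by
    matching at the coarsest subsampled level first, then refining
    within a small search radius at each finer level."""
    def best_shift(a_l, b_l, center, radius):
        shifts = range(center - radius, center + radius + 1)
        # pick the shift minimising the sum of squared differences
        return min(shifts, key=lambda s: np.sum((np.roll(b_l, -s) - a_l) ** 2))

    shift = 0
    for lvl in reversed(range(levels)):       # coarsest level first
        step = 2 ** lvl
        shift = best_shift(a[::step], b[::step], shift, radius)
        if lvl:                               # propagate the estimate down
            shift *= 2
    return shift
```

The search radius stays constant per level while the effective range grows with the pyramid height, which is exactly what makes coarse-to-fine matching cheap compared with an exhaustive search at full resolution.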

    Automated 3D model generation for urban environments [online]

    In this thesis, we present a fast approach to the automated generation of textured 3D city models with both high detail at ground level and complete coverage for bird's-eye views. A ground-based facade model is acquired by driving a vehicle equipped with two 2D laser scanners and a digital camera under normal traffic conditions on public roads. One scanner is mounted horizontally and is used to determine the approximate component of relative motion along the movement of the acquisition vehicle via scan matching; the obtained relative motion estimates are concatenated to form an initial path. Assuming that features such as buildings are visible from both the ground-based and the airborne view, this initial path is globally corrected by Monte Carlo localization techniques using an aerial photograph or a Digital Surface Model as a global map. The second scanner is mounted vertically and is used to capture the 3D shape of the building facades. Applying a series of automated processing steps, a texture-mapped 3D facade model is reconstructed from the vertical laser scans and the camera images. To obtain an airborne model containing the roof and terrain shape complementary to the facade model, a Digital Surface Model is created from airborne laser scans, then triangulated, and finally texture-mapped with aerial imagery. Finally, the facade model and the airborne model are fused into a single model usable for both walk-throughs and fly-throughs. The developed algorithms are evaluated on a large data set acquired in downtown Berkeley, and the results are shown and discussed.
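
Concatenating the relative motion estimates from successive scan matches into an initial path amounts to standard 2-D pose composition; a minimal sketch, where the (dx, dy, dtheta) layout of each estimate is an assumption rather than the paper's data format:

```python
import math

def concatenate_motions(relative_motions, start=(0.0, 0.0, 0.0)):
    """Chain relative motion estimates from successive scan matches
    into a global path. Each estimate (dx, dy, dtheta) is expressed
    in the previous pose's frame; poses are (x, y, theta)."""
    x, y, th = start
    path = [(x, y, th)]
    for dx, dy, dth in relative_motions:
        # rotate the local displacement into the global frame, then add
        x += dx * math.cos(th) - dy * math.sin(th)
        y += dx * math.sin(th) + dy * math.cos(th)
        th += dth
        path.append((x, y, th))
    return path
```

Because each step's error is compounded into all later poses, such a dead-reckoned path drifts, which is why the abstract's global correction step against an aerial map is needed.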