
    Visibility computation through image generalization

    This dissertation introduces the image generalization paradigm for computing visibility. The paradigm is based on the observation that an image is a powerful tool for computing visibility: an image can be rendered efficiently with the support of graphics hardware, and each of the millions of pixels in the image reports a visible geometric primitive. However, the visibility solution computed by a conventional image is far from complete. A conventional image has a uniform sampling rate, which can miss visible geometric primitives with a small screen footprint. A conventional image can only find geometric primitives to which there is a direct line of sight from its center of projection (i.e. the eye); therefore, a conventional image cannot compute the set of geometric primitives that become visible as the viewpoint translates, or as time changes in a dynamic dataset. Finally, like any sample-based representation, a conventional image can only confirm that a geometric primitive is visible; it cannot confirm that a primitive is hidden, as that would require an infinite number of samples to confirm that the primitive is hidden at all of its points.

    The image generalization paradigm overcomes the visibility computation limitations of conventional images. The paradigm has three elements. (1) Sampling pattern generalization entails adding sampling locations to the image plane where needed to find visible geometric primitives with a small footprint. (2) Visibility sample generalization entails replacing the conventional scalar visibility sample with a higher-dimensional sample that records all geometric primitives visible at a sampling location as the viewpoint translates or as time changes in a dynamic dataset; the higher-dimensional visibility sample is computed exactly, by solving visibility event equations, and not through sampling. Another form of visibility sample generalization is to enhance a sample with its trajectory as the geometric primitive it samples moves in a dynamic dataset. (3) Ray geometry generalization redefines a camera ray as the set of 3D points that project to a given image location; this generalization supports rays that are not straight lines, and enables the design of cameras with non-linear rays that circumvent occluders to gather samples not visible from a reference viewpoint.

    The image generalization paradigm has been used to develop visibility algorithms for a variety of datasets, visibility parameter domains, and performance-accuracy tradeoff requirements. These include: an aggressive from-point visibility algorithm that guarantees finding all geometric primitives with a visible fragment, no matter how small the primitive's image footprint; an efficient and robust exact from-point visibility algorithm that iterates between a sample-based and a continuous visibility analysis of the image plane to quickly converge to the exact solution; a from-rectangle visibility algorithm that uses 2D visibility samples to compute a visible set that is exact under viewpoint translation; a flexible pinhole camera that enables local modulation of the sampling rate over the image plane according to an input importance map; an animated depth image that stores not only color and depth per pixel but also a compact representation of pixel sample trajectories; and a curved ray camera that seamlessly integrates multiple viewpoints into a multiperspective image without the viewpoint transition distortion artifacts of prior methods.
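
    A minimal sketch of the first element, sampling pattern generalization, in the spirit of the importance-map-driven flexible pinhole camera: extra jittered sampling locations are added where an importance map is high. The resolution, jitter scheme and per-pixel sample budget below are invented for illustration and are not the dissertation's algorithm.

        import numpy as np

        def generalized_sample_locations(importance, max_extra=4):
            """Place sampling locations on the image plane, adding jittered
            extra samples in pixels whose importance exceeds the mean."""
            h, w = importance.shape
            rng = np.random.default_rng(0)
            samples = []
            mean_imp = importance.mean()
            for y in range(h):
                for x in range(w):
                    samples.append((x + 0.5, y + 0.5))  # regular grid sample
                    # extra samples where the importance map is high
                    n_extra = int(max_extra * max(0.0, importance[y, x] - mean_imp))
                    for _ in range(n_extra):
                        samples.append((x + rng.random(), y + rng.random()))
            return np.array(samples)

        # toy importance map: a high-importance region in the image centre
        imp = np.zeros((64, 64))
        imp[24:40, 24:40] = 1.0
        print(generalized_sample_locations(imp).shape)  # more than 64*64 samples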

    Markerless deformation capture of hoverfly wings using multiple calibrated cameras

    This thesis introduces an algorithm for the automated deformation capture of hoverfly wings from multiple camera image sequences. The algorithm is capable of extracting dense surface measurements, without the aid of fiducial markers, over an arbitrary number of wingbeats of hovering flight, and requires limited manual initialisation. A novel motion prediction method, called the ‘normalised stroke model’, makes use of the similarity of adjacent wing strokes to predict wing keypoint locations, which are then iteratively refined in a stereo image registration procedure. Outlier removal, wing fitting and further refinement using independently reconstructed boundary points complete the algorithm. It was tested on two hovering data sets, as well as a challenging flight manoeuvre. By comparing the 3D positions of keypoints extracted from these surfaces with those resulting from manual identification, the accuracy of the algorithm is shown to approach that of a fully manual approach. In particular, half of the algorithm-extracted keypoints were within 0.17 mm of manually identified keypoints, approximately equal to the error of the manual identification process itself. This algorithm is unique among purely image-based flapping flight studies in the level of automation it achieves, and its generality would make it applicable to wing tracking of other insects.
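
    A toy sketch of the prediction step behind a normalised-stroke-style model, assuming keypoint trajectories from the previous stroke are indexed by normalised phase; the interpolation scheme and array layout are illustrative assumptions, not the thesis implementation.

        import numpy as np

        def predict_keypoints(prev_phases, prev_positions, phase):
            """Interpolate the previous stroke's keypoint trajectories at the
            current stroke's normalised phase to get an initial prediction,
            which a stereo registration step would then refine."""
            T, K, _ = prev_positions.shape
            flat = prev_positions.reshape(T, -1)
            pred = np.array([np.interp(phase, prev_phases, flat[:, j])
                             for j in range(flat.shape[1])])
            return pred.reshape(K, 3)

        # toy data: 2 keypoints observed at 5 phases of the previous stroke
        phases = np.linspace(0.0, 1.0, 5)
        traj = np.random.default_rng(1).normal(size=(5, 2, 3))
        print(predict_keypoints(phases, traj, 0.37))  # predicted (2, 3) array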

    From small to large baseline multiview stereo : dealing with blur, clutter and occlusions

    This thesis addresses the problem of reconstructing the three-dimensional (3D) digital model of a scene from a collection of two-dimensional (2D) images taken of it. To address this fundamental computer vision problem, we propose three algorithms; they are the main contributions of this thesis. First, we solve multiview stereo with the off-axis aperture camera. This system has a very small baseline, as images are captured from viewpoints close to each other. The key idea is to change the size or the 3D location of the aperture of the camera so as to extract selected portions of the scene. Our imaging model takes both defocus and stereo information into account and allows shape reconstruction and image restoration to be solved in one go. The off-axis aperture camera can be used in small-scale spaces where the camera motion is constrained by the surrounding environment, such as in 3D endoscopy. Second, to solve multiview stereo with a large baseline, we present a framework that poses the problem of recovering a 3D surface in the scene as a regularized minimal partition problem of a visibility function. The formulation is convex and hence guarantees that the solution converges to the global minimum. Our formulation is robust to view-varying extensive occlusions, clutter and image noise. At no stage during the estimation process does the method rely on the visual hull, 2D silhouettes, approximate depth maps, or knowing which views are dependent (i.e., overlapping) and which are independent (i.e., non-overlapping). Furthermore, the degenerate solution, the null surface, is not included as a global solution in this formulation. One limitation of this algorithm is that its computational complexity grows with the number of views combined simultaneously. To address this limitation, we propose a third formulation, in which the visibility functions are integrated within a narrow band around the estimated surface by assigning weights to each point along the optical rays. This thesis presents technical descriptions of each algorithm and detailed analyses showing how these algorithms improve existing reconstruction techniques.
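
    The narrow-band idea in the third formulation can be sketched roughly as follows: each point sampled along an optical ray receives a weight that is nonzero only near the current surface estimate. The triangular falloff and band width here are illustrative assumptions, not the thesis's actual weighting.

        import numpy as np

        def narrow_band_weights(depths_along_ray, surface_depth, band=0.05):
            """Weight samples along one optical ray so that only points inside
            a band around the current surface estimate contribute."""
            d = np.abs(depths_along_ray - surface_depth)
            w = np.clip(1.0 - d / band, 0.0, 1.0)  # triangular falloff in the band
            return w / (w.sum() + 1e-12)           # normalised per-ray weights

        samples = np.linspace(0.0, 2.0, 200)       # depths sampled along one ray
        w = narrow_band_weights(samples, surface_depth=1.2)
        print(w.argmax(), w.sum())                 # peak near depth 1.2, sums to 1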

    Object Recognition

    Vision-based object recognition tasks are familiar parts of our everyday activities, such as keeping our car in the correct lane while driving, and we perform them effortlessly in real time. In recent decades, with the advancement of computer technology, researchers and application developers have been trying to mimic this human capability of visual recognition. Such a capability would allow machines to free humans from boring or dangerous jobs.

    Realistic Visualization of Animated Virtual Cloth

    Photo-realistic rendering of real-world objects is a broad research area with applications in fields such as computer-generated films, entertainment and e-commerce. Within photo-realistic rendering, the rendering of cloth is a subarea that involves many important aspects, ranging from material surface reflection properties and macroscopic self-shadowing to animation sequence generation and compression. This thesis provides an introduction to the topic and a broad overview of related work, and then describes methods for handling the major aspects of cloth rendering.

    Material surface reflection properties play an important part in reproducing the look & feel of materials, that is, in making a material identifiable just by looking at it. The BTF (bidirectional texture function), a function of viewing and illumination direction, is an appropriate representation of reflection properties. It captures effects caused by the mesostructure of a surface, such as roughness, self-shadowing, occlusion, inter-reflections, subsurface scattering and color bleeding. Unfortunately, a BTF data set of a material consists of hundreds to thousands of images, which far exceeds the memory capacity of current personal computers. This work describes the first usable method to efficiently compress and decompress BTF data for rendering at interactive to real-time frame rates. It is based on PCA (principal component analysis) of the BTF data set. The method preserves the important visual aspects of the BTF, while the achieved compression rates allow several different data sets to be stored in the main memory of consumer hardware at high rendering quality.

    Correct handling of complex illumination conditions plays another key role in the realistic appearance of cloth. Therefore, an extension of the BTF compression and rendering algorithm is described that supports distant direct HDR (high-dynamic-range) illumination stored in environment maps. To further enhance the appearance, macroscopic self-shadowing has to be taken into account; for the visualization of folds and a life-like 3D impression, this kind of shadow is absolutely necessary. This work describes two methods to compute these shadows. The first is seamlessly integrated into the illumination part of the rendering algorithm and optimized for static meshes. The second handles dynamic objects, using hardware-accelerated occlusion queries for visibility determination. Despite its simplicity, the presented algorithm is fast and produces fewer artifacts than other methods, and it also incorporates changeable distant direct high-dynamic-range illumination.

    The human perception system is the main target of any computer graphics application and can therefore be treated as part of the rendering pipeline, so rendering can be optimized by analyzing human perception of certain visual aspects of the image. As part of this thesis, an experiment is introduced that evaluates human shadow perception to speed up shadow rendering, and optimization approaches are derived from it.

    Another subarea of cloth visualization is the animation of cloth and avatars for presentations. This work also describes two new methods for the automatic generation and compression of animation sequences.
    The first method, which generates completely new, customizable animation sequences, is based on finding similarities between animation frames of a given basis sequence. Identifying these similarities allows jumps within the basis sequence that can be used to generate endless new sequences. Transmission of animated 3D data over bandwidth-limited channels, such as extended networks or to less powerful clients, requires efficient compression schemes; the second method is therefore a geometry data compression scheme. Like the BTF compression, it uses PCA in combination with clustering algorithms to segment similarly moving parts of the animated objects, achieving high compression rates together with very exact reconstruction quality.
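
    Both compression schemes rest on PCA of a large data matrix (BTF images in one case, animated vertex positions in the other). A minimal sketch of PCA compression and reconstruction, leaving out the clustering and all BTF-specific packing:

        import numpy as np

        def pca_compress(data, k):
            """Keep the k principal components of the centred data matrix.
            Each row is one observation (a flattened BTF image, or the
            vertex positions of one animation frame)."""
            mean = data.mean(axis=0)
            centred = data - mean
            _, _, Vt = np.linalg.svd(centred, full_matrices=False)
            basis = Vt[:k]               # (k, D) principal directions
            coeffs = centred @ basis.T   # (N, k) per-observation weights
            return mean, basis, coeffs

        def pca_reconstruct(mean, basis, coeffs):
            return coeffs @ basis + mean

        # toy data: 100 observations of dimension 256
        data = np.random.default_rng(0).normal(size=(100, 256))
        mean, basis, coeffs = pca_compress(data, k=8)
        approx = pca_reconstruct(mean, basis, coeffs)
        print(np.linalg.norm(data - approx) / np.linalg.norm(data))  # residual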

    Kaleidoscopic imaging

    Kaleidoscopes have great potential in computational photography as a tool for redistributing light rays. In time-of-flight imaging, the concept of the kaleidoscope is also useful when reconstructing the geometry that causes multiple reflections. This work is a step towards opening new possibilities for the use of mirror systems, as well as towards making their use more practical. The focus of this work is the analysis of planar kaleidoscope systems to enable their practical applicability in 3D imaging tasks. We analyse important practical properties of mirror systems and develop a theoretical toolbox for dealing with planar kaleidoscopes. Based on this toolbox, we explore the use of planar kaleidoscopes for multi-view imaging and for the acquisition of 3D objects. Knowledge of the mirror positions is crucial for these multi-view applications and calls for suitable calibration methods. A related problem arises in time-of-flight imaging, where the often unwanted capture of multiple reflections likewise requires reconstructing the geometry of a mirror room. We therefore employ the developed tools to solve this problem using multiple observations of a single scene point.
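
    The basic operation behind treating a planar kaleidoscope as a multi-view system is reflection across a mirror plane: reflecting the camera centre across a mirror face yields a virtual viewpoint. A minimal sketch (the plane parametrisation below is an assumption for the example):

        import numpy as np

        def reflect_point(p, mirror_point, mirror_normal):
            """Reflect a 3D point across a planar mirror given by a point on
            the plane and its normal. Applied to the camera centre, this
            gives the virtual viewpoint contributed by one mirror face."""
            n = mirror_normal / np.linalg.norm(mirror_normal)
            d = np.dot(p - mirror_point, n)   # signed distance to the plane
            return p - 2.0 * d * n

        cam = np.array([0.0, 0.0, 0.0])
        # mirror plane x = 1 with normal pointing back toward the camera
        virtual_cam = reflect_point(cam, np.array([1.0, 0.0, 0.0]),
                                    np.array([-1.0, 0.0, 0.0]))
        print(virtual_cam)  # [2. 0. 0.]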

    Map-Based Localization for Unmanned Aerial Vehicle Navigation

    Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied or GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined and contain many moving obstacles. There are many solutions for localization in GNSS-denied environments, using many different technologies. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map; the amount of data and the time spent collecting it are thus reduced, as there is no need to re-observe the same areas multiple times. This dissertation proposes a solution to the task of fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of UAV navigation in structured environments. Incremental map-based localization involves tracking a map through an image sequence. When the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions advocate that tracking geometry is more robust than tracking image texture, because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small, simple objects in small environments. An assessment was performed of tracking larger, more complex building models in larger environments. A state-of-the-art model-based tracker called ViSP (Visual Servoing Platform) was applied to tracking outdoor and indoor buildings using a UAV's low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the camera's field of view, and because of rapid camera motion. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations, with standard deviations around 10 metres. These errors were considered large, given that the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm. The first contribution of this dissertation increases the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show that sub-metre positional accuracies were achieved, while the number of tracking losses throughout the image sequence was reduced. It is shown that by integrating model-based tracking with SLAM, not only does SLAM improve model tracking performance, but the model-based tracker alleviates the computational expense of SLAM's loop-closing procedure to improve runtime performance.
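
    One generic way to combine two localization sources, shown purely to illustrate the kind of integration involved and not the dissertation's actual scheme, is inverse-covariance fusion of their position estimates:

        import numpy as np

        def fuse_positions(p_tracker, cov_tracker, p_slam, cov_slam):
            """Standard inverse-covariance (information) fusion of two 3D
            position estimates, e.g. one from a model-based tracker and one
            from SLAM; the lower-variance source gets the larger weight."""
            info_t = np.linalg.inv(cov_tracker)
            info_s = np.linalg.inv(cov_slam)
            cov = np.linalg.inv(info_t + info_s)
            p = cov @ (info_t @ p_tracker + info_s @ p_slam)
            return p, cov

        p, c = fuse_positions(np.array([1.0, 2.0, 3.0]), np.eye(3) * 0.25,
                              np.array([1.2, 2.1, 2.9]), np.eye(3) * 1.0)
        print(p)  # closer to the lower-variance tracker estimate
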
    Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates. The second contribution of this dissertation is a novel map-based incremental localization algorithm that improves tracking performance and increases pose estimation accuracy over ViSP. The novelty of this algorithm is an efficient matching process that identifies corresponding linear features between the UAV's RGB image data and a large, complex, untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and from an unattainable result using ViSP to 2 cm in large indoor environments. The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame, and initialization is often a manual process. The third contribution of this dissertation is a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method uses vertical line matching to register the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved, and a proposed enhancement of conventional geometric hashing produced more correct matches: 75% of the correct matches were identified, compared to 11%. Further, the number of incorrect matches was reduced by 80%.
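
    A toy sketch of the geometric hashing mechanism used for registration; the thesis matches vertical lines against model views, whereas this example uses 2D points and a similarity-invariant basis purely to show how the hash table and voting work:

        import numpy as np
        from collections import defaultdict
        from itertools import permutations

        def basis_coords(points, b0, b1):
            """Express points in the similarity-invariant frame defined by the
            ordered basis pair (b0, b1): b0 maps to (0,0), b1 to (1,0)."""
            v = b1 - b0
            rot = np.array([[v[0], v[1]], [-v[1], v[0]]]) / np.dot(v, v)
            return (points - b0) @ rot.T

        def build_table(model, q=0.05):
            """Hash every model point's coordinates under every basis pair."""
            table = defaultdict(list)
            for i, j in permutations(range(len(model)), 2):
                for u in basis_coords(model, model[i], model[j]):
                    table[tuple(np.round(u / q).astype(int))].append((i, j))
            return table

        def match(table, scene, q=0.05):
            """Vote for model basis pairs consistent with one scene basis pair."""
            votes = defaultdict(int)
            b0, b1 = scene[0], scene[1]        # in practice, try many pairs
            for u in basis_coords(scene, b0, b1):
                for basis in table[tuple(np.round(u / q).astype(int))]:
                    votes[basis] += 1
            return max(votes, key=votes.get) if votes else None

        model = np.array([[0., 0.], [1., 0.], [0.4, 0.8], [0.9, 0.5]])
        scene = model * 2.0 + np.array([3.0, 1.0])  # similarity-transformed copy
        print(match(build_table(model), scene))      # winning model basis pair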

    From surfaces to objects : Recognizing objects using surface information and object models.

    This thesis describes research on recognizing partially obscured objects using surface information like Marr's 2.5D sketch ([MAR82]) and surface-based geometrical object models. The goal of the recognition process is to produce fully instantiated object hypotheses, with either image evidence for each feature or an explanation for its absence in terms of self or external occlusion. The central point of the thesis is that using surface information should be an important part of the image understanding process. This is because surfaces are the features that directly link perception to the objects perceived (for normal "camera-like" sensing) and because surfaces make explicit the information needed to understand and cope with some visual problems (e.g. obscured features). Further, because surfaces are both the data and the model primitive, detailed recognition can be made both simpler and more complete.

    Recognition input is a surface image, which represents surface orientation and absolute depth. Segmentation criteria are proposed for forming surface patches of constant curvature character, based on surface shape discontinuities, which become labeled segmentation boundaries. Partially obscured object surfaces are reconstructed using stronger surface-based constraints. Surfaces are grouped to form surface clusters, which are 3D identity-independent solids that often correspond to model primitives. These are used here as a context within which to select models and find all object features. True three-dimensional properties of image boundaries, surfaces and surface clusters are estimated directly from the surface data.

    Models are invoked using a network formulation, where individual nodes represent potential identities for image structures. The links between nodes are defined by generic and structural relationships, and they define indirect evidence relationships for an identity. Direct evidence for the identities comes from the data properties. A plausibility computation is defined according to the constraints inherent in the evidence types. When a node acquires sufficient plausibility, the model is invoked for the corresponding image structure.

    Objects are primarily represented using a surface-based geometrical model. Assemblies are formed from subassemblies and surface primitives, which are defined using surface shape and boundaries. Variable affixments between assemblies allow flexibly connected objects. The initial object reference frame is estimated from model-data surface relationships, using correspondences suggested by invocation. With the reference frame, back-facing, tangential, partially self-obscured, totally self-obscured and fully visible image features are deduced. From these, the oriented model is used to find evidence for missing visible model features. If no evidence is found, the program attempts to find evidence that the features are obscured by an unrelated object. Structured objects are constructed using a hierarchical synthesis process. Fully completed hypotheses are verified using both existence and identity constraints based on surface evidence.

    Each of these processes is defined by its computational constraints and is demonstrated on two test images. These test scenes are interesting because they contain partially and fully obscured object features, a variety of surface and solid types, and flexibly connected objects. All modeled objects were fully identified, analyzed to the level represented in their models, and acceptably spatially located.
    Portions of this work have been reported elsewhere ([FIS83], [FIS85a], [FIS85b], [FIS86]) by the author.
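
    A toy sketch of plausibility-driven invocation, with invented nodes, link weights and update rule (the thesis defines its own plausibility computation from the evidence-type constraints):

        # Each node is a candidate identity for an image structure; direct
        # evidence comes from data properties, indirect evidence flows over
        # generic and structural links between nodes.
        links = {                        # node -> [(neighbour, weight)]
            "cup":      [("handle", 0.5), ("cylinder", 0.3)],
            "handle":   [("cup", 0.4)],
            "cylinder": [("cup", 0.2)],
        }
        direct = {"cup": 0.3, "handle": 0.8, "cylinder": 0.6}  # data evidence

        plaus = dict(direct)
        for _ in range(10):              # relax plausibilities to a fixed point
            plaus = {n: min(1.0, direct[n] +
                            sum(w * plaus[m] for m, w in links[n]))
                     for n in plaus}

        THRESHOLD = 0.9
        invoked = [n for n, p in plaus.items() if p >= THRESHOLD]
        print(plaus, invoked)            # models to invoke for this structure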

    Intelligent gripper design and application for automated part recognition and gripping

    Intelligent gripping may be achieved through gripper design, automated part recognition, intelligent algorithms for controlling the gripper, and on-line decision-making based on sensory data. A generic framework that integrates sensory data, part recognition, decision-making and gripper control to achieve intelligent gripping, based on an ABB industrial robot, is constructed. The three-fingered gripper designed and developed in this project, actuated by a linear servo actuator for precise speed and position control, is capable of handling a large variety of objects. Generic algorithms for intelligent part recognition are developed: edge vector representation is discussed and object geometric features are extracted. Fuzzy logic is successfully utilized to enhance the intelligence of the system, and the generic fuzzy logic algorithm, which may also find application in other fields, is presented. A model-based grasp planning algorithm is proposed, which extracts object grasp features from geometric features and reasons out a grasp model for objects of different geometry. Manipulator trajectory planning solves the problem of generating robot programs automatically. Object-oriented programming based on Visual C++ MFC is used to build the system software, ensuring compatibility, expandability and modular design. A hierarchical architecture for intelligent gripping is discussed, which partitions the robot's functionalities into high-level layers (modeling, recognition, planning and perception) and low-level layers (sensing, interfacing and execution). The individual system modules are integrated seamlessly to constitute the intelligent gripping system.
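
    A small illustrative fuzzy-logic fragment, with invented membership functions and rule base (the thesis presents its own generic fuzzy algorithm), mapping a recognised part width to a gripper closing-speed command:

        def tri(x, a, b, c):
            """Triangular membership function peaking at b."""
            if x <= a or x >= c:
                return 0.0
            return (x - a) / (b - a) if x < b else (c - x) / (c - b)

        def closing_speed(width_mm):
            # fuzzify the measured part width
            small  = tri(width_mm, 0, 10, 40)
            medium = tri(width_mm, 20, 50, 80)
            large  = tri(width_mm, 60, 100, 140)
            # rules: small -> slow (5 mm/s), medium -> 15, large -> 30
            num = small * 5 + medium * 15 + large * 30
            den = small + medium + large
            return num / den if den else 0.0  # weighted-average defuzzification

        print(closing_speed(35.0))  # a width between "small" and "medium"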

    Seventh Biennial Report : June 2003 - March 2005
