    Automating joiners

    Pictures taken from different viewpoints cannot be stitched into a geometrically consistent mosaic unless the structure of the scene is very special. However, geometrical consistency is not the only criterion for success: incorporating multiple viewpoints into the same picture may produce compelling and informative representations. A multi-viewpoint form of visual expression that has recently become highly popular is that of joiners (a term coined by artist David Hockney). Joiners are compositions where photographs are layered on a 2D canvas, with some photographs occluding others and boundaries fully visible. Composing joiners is currently a tedious manual process, especially when a large number of photographs is involved. We are thus interested in automating their construction. Our approach is based on optimizing a cost function that encourages image-to-image consistency, measured on point features and along picture boundaries. The optimization looks for consistency in the 2D composition rather than 3D geometrical scene consistency, and it explicitly considers occlusion between pictures. We illustrate our ideas with a number of experiments on collections of images of objects, people, and outdoor scenes.
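    As a rough illustration of the kind of objective involved (a sketch under assumed inputs, not the authors' implementation), the point-feature term of such a cost might look as follows in Python; the boundary term and the occlusion handling are omitted, and the feature matches are assumed to be given:

        import numpy as np
        from scipy.optimize import minimize

        def place(params, pts):
            # Apply a 2D similarity transform: scale s, angle a, translation (tx, ty).
            s, a, tx, ty = params
            R = np.array([[np.cos(a), -np.sin(a)],
                          [np.sin(a),  np.cos(a)]])
            return s * pts @ R.T + np.array([tx, ty])

        def joiner_cost(x, matches, n_images):
            # Image 0 stays fixed at the identity to anchor the canvas and to
            # avoid the trivial solution where all photographs shrink to a point.
            params = np.vstack([[1.0, 0.0, 0.0, 0.0],
                                x.reshape(n_images - 1, 4)])
            # Sum of squared distances between matched point features once every
            # photograph has been placed on the shared 2D canvas.
            return sum(np.sum((place(params[i], pi) - place(params[j], pj)) ** 2)
                       for i, j, pi, pj in matches)

        # Toy usage: two images sharing two matched features, offset by (1, 0).
        matches = [(0, 1,
                    np.array([[0.0, 0.0], [1.0, 0.0]]),
                    np.array([[1.0, 0.0], [2.0, 0.0]]))]
        x0 = np.array([1.0, 0.0, 0.0, 0.0])      # identity guess for image 1
        result = minimize(joiner_cost, x0, args=(matches, 2))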

    Visibility computation through image generalization

    This dissertation introduces the image generalization paradigm for computing visibility. The paradigm is based on the observation that an image is a powerful tool for computing visibility. An image can be rendered efficiently with the support of graphics hardware, and each of the millions of pixels in the image reports a visible geometric primitive. However, the visibility solution computed by a conventional image is far from complete. A conventional image has a uniform sampling rate, which can miss visible geometric primitives with a small screen footprint. A conventional image can only find geometric primitives to which there is direct line of sight from its center of projection (i.e. the eye); therefore, a conventional image cannot compute the set of geometric primitives that become visible as the viewpoint translates, or as time changes in a dynamic dataset. Finally, like any sample-based representation, a conventional image can only confirm that a geometric primitive is visible; it cannot confirm that a geometric primitive is hidden, as that would require an infinite number of samples to confirm that the primitive is hidden at all of its points.

    The image generalization paradigm overcomes these limitations of conventional images. The paradigm has three elements. (1) Sampling pattern generalization entails adding sampling locations to the image plane where needed to find visible geometric primitives with a small footprint. (2) Visibility sample generalization entails replacing the conventional scalar visibility sample with a higher-dimensional sample that records all geometric primitives visible at a sampling location as the viewpoint translates or as time changes in a dynamic dataset; the higher-dimensional visibility sample is computed exactly, by solving visibility event equations, and not through sampling. Another form of visibility sample generalization is to enhance a sample with its trajectory as the geometric primitive it samples moves in a dynamic dataset. (3) Ray geometry generalization redefines a camera ray as the set of 3D points that project to a given image location; this generalization supports rays that are not straight lines, and it enables designing cameras with non-linear rays that circumvent occluders to gather samples not visible from a reference viewpoint.

    The image generalization paradigm has been used to develop visibility algorithms for a variety of datasets, visibility parameter domains, and performance-accuracy tradeoff requirements. These include an aggressive from-point visibility algorithm that guarantees finding all geometric primitives with a visible fragment, no matter how small the primitive's image footprint; an efficient and robust exact from-point visibility algorithm that iterates between a sample-based and a continuous visibility analysis of the image plane to quickly converge to the exact solution; a from-rectangle visibility algorithm that uses 2D visibility samples to compute a visible set that is exact under viewpoint translation; a flexible pinhole camera that enables local modulation of the sampling rate over the image plane according to an input importance map; an animated depth image that stores not only color and depth per pixel but also a compact representation of pixel sample trajectories; and a curved ray camera that seamlessly integrates multiple viewpoints into a multiperspective image without the viewpoint transition distortion artifacts of prior methods.
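    To make element (3) concrete, here is a minimal Python sketch of a camera whose rays are piecewise-linear rather than straight; the class name, the single-bend geometry, and all parameters are illustrative assumptions, not the dissertation's actual curved ray camera:

        import numpy as np

        class BentRayCamera:
            # A ray is generalized from a straight line to any parametric set of
            # 3D points that projects to one image location; here each ray bends
            # once at a fixed depth, drifting laterally to look past an occluder.
            def __init__(self, eye, bend_depth, bend_offset):
                self.eye = np.asarray(eye, float)
                self.bend_depth = bend_depth                       # depth where the ray changes direction
                self.bend_offset = np.asarray(bend_offset, float)  # lateral drift applied past the bend

            def ray_points(self, direction, ts):
                # Sample 3D points along the piecewise-linear ray at parameters ts.
                d = np.asarray(direction, float)
                d = d / np.linalg.norm(d)
                pts = []
                for t in ts:
                    p = self.eye + t * d
                    if t > self.bend_depth:
                        p = p + (t - self.bend_depth) * self.bend_offset
                    pts.append(p)
                return np.array(pts)

        # Toy usage: a ray down +z that drifts in +x after depth 2.
        cam = BentRayCamera(eye=[0, 0, 0], bend_depth=2.0, bend_offset=[0.5, 0, 0])
        print(cam.ray_points([0, 0, 1], ts=np.linspace(0, 4, 5)))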

    Teaching Unknown Objects by Leveraging Human Gaze and Augmented Reality in Human-Robot Interaction

    Robots are becoming increasingly popular in a wide range of environments due to their exceptional work capacity, precision, efficiency, and scalability. This development has been further encouraged by advances in Artificial Intelligence (AI), particularly Machine Learning (ML). By employing sophisticated neural networks, robots are given the ability to detect and interact with objects in their vicinity. However, a significant drawback arises from the underlying dependency of these object detection models on extensive datasets and the availability of substantial amounts of training data.

    This issue becomes particularly problematic when the specific deployment location of the robot and its surroundings, including the objects within them, are not known in advance. The vast and ever-expanding array of objects makes it virtually impossible to comprehensively cover the entire spectrum of existing objects using preexisting datasets alone. The goal of this dissertation was to teach a robot unknown objects in the context of Human-Robot Interaction (HRI), in order to liberate it from its data dependency and from predefined scenarios. In this context, the combination of eye tracking and Augmented Reality (AR) created a powerful synergy that empowered the human teacher to communicate with the robot seamlessly and to point out objects by means of human gaze. This holistic approach led to the development of a multimodal HRI system that enabled the robot to identify and visually segment the Objects of Interest (OOIs) in three-dimensional space, even though they were initially unknown to it, and then examine them autonomously from different angles. Through the class information provided by the human, the robot was able to learn the objects and redetect them at a later stage. With the knowledge gained from this HRI-based teaching process, the robot's object detection capabilities exhibited performance comparable to state-of-the-art object detectors trained on extensive datasets, without being restricted to predefined classes, showcasing its versatility and adaptability. The research conducted within the scope of this dissertation made significant contributions at the intersection of ML, AR, eye tracking, and robotics. These findings not only enhance the understanding of these fields, but also pave the way for further interdisciplinary research. The scientific articles included in this dissertation have been published at high-impact conferences in the fields of robotics, eye tracking, and HRI.
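    Purely as a hypothetical sketch of the gaze-based pointing step (the function name, thresholds, and the naive region growing are assumptions, not the system described in the dissertation), selecting an unknown object from a point cloud via a gaze ray might look like this:

        import numpy as np

        def select_object_by_gaze(cloud, gaze_origin, gaze_dir,
                                  hit_radius=0.02, grow_radius=0.05):
            # Return indices of cloud points forming the gazed-at object.
            gaze_origin = np.asarray(gaze_origin, float)
            gaze_dir = np.asarray(gaze_dir, float)
            gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
            rel = cloud - gaze_origin
            proj = rel @ gaze_dir                       # distance along the ray
            dists = np.linalg.norm(rel - np.outer(proj, gaze_dir), axis=1)
            hits = np.where((dists < hit_radius) & (proj > 0))[0]
            if hits.size == 0:
                return hits                             # gaze ray missed the cloud
            seed = cloud[hits[np.argmin(proj[hits])]]   # first point the ray meets
            # Naive region growing: everything near the seed counts as the object.
            return np.where(np.linalg.norm(cloud - seed, axis=1) < grow_radius)[0]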

    Path finding on a spherical self-organizing map using distance transformations

    Spatialization methods create visualizations that allow users to analyze high-dimensional data in an intuitive manner and facilitate the extraction of meaningful information. Just as geographic maps are simplified representations of geographic spaces, these visualizations are essentially maps of abstract data spaces that are created through dimensionality reduction. While we are familiar with geographic maps for path planning/finding applications, research into using maps of high-dimensional spaces for such purposes has received little attention. However, the literature has shown that it is possible to use these maps to track temporal and state changes within a high-dimensional space. A popular dimensionality reduction method that produces a mapping for these purposes is the Self-Organizing Map. By using its topology-preserving capabilities with a colour-based visualization method known as the U-Matrix, state transitions can be visualized as trajectories on the resulting mapping. Through these trajectories, one can gather information on the transition path between two points in the original high-dimensional state space. This raises the interesting question of whether or not the Self-Organizing Map can be used to discover the transition path between two points in an n-dimensional space. In this thesis, we use a spherically structured Self-Organizing Map called the Geodesic Self-Organizing Map for dimensionality reduction and the creation of a topological mapping that approximates the n-dimensional space. We first present an intuitive method for a user to navigate the surface of the Geodesic SOM. A new application of the distance transformation algorithm is then proposed to compute the path between two points on the surface of the SOM, which correspond to two points in the data space. We then discuss how this application could be improved using some form of surface shape analysis. The new approach presented in this thesis is then evaluated by analyzing the results of using the Geodesic SOM for manifold embedding and by carrying out data analyses using carbon dioxide emissions data.
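    As a minimal sketch of the underlying idea (using Dijkstra's algorithm as a graph analogue of the distance transformation; the SOM's neighbour graph and its edge weights, e.g. U-Matrix distances, are assumed inputs), path finding over the map could proceed as follows:

        import heapq

        def distance_transform(neighbours, weights, source):
            # Propagate distances from the source over the SOM's neighbour graph;
            # neighbours[i] lists adjacent nodes, and weights[(i, j)] == weights[(j, i)]
            # is the inter-node (e.g. U-Matrix) distance.
            dist = {source: 0.0}
            heap = [(0.0, source)]
            while heap:
                d, i = heapq.heappop(heap)
                if d > dist.get(i, float("inf")):
                    continue                      # stale heap entry
                for j in neighbours[i]:
                    nd = d + weights[(i, j)]
                    if nd < dist.get(j, float("inf")):
                        dist[j] = nd
                        heapq.heappush(heap, (nd, j))
            return dist

        def shortest_path(neighbours, weights, source, target):
            # Backtrack through the distance field: from the target, repeatedly
            # step to the neighbour realizing dist[v] = min_j dist[j] + w(j, v).
            dist = distance_transform(neighbours, weights, source)
            path = [target]
            while path[-1] != source:
                v = path[-1]
                path.append(min(neighbours[v],
                                key=lambda j: dist.get(j, float("inf")) + weights[(j, v)]))
            return path[::-1]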

    Progressive Refinement Imaging

    This paper presents a novel technique for progressive online integration of uncalibrated image sequences with substantial geometric and/or photometric discrepancies into a single, geometrically and photometrically consistent image. Our approach can handle large sets of images, acquired from a nearly planar or infinitely distant scene at different resolutions in object domain and under variable local or global illumination conditions. It allows for efficient user guidance as its progressive nature provides a valid and consistent reconstruction at any moment during the online refinement process.

    Our approach avoids global optimization techniques, as commonly used in the field of image refinement, and progressively incorporates new imagery into a dynamically extendable and memory-efficient Laplacian pyramid. Our image registration process includes a coarse homography and a local refinement stage using optical flow. Photometric consistency is achieved by retaining the photometric intensities given in a reference image while it is being refined. Globally blurred imagery and local geometric inconsistencies due to, e.g., motion are detected and removed prior to image fusion.

    We demonstrate the quality and robustness of our approach using several image and video sequences, including handheld acquisition with mobile phones and zooming sequences with consumer cameras.
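    A minimal sketch of the pyramid-based fusion step (grayscale only; the paper's dynamically extendable pyramid, registration stages, and consistency checks are omitted, and the blending weight is an assumption):

        import numpy as np
        from scipy.ndimage import gaussian_filter, zoom

        def laplacian_pyramid(img, levels):
            # Decompose a grayscale image into band-pass levels plus a low-pass residual.
            pyr, cur = [], img.astype(float)
            for _ in range(levels):
                low = gaussian_filter(cur, sigma=1.0)
                down = low[::2, ::2]
                up = zoom(down, 2, order=1)[:cur.shape[0], :cur.shape[1]]
                pyr.append(cur - up)              # detail band at this level
                cur = down
            pyr.append(cur)                       # low-pass residual
            return pyr

        def fuse(pyr_ref, pyr_new, alpha=0.5):
            # Blend a newly registered image into the existing pyramid, level by level.
            return [(1 - alpha) * a + alpha * b for a, b in zip(pyr_ref, pyr_new)]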

    Techno-voyeurism into a (performing) body


    TimeBender: Interactive Authoring of 3D Space-Time Narratives

    Communication of scientific results and discoveries to, for example, fellow domain experts, business partners, students, or the general public is an important part of research. Communication through visualization has been proven to be effective when the representations are memorable and engaging, and research has shown that these communicative visualizations can be further enhanced with narratives for certain audiences. A challenge faced by scientists is to create memorable and engaging visualizations for communication, a task traditionally done by trained illustrators and designers. Therefore, we created TimeBender, a framework and prototype implementation to bridge this gap, specifically for authoring narrative posters in a 3D environment with a space-time dimension. The posters feature multiple scenes forming the narrative, which are connected by an elongated object encoding the narrative flow. We demonstrate that our approach is capable of aiding the authoring of these posters through a 3-step pipeline: first, the scenes are set up individually; then, the global layout of scenes in the poster space is determined; finally, details, such as textual elements, are added. TimeBender supports animation, as each scene is rendered dynamically within the poster. The framework and example results were evaluated in an expert interview with a professional illustrator.
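    Purely as an illustration of that 3-step pipeline (the class and method names below are assumptions, not TimeBender's actual API), the authoring flow can be summarized as:

        from dataclasses import dataclass, field

        @dataclass
        class Scene:
            name: str
            position: tuple = (0.0, 0.0)                     # poster-space placement (step 2)
            annotations: list = field(default_factory=list)  # textual details (step 3)

        @dataclass
        class Poster:
            scenes: list = field(default_factory=list)

            def add_scene(self, name):
                # Step 1: set up each scene individually.
                self.scenes.append(Scene(name))

            def layout(self, spacing=1.0):
                # Step 2: determine the global layout; a naive linear flow stands
                # in for the elongated object that connects the scenes.
                for i, s in enumerate(self.scenes):
                    s.position = (i * spacing, 0.0)

            def annotate(self, name, text):
                # Step 3: add details such as textual elements.
                next(s for s in self.scenes if s.name == name).annotations.append(text)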

    Culture as education: From transmediality to transdisciplinary pedagogy

    For the past three years, the Transmedia Research Group at the Department of Semiotics, University of Tartu, has been developing open-access online materials to support the teaching of humanities-related subjects in Estonian- and Russian-language secondary schools. This paper maps the theoretical and conceptual starting points of these materials. The overarching goal of the educational platforms is to support cultural coherence and autocommunication by cultivating the literacies necessary for holding meaningful dialogues with cultural heritage. To achieve this goal, the authors have been seeking ways of purposefully harnessing transmedial, crossmedial, and other tools offered by the contemporary digital communication space. We have started with an understanding of culture as education – a model which is grounded in cultural semiotics and highlights the role of cultural experience and cultural self-description in learning literacies. From these premises we proceed to explicating the value of a transdisciplinary pedagogy for the methodical translation of the theoretical concepts into practical solutions in teaching and learning culture.