258 research outputs found

    GazeDirector: Fully articulated eye gaze redirection in video

    We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting. Our method first tracks the eyes by fitting a multi-part eye region model to video frames using analysis-by-synthesis, thereby recovering eye region shape, texture, pose, and gaze simultaneously. It then redirects gaze by 1) warping the eyelids from the original image using a model-derived flow field, and 2) rendering and compositing synthesized 3D eyeballs onto the output image in a photorealistic manner. GazeDirector allows us to change where people are looking without person-specific training data, and with full articulation, i.e. we can precisely specify new gaze directions in 3D. Quantitatively, we evaluate both model-fitting and gaze synthesis, with experiments for gaze estimation and redirection on the Columbia gaze dataset. Qualitatively, we compare GazeDirector against recent work on gaze redirection, showing better results especially for large redirection angles. Finally, we demonstrate gaze redirection on YouTube videos by introducing new 3D gaze targets and by manipulating visual behavior

    Function-based Intersubject Alignment of Human Cortical Anatomy

    Making conclusions about the functional neuroanatomical organization of the human brain requires methods for relating the functional anatomy of an individual's brain to population variability. We have developed a method for aligning the functional neuroanatomy of individual brains based on the patterns of neural activity that are elicited by viewing a movie. Instead of basing alignment on functionally defined areas, whose location is defined as the center of mass or the local maximum response, the alignment is based on patterns of response as they are distributed spatially both within and across cortical areas. The method is implemented in the two-dimensional manifold of an inflated, spherical cortical surface. The method, although developed using movie data, generalizes successfully to data obtained with another cognitive activation paradigm—viewing static images of objects and faces—and improves group statistics in that experiment as measured by a standard general linear model (GLM) analysis

    AFFECT-PRESERVING VISUAL PRIVACY PROTECTION

    The prevalence of wireless networks and the convenience of mobile cameras enable many new video applications beyond security and entertainment. From behavioral diagnosis to wellness monitoring, cameras are increasingly used for observation in various educational and medical settings. Videos collected for such applications are considered protected health information under privacy laws in many countries. Visual privacy protection techniques, such as blurring or object removal, can be used to mitigate privacy concerns, but they also obliterate important visual cues of affect and social behaviors that are crucial for the target applications. In this dissertation, we propose to balance privacy protection and the utility of the data by preserving privacy-insensitive information, such as pose and expression, which is useful in many applications involving visual understanding. The Intellectual Merits of the dissertation include a novel framework for visual privacy protection by manipulating the facial image and body shape of individuals, which: (1) conceals the identity of individuals; (2) provides a way to preserve the utility of the data, such as expression and pose information; and (3) balances the utility of the data against the capacity of the privacy protection. The Broader Impacts of the dissertation focus on the significance of privacy protection for visual data, and the inadequacy of current privacy-enhancing technologies in preserving affect and behavioral attributes of the visual content, which are highly useful for behavior observation in educational and medical settings. The work in this dissertation represents one of the first attempts to achieve both goals simultaneously.

    SELF-IMAGE MULTIMEDIA TECHNOLOGIES FOR FEEDFORWARD OBSERVATIONAL LEARNING

    This dissertation investigates the development and use of self-images in augmented reality systems for learning and learning-based activities. The work focuses on self-modeling, a particular form of learning that is actively employed in various settings for therapy or teaching. In particular, it aims to develop novel multimedia systems that support the display and rendering of augmented self-images, and to use interactivity (via games) as a means of obtaining imagery for creating those augmented self-images. Two multimedia systems are developed, discussed, and analyzed. The proposed systems are validated in terms of their technical innovation and their clinical efficacy in delivering behavioral interventions for young children on the autism spectrum.

    Koehenkilöiden suorituskykymittaukset: kuvataajuuden kasvattaminen ja latenssin kompensointi käyttäen kuvapohjaista renderöintia

    Traditionally in computer graphics, complex 3D scenes are represented as a collection of more primitive geometric surfaces. The geometric representation is then rendered into a 2D raster image suitable for display devices. Image-based rendering is an interesting addition to geometry-based rendering. Its performance is constrained only by display resolution, and not by scene geometry complexity or shader complexity. When used together with a geometry-based renderer, an image-based renderer can extrapolate additional frames into an animation sequence based on geometrically rendered frames. Existing research into image-based rendering methods is investigated in the context of interactive computer graphics. An image-based renderer is also implemented to run on a modern GPU shader architecture. Finally, it is used in a first-person shooter game experiment to measure task performance when using frame rate upconversion. Image-based rendering is found to be promising for frame rate upconversion as well as for latency compensation. An implementation of an image-based renderer is found feasible on modern GPUs. The experiment results show considerable improvement in test subject hit rates when using frame rate upconversion with latency compensation.
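As an illustration of how an image-based renderer can extrapolate a frame from a geometrically rendered one, here is a minimal sketch. It is not the renderer described in the abstract: it assumes a purely horizontal camera shift, a per-pixel depth buffer, and nearest-pixel forward splatting on the CPU. The function name `extrapolate_frame` and the parameter `shift_px_at_unit_depth` are hypothetical.

```python
def extrapolate_frame(color, depth, shift_px_at_unit_depth):
    """Forward-warp a rendered frame to a horizontally shifted camera.

    color and depth are lists of rows with identical dimensions.
    shift_px_at_unit_depth is an assumed parameter: the parallax, in
    pixels, of a point at depth 1.0.
    """
    height, width = len(color), len(color[0])
    warped = [[None] * width for _ in range(height)]          # None marks a hole
    nearest = [[float("inf")] * width for _ in range(height)]  # depth test buffer
    for y in range(height):
        for x in range(width):
            z = depth[y][x]
            # Parallax is inversely proportional to depth: near pixels move more.
            new_x = x + round(shift_px_at_unit_depth / z)
            # Keep the closest surface when several pixels land on one target.
            if 0 <= new_x < width and z < nearest[y][new_x]:
                nearest[y][new_x] = z
                warped[y][new_x] = color[y][x]
    return warped


# One-row toy frame: two near pixels (depth 1) and two far pixels (depth 4).
frame = extrapolate_frame([["a", "b", "c", "d"]], [[1.0, 1.0, 4.0, 4.0]], 1)
print(frame)
```

The near pixels shift by one column and overwrite a far pixel, while the leftmost column is left empty. Holes of this kind (disocclusions) are what a practical image-based renderer has to fill.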

    Survey of image-based representations and compression techniques

    In this paper, we survey the techniques for image-based rendering (IBR) and for compressing image-based representations. Unlike traditional three-dimensional (3-D) computer graphics, in which the 3-D geometry of the scene is known, IBR techniques render novel views directly from input images. IBR techniques can be classified into three categories according to how much geometric information is used: rendering without geometry, rendering with implicit geometry (i.e., correspondence), and rendering with explicit geometry (either approximate or accurate). We discuss the characteristics of these categories and their representative techniques. IBR techniques demonstrate a surprisingly diverse range in the extent to which they use images and geometry to represent 3-D scenes. We explore the issues in trading off the use of images and geometry by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies. Finally, we highlight compression techniques specifically designed for image-based representations. Such compression techniques are important in making IBR techniques practical.

    Computationally efficient deformable 3D object tracking with a monocular RGB camera

    182 p. Monocular RGB cameras are present in most scopes and devices, including embedded environments like robots, cars and home automation. Most of these environments have in common a significant presence of human operators with whom the system has to interact. This context provides the motivation to use the captured monocular images to improve the understanding of the operator and the surrounding scene for more accurate results and applications. However, monocular images do not have depth information, which is a crucial element in understanding the 3D scene correctly. Estimating the three-dimensional information of an object in the scene using a single two-dimensional image is already a challenge. The challenge grows if the object is deformable (e.g., a human body or a human face) and there is a need to track its movements and interactions in the scene. Several methods attempt to solve this task, including modern regression methods based on deep neural networks. However, despite their strong results, most are computationally demanding and therefore unsuitable for several environments. Computational efficiency is a critical feature for computationally constrained setups like the embedded or onboard systems present in robotics and automotive applications, among others. This study proposes computationally efficient methodologies to reconstruct and track three-dimensional deformable objects, such as human faces and human bodies, using a single monocular RGB camera. To model the deformability of faces and bodies, it considers two types of deformations: non-rigid deformations for face tracking, and rigid multi-body deformations for body pose tracking. Furthermore, it studies their performance on computationally restricted devices like smartphones and the onboard systems used in the automotive industry. The information extracted from such devices gives valuable insight into human behaviour, a crucial element in improving human-machine interaction. We tested the proposed approaches in different challenging application fields like onboard driver monitoring systems, human behaviour analysis from monocular videos, and human face tracking on embedded devices.

    Iterative Solvers for Physics-based Simulations and Displays

    Realistic computer-generated images and simulations require complex models to properly capture the many subtle behaviors of each physical phenomenon. The mathematical equations underlying these models are complicated and cannot be solved analytically. Numerical procedures must thus be used to obtain approximate solutions. These procedures are often iterative algorithms, in which an initial guess is progressively improved to converge to a desired solution. Iterative methods are a convenient and efficient way to compute solutions to complex systems, and are at the core of most modern simulation methods. In this thesis by publication, we present three papers in which iterative algorithms play a major role in a simulation or rendering method. First, we propose a method to improve the visual quality of fluid simulations. By creating a high-resolution surface representation around an input fluid simulation, stabilized with iterative methods, we introduce additional details on top of the simulation. Second, we describe a method to compute fluid simulations using model reduction. We design a novel vector field basis to represent fluid velocity, creating a method specifically tailored to improve all iterative components of the simulation. Finally, we present an algorithm to compute high-quality images for multifocal displays in a virtual reality context. Displaying images on multiple display layers incurs significant additional costs, but we formulate the image decomposition problem so as to allow an efficient solution using a simple iterative algorithm.
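To make the idea of an iterative algorithm concrete (an initial guess that is progressively improved until it converges), here is a minimal sketch using Jacobi iteration on a small linear system. The choice of Jacobi is an assumption made for illustration only; the abstract does not say which iterative solvers the thesis uses. The function name `jacobi` and the example system are hypothetical.

```python
def jacobi(a, b, x0=None, tol=1e-10, max_iter=1000):
    """Approximately solve a @ x = b by Jacobi iteration.

    a is a square matrix (list of rows) with a nonzero diagonal; convergence
    is guaranteed when a is strictly diagonally dominant.
    Returns the estimate and the number of iterations performed.
    """
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n  # initial guess
    for iteration in range(1, max_iter + 1):
        x_new = []
        for i in range(n):
            # Solve row i for x[i], using the previous estimate elsewhere.
            off_diag = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x_new.append((b[i] - off_diag) / a[i][i])
        change = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if change < tol:  # the estimate has stopped moving: treat as converged
            return x, iteration
    return x, max_iter


# Diagonally dominant example whose exact solution is x = [1, 2].
solution, iterations = jacobi([[4.0, 1.0], [2.0, 5.0]], [6.0, 12.0])
print(solution, iterations)
```

Each sweep reuses only the previous estimate, so the per-row updates are independent of one another. That makes this family of methods straightforward to parallelize, although other iterative schemes usually converge in fewer sweeps.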

    Efficient image-based rendering

    Recent advancements in real-time ray tracing and deep learning have significantly enhanced the realism of computer-generated images. However, conventional 3D computer graphics (CG) can still be time-consuming and resource-intensive, particularly when creating photo-realistic simulations of complex or animated scenes. Image-based rendering (IBR) has emerged as an alternative approach that uses pre-captured images from the real world to generate realistic images in real time, eliminating the need for extensive modeling. Although IBR has its advantages, it faces challenges in providing the same level of control over scene attributes as traditional CG pipelines and in accurately reproducing complex scenes and objects with different materials, such as transparent objects. This thesis endeavors to address these issues by harnessing deep learning and incorporating the fundamental principles of graphics and physically based rendering. It offers an efficient solution that enables interactive manipulation of real-world dynamic scenes captured from sparse views, lighting positions, and times, as well as a physically based approach that facilitates accurate reproduction of the view-dependency effect resulting from the interaction between transparent objects and their surrounding environment. Additionally, this thesis develops a visibility metric that can identify artifacts in the reconstructed IBR images without observing the reference image, thereby contributing to the design of an effective IBR acquisition pipeline. Lastly, a perception-driven rendering technique is developed to provide high-fidelity visual content in virtual reality displays while retaining computational efficiency.