390 research outputs found

    Interaktive Raumzeitrekonstruktion in der Computergraphik

    High-quality dense spatial and/or temporal reconstructions and correspondence maps from camera images, be it optical flow, stereo or scene flow, are an essential prerequisite for a multitude of computer vision and graphics tasks, e.g. scene editing or view interpolation in visual media production. Because the estimation problem is ill-posed in typical setups (i.e. limited number of cameras, limited frame rate), automated estimation approaches are prone to erroneous correspondences and subsequent quality degradation in many non-trivial cases such as occlusions, ambiguous movements, long displacements, or low texture. While improving estimation algorithms is one obvious possible direction, this thesis complementarily concerns itself with creating intuitive, high-level user interactions that lead to improved correspondence maps and scene reconstructions. Where visually convincing results are essential, rendering artifacts resulting from estimation errors are usually repaired by hand with image editing tools, which is time-consuming and therefore costly. My new user interactions, which integrate human scene recognition capabilities to guide a semi-automatic correspondence or scene reconstruction algorithm, save considerable effort and enable faster and more efficient production of visually convincing rendered images.
    Spacetime reconstruction in the form of dense spatial and/or temporal correspondences between camera images, be it optical flow, stereo, or scene flow, is an essential prerequisite for a multitude of tasks in computer graphics, for example scene editing or image interpolation. Because both the number of cameras and the frame rate are limited, the reconstruction problem is underdetermined, which is why automated estimates often contain erroneous correspondences in non-trivial cases such as occlusions, ambiguous or large motions, or uniform textures; any image synthesis based on the partially incorrect estimates must therefore accept a loss of quality. One option is to try to improve the estimation algorithms. Complementary to this, one can develop interaction techniques that are as efficient as possible and drastically improve the quality of the reconstruction. This is the goal of this dissertation. For visually convincing results, image synthesis errors have so far had to be corrected manually in a laborious post-processing step using image editing tools. My new user interactions, which integrate human scene understanding into semi-automatic algorithms, reduce the post-processing effort considerably and thus enable faster and more efficient production of high-quality synthesized images.

    ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild

    Estimating the pose of a moving camera from monocular video is a challenging problem, especially due to the presence of moving objects in dynamic environments, where the performance of existing camera pose estimation methods is susceptible to pixels that are not geometrically consistent. To tackle this challenge, we present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence initialized from pairwise optical flow. Our key idea is to optimize long-range video correspondence as dense point trajectories and use it to learn robust estimation of motion segmentation. A novel neural network architecture is proposed for processing irregular point trajectory data. Camera poses are then estimated and optimized with global bundle adjustment over the portion of long-range point trajectories that are classified as static. Experiments on the MPI Sintel dataset show that our system produces significantly more accurate camera trajectories compared to existing state-of-the-art methods. In addition, our method is able to retain reasonable accuracy of camera poses on fully static scenes, where it consistently outperforms strong state-of-the-art dense correspondence based methods with end-to-end deep learning, demonstrating the potential of dense indirect methods based on optical flow and point trajectories. As the point trajectory representation is general, we further present results and comparisons on in-the-wild monocular videos with complex motion of dynamic objects. Code is available at https://github.com/bytedance/particle-sfm.
    Comment: ECCV 2022. Project page: http://b1ueber2y.me/projects/ParticleSfM
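
The idea of turning pairwise optical flow into long-range point trajectories can be illustrated with a minimal sketch. This is not the ParticleSfM implementation: the function name `chain_trajectories`, the nested-list flow layout, and the nearest-pixel lookup are all assumptions made for illustration, and the optimization of trajectories described in the abstract is omitted.

```python
from typing import List, Tuple

# Hypothetical layout: flow[y][x] = (dx, dy), the displacement of pixel
# (x, y) from one frame to the next.
Flow = List[List[Tuple[float, float]]]
Point = Tuple[float, float]


def chain_trajectories(flows: List[Flow], starts: List[Point]) -> List[List[Point]]:
    """Follow each start point through consecutive pairwise flow fields."""
    height, width = len(flows[0]), len(flows[0][0])
    trajectories = []
    for x, y in starts:
        track = [(x, y)]
        for flow in flows:
            # Nearest-pixel lookup; a real system would interpolate the flow.
            xi, yi = int(round(x)), int(round(y))
            if not (0 <= xi < width and 0 <= yi < height):
                break  # the point left the image, so the track ends here
            dx, dy = flow[yi][xi]
            x, y = x + dx, y + dy
            track.append((x, y))
        trajectories.append(track)
    return trajectories
```

Each returned track has one entry per frame it survives in, so tracks have irregular lengths. In a pipeline like the one described, such tracks would then be classified as static or moving before bundle adjustment.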

    Méthodes pour l'évaluation et la prédiction de la Qualité d'expérience, la préférence et l'inconfort visuel dans les applications multimédia. Focus sur la TV 3D stéréoscopique

    Multimedia technology aims to improve people's viewing experience, seeking better immersiveness and naturalness. The developments of HDTV, 3DTV, and Ultra HDTV are recent illustrative examples of this trend. The Quality of Experience (QoE) in multimedia encompasses multiple perceptual dimensions. For instance, in 3DTV, three primary dimensions have been identified in the literature: image quality, depth quality and visual comfort. In this thesis, focusing on 3DTV, two basic questions about QoE are studied. One is "how to subjectively assess QoE taking care of its multidimensional aspect?". The other is dedicated to one particular dimension, i.e., "what would induce visual discomfort and how to predict it?". In the first part, the challenges of the subjective assessment of QoE are introduced, and a possible solution called "Paired Comparison" is analyzed. To overcome the drawbacks of the Paired Comparison method, a new formalism based on a set of optimized paired comparison designs is proposed and evaluated through different subjective experiments. The test results verified the efficiency and robustness of this new formalism. An application is then described, focusing on the evaluation of the factors influencing 3D QoE. In the second part, the influence of 3D motion on visual discomfort is studied. An objective visual discomfort model is proposed. The model showed high correlation with the subjective data obtained under various experimental conditions. Finally, a physiological study on the relationship between visual discomfort and eye blinking rate is presented.
    Multimedia technology aims to improve viewers' visual experience, particularly in terms of immersion. The recent developments of HDTV, 3DTV, and Ultra HDTV follow this logic. Multimedia quality of experience (QoE) involves several perceptual dimensions. In the particular case of stereoscopic 3DTV, three primary dimensions have been identified in the literature: image quality, depth quality, and visual comfort. In this thesis, two fundamental questions about QoE are studied. One concerns how to subjectively assess the multidimensional nature of QoE. The other addresses one particular dimension of QoE: measuring discomfort and predicting it. In the first part, the difficulties of subjective QoE assessment are introduced, and the merits of "Paired Comparison" methods are analyzed. Given the drawbacks of the Paired Comparison method, a new formalism based on a set of optimized paired comparisons is proposed. It is evaluated through different subjective experiments. The test results confirm the efficiency and robustness of this formalism. An example application, a study of the factors influencing QoE, is then presented. In the second part, the influence of three-dimensional (3D) motion on visual discomfort is studied. An objective model of visual discomfort is proposed. To evaluate this model, a subjective paired comparison experiment was conducted. This prediction model yields high correlations with the subjective data. Finally, a study of physiological measurements attempting to relate visual discomfort to eye blink rate is presented.
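
To make the paired-comparison idea concrete, here is a minimal sketch of how preference counts can be converted into per-stimulus scores. It uses the Bradley-Terry model fitted with the standard minorization-maximization update; the abstract does not say which model the thesis uses, so the model choice, the function name `bradley_terry`, and the `wins` matrix layout are assumptions for illustration only.

```python
from typing import List


def bradley_terry(wins: List[List[int]], iterations: int = 200) -> List[float]:
    """Estimate Bradley-Terry scores from a matrix of preference counts.

    wins[i][j] is the number of times stimulus i was preferred over
    stimulus j (the diagonal is expected to be zero).
    """
    n = len(wins)
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Comparisons involving i, weighted by the current scores.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (scores[i] + scores[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom if denom > 0 else scores[i])
        norm = sum(updated)
        scores = [s / norm for s in updated]  # normalize so scores sum to 1
    return scores
```

With two stimuli where the first is preferred in 3 of 4 trials, `bradley_terry([[0, 3], [1, 0]])` converges to scores of 0.75 and 0.25, matching the observed preference ratio.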

    A local algorithm for the computation of image velocity via constructive interference of global Fourier components

    A novel Fourier-based technique for local motion detection from image sequences is proposed. In this method, the instantaneous velocities of local image points are inferred directly from the global 3D Fourier components of the image sequence. This is done by selecting those velocities for which the superposition of the corresponding Fourier gratings leads to constructive interference at the image point. Hence, image velocities can be assigned locally even though position is computed from the phases and amplitudes of global Fourier components (spanning the whole image sequence) that have been filtered based on the motion-constraint equation, reducing certain aperture effects typically arising from windowing in other methods. Regularization is introduced for sequences having smooth flow fields. Aperture effects and their influence on optic-flow regularization are investigated in this context. The algorithm is tested on both synthetic and real image sequences and the results are compared to those of other local methods. Finally, we show that other motion features, i.e. motion direction, can be computed using the same algorithmic framework without requiring an intermediate representation of local velocity, which is an important characteristic of the proposed method.
    Postprint (author's final draft)
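
The motion-constraint equation mentioned in the abstract can be written out for the simplest case. The following is the standard textbook form for a pattern translating at constant velocity, not a formula taken from the paper; the symbols $I_0$, $v_x$, $v_y$, $k_x$, $k_y$, $\omega$ and the sign convention of the Fourier transform are assumptions chosen for illustration.

```latex
% Image pattern I_0 translating with constant velocity (v_x, v_y):
I(x, y, t) = I_0(x - v_x t,\; y - v_y t)

% With the convention
%   \hat{I}(k_x, k_y, \omega) = \iiint I(x, y, t)\, e^{-i(k_x x + k_y y + \omega t)}\, dx\, dy\, dt
% the spectrum is confined to a plane through the origin:
\hat{I}(k_x, k_y, \omega) = 2\pi\, \hat{I}_0(k_x, k_y)\, \delta(\omega + v_x k_x + v_y k_y)

% Motion-constraint equation in the frequency domain:
\omega + v_x k_x + v_y k_y = 0
```

Under this assumption, each candidate velocity selects the Fourier components lying on one plane, which is one way to read the filtering step described above.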