1,230 research outputs found
Detail-preserving and Content-aware Variational Multi-view Stereo Reconstruction
Accurate recovery of 3D geometrical surfaces from calibrated 2D multi-view
images is a fundamental yet active research area in computer vision. Despite
the steady progress in multi-view stereo reconstruction, most existing methods
are still limited in recovering fine-scale details and sharp features while
suppressing noises, and may fail in reconstructing regions with few textures.
To address these limitations, this paper presents a Detail-preserving and
Content-aware Variational (DCV) multi-view stereo method, which reconstructs
the 3D surface by alternating between reprojection error minimization and mesh
denoising. In reprojection error minimization, we propose a novel inter-image
similarity measure, which is effective to preserve fine-scale details of the
reconstructed surface and builds a connection between guided image filtering
and image registration. In mesh denoising, we propose a content-aware
-minimization algorithm by adaptively estimating the value and
regularization parameters based on the current input. It is much more promising
in suppressing noise while preserving sharp features than conventional
isotropic mesh smoothing. Experimental results on benchmark datasets
demonstrate that our DCV method is capable of recovering more surface details,
and obtains cleaner and more accurate reconstructions than state-of-the-art
methods. In particular, our method achieves the best results among all
published methods on the Middlebury dino ring and dino sparse ring datasets in
terms of both completeness and accuracy.Comment: 14 pages,16 figures. Submitted to IEEE Transaction on image
processin
Training and Predicting Visual Error for Real-Time Applications
Visual error metrics play a fundamental role in the quantification of
perceived image similarity. Most recently, use cases for them in real-time
applications have emerged, such as content-adaptive shading and shading reuse
to increase performance and improve efficiency. A wide range of different
metrics has been established, with the most sophisticated being capable of
capturing the perceptual characteristics of the human visual system. However,
their complexity, computational expense, and reliance on reference images to
compare against prevent their generalized use in real-time, restricting such
applications to using only the simplest available metrics. In this work, we
explore the abilities of convolutional neural networks to predict a variety of
visual metrics without requiring either reference or rendered images.
Specifically, we train and deploy a neural network to estimate the visual error
resulting from reusing shading or using reduced shading rates. The resulting
models account for 70%-90% of the variance while achieving up to an order of
magnitude faster computation times. Our solution combines image-space
information that is readily available in most state-of-the-art deferred shading
pipelines with reprojection from previous frames to enable an adequate estimate
of visual errors, even in previously unseen regions. We describe a suitable
convolutional network architecture and considerations for data preparation for
training. We demonstrate the capability of our network to predict complex error
metrics at interactive rates in a real-time application that implements
content-adaptive shading in a deferred pipeline. Depending on the portion of
unseen image regions, our approach can achieve up to performance
compared to state-of-the-art methods.Comment: Published at Proceedings of the ACM in Computer Graphics and
Interactive Techniques. 14 Pages, 16 Figures, 3 Tables. For paper website and
higher quality figures, see https://jaliborc.github.io/rt-percept
Towards System Agnostic Calibration of Optical See-Through Head-Mounted Displays for Augmented Reality
This dissertation examines the developments and progress of spatial calibration procedures for Optical See-Through (OST) Head-Mounted Display (HMD) devices for visual Augmented Reality (AR) applications. Rapid developments in commercial AR systems have created an explosion of OST device options for not only research and industrial purposes, but also the consumer market as well. This expansion in hardware availability is equally matched by a need for intuitive standardized calibration procedures that are not only easily completed by novice users, but which are also readily applicable across the largest range of hardware options. This demand for robust uniform calibration schemes is the driving motive behind the original contributions offered within this work. A review of prior surveys and canonical description for AR and OST display developments is provided before narrowing the contextual scope to the research questions evolving within the calibration domain. Both established and state of the art calibration techniques and their general implementations are explored, along with prior user study assessments and the prevailing evaluation metrics and practices employed within. The original contributions begin with a user study evaluation comparing and contrasting the accuracy and precision of an established manual calibration method against a state of the art semi-automatic technique. This is the first formal evaluation of any non-manual approach and provides insight into the current usability limitations of present techniques and the complexities of next generation methods yet to be solved. The second study investigates the viability of a user-centric approach to OST HMD calibration through novel adaptation of manual calibration to consumer level hardware. Additional contributions describe the development of a complete demonstration application incorporating user-centric methods, a novel strategy for visualizing both calibration results and registration error from the user’s perspective, as well as a robust intuitive presentation style for binocular manual calibration. The final study provides further investigation into the accuracy differences observed between user-centric and environment-centric methodologies. The dissertation concludes with a summarization of the contribution outcomes and their impact on existing AR systems and research endeavors, as well as a short look ahead into future extensions and paths that continued calibration research should explore
Computing surface-based photo-consistency on graphics hardware
© Copyright 2005 IEEEThis paper describes a novel approach to the problem of recovering information from an image set by comparing the radiance of hypothesised point correspondences. Our algorithm is applicable to a number of problems in computer vision, but is explained particularly in terms of recovering geometry from an image set. It uses the idea of photo-consistency to measure the confidence that a hypothesised scene description generated the reference images. Photo-consistency has been used in volumetric scene reconstruction where a hypothesised surface is evolved by considering one voxel at a time. Our approach is different: it represents the scene as a parameterised surface so decisions can be made about its photo-consistency simultaneously over the entire surface rather than a series of independent decisions. Our approach is further characterised by its ability to execute on graphics hardware. Experiments demonstrate that our cost function minimises at the solution and is not adversely affected by occlusion
Accelerated volumetric reconstruction from uncalibrated camera views
While both work with images, computer graphics and computer vision are inverse problems. Computer graphics starts traditionally with input geometric models and produces image sequences. Computer vision starts with input image sequences and produces geometric models. In the last few years, there has been a convergence of research to bridge the gap between the two fields.
This convergence has produced a new field called Image-based Rendering and Modeling (IBMR). IBMR represents the effort of using the geometric information recovered from real images to generate new images with the hope that the synthesized
ones appear photorealistic, as well as reducing the time spent on model creation.
In this dissertation, the capturing, geometric and photometric aspects of an IBMR system are studied. A versatile framework was developed that enables the reconstruction of scenes from images acquired with a handheld digital camera. The proposed system targets applications in areas such as Computer Gaming and Virtual Reality, from a lowcost perspective. In the spirit of IBMR, the human operator is allowed to provide the high-level information, while underlying algorithms are used to perform low-level computational work. Conforming to the latest architecture trends, we propose a streaming voxel carving method, allowing a fast GPU-based processing on commodity hardware
It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
We address the problem of 3D human pose estimation from 2D input images using
only weakly supervised training data. Despite showing considerable success for
2D pose estimation, the application of supervised machine learning to 3D pose
estimation in real world images is currently hampered by the lack of varied
training images with corresponding 3D poses. Most existing 3D pose estimation
algorithms train on data that has either been collected in carefully controlled
studio settings or has been generated synthetically. Instead, we take a
different approach, and propose a 3D human pose estimation algorithm that only
requires relative estimates of depth at training time. Such training signal,
although noisy, can be easily collected from crowd annotators, and is of
sufficient quality for enabling successful training and evaluation of 3D pose
algorithms. Our results are competitive with fully supervised regression based
approaches on the Human3.6M dataset, despite using significantly weaker
training data. Our proposed algorithm opens the door to using existing
widespread 2D datasets for 3D pose estimation by allowing fine-tuning with
noisy relative constraints, resulting in more accurate 3D poses.Comment: BMVC 2018. Project page available at
http://www.vision.caltech.edu/~mronchi/projects/RelativePos
Foveated Path Tracing with Fast Reconstruction and Efficient Sample Distribution
Polunseuranta on tietokonegrafiikan piirtotekniikka, jota on käytetty pääasiassa ei-reaaliaikaisen realistisen piirron tekemiseen. Polunseuranta tukee luonnostaan monia muilla tekniikoilla vaikeasti saavutettavia todellisen valon ilmiöitä kuten heijastuksia ja taittumista. Reaaliaikainen polunseuranta on hankalaa polunseurannan suuren laskentavaatimuksen takia. Siksi nykyiset reaaliaikaiset polunseurantasysteemi tuottavat erittäin kohinaisia kuvia, jotka tyypillisesti suodatetaan jälkikäsittelykohinanpoisto-suodattimilla.
Erittäin immersiivisiä käyttäjäkokemuksia voitaisiin luoda polunseurannalla, joka täyttäisi laajennetun todellisuuden vaatimukset suuresta resoluutiosta riittävän matalassa vasteajassa. Yksi mahdollinen ratkaisu näiden vaatimusten täyttämiseen voisi olla katsekeskeinen polunseuranta, jossa piirron resoluutiota vähennetään katseen reunoilla. Tämän johdosta piirron laatu on katseen reunoilla sekä harvaa että kohinaista, mikä asettaa suuren roolin lopullisen kuvan koostavalle suodattimelle.
Tässä työssä esitellään ensimmäinen reaaliajassa toimiva regressionsuodatin. Suodatin on suunniteltu kohinaisille kuville, joissa on yksi polunseurantanäyte pikseliä kohden. Nopea suoritus saavutetaan tiileissä käsittelemällä ja nopealla sovituksen toteutuksella. Lisäksi työssä esitellään Visual-Polar koordinaattiavaruus, joka jakaa polunseurantanäytteet siten, että niiden jakauma seuraa silmän herkkyysmallia. Visual-Polar-avaruuden etu muihin tekniikoiden nähden on että se vähentää työmäärää sekä polunseurannassa että suotimessa. Nämä tekniikat esittelevät toimivan prototyypin katsekeskeisestä polunseurannasta, ja saattavat toimia tienraivaajina laajamittaiselle realistisen reaaliaikaisen polunseurannan käyttöönotolle.Photo-realistic offline rendering is currently done with path tracing, because it naturally produces many real-life light effects such as reflections, refractions and caustics. These effects are hard to achieve with other rendering techniques. However, path tracing in real time is complicated due to its high computational demand. Therefore, current real-time path tracing systems can only generate very noisy estimate of the final frame, which is then denoised with a post-processing reconstruction filter.
A path tracing-based rendering system capable of filling the high resolution in the low latency requirements of mixed reality devices would generate a very immersive user experience. One possible solution for fulfilling these requirements could be foveated path tracing, wherein the rendering resolution is reduced in the periphery of the human visual system. The key challenge is that the foveated path tracing in the periphery is both sparse and noisy, placing high demands on the reconstruction filter.
This thesis proposes the first regression-based reconstruction filter for path tracing that runs in real time. The filter is designed for highly noisy one sample per pixel inputs. The fast execution is accomplished with blockwise processing and fast implementation of the regression. In addition, a novel Visual-Polar coordinate space which distributes the samples according to the contrast sensitivity model of the human visual system is proposed. The specialty of Visual-Polar space is that it reduces both path tracing and reconstruction work because both of them can be done with smaller resolution. These techniques enable a working prototype of a foveated path tracing system and may work as a stepping stone towards wider commercial adoption of photo-realistic real-time path tracing
- …