Spatially Augmented Reality on Dynamic, Deformable Surfaces and its Applications
Spatially Augmented Reality (SAR), also known as projection mapping, uses multiple projectors to illuminate surfaces of arbitrary shape and size and create seamless, large-scale displays. Traditional SAR assumes that the projection surface is static and rigid. This restriction was partially addressed by Dynamic-SAR, where the surface is rigid and of known shape but can be moved around. However, no prior work has addressed SAR using multiple projectors on deformable surfaces, where the shape is unknown and constantly changing. Multi-projector SAR on deformable surfaces therefore introduces several challenges, including projector-camera calibration on a deformable surface, real-time surface shape recovery, and real-time multi-projector warping and blending. My thesis is the first attempt to develop a comprehensive framework for achieving seamless multi-projector displays on deformable surfaces. Furthermore, I present its applications in the medical domain, where SAR enables remote surgical guidance by precisely illuminating surgical stencils on a physical surgical site.
Real-Time Implementation of Time-Varying Surface Prediction and Projection
Spatial augmented reality uses projectors to transform an object into a display surface. For time-varying, non-rigid surfaces, however, this is difficult and often leads to image distortion. To avoid this distortion, highly accurate measurements of the surface are required. Traditional methods of measuring surface deformations are inadequate because of noise and potential sources of time delay, such as projector lag. To get more accurate results, a mass-spring model can be used to simulate the dynamics of the time-varying surface. This model can be put into nonlinear state-space form, yielding a first-order differential equation, which can then be solved with numerical integration techniques.
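The state-space idea above can be illustrated with a minimal sketch. It is not the thesis's cloth model: it assumes a single mass on one damped spring in 2D, anchored at the origin, and every parameter value (`m`, `k`, `c`, `rest`, `g`) and function name is hypothetical. The state is `[px, py, vx, vy]`; the spring force depends on the distance to the anchor, which makes the system nonlinear, and a classic fourth-order Runge-Kutta step integrates it.

```python
import math

def derivative(state, m=0.05, k=20.0, c=0.1, rest=0.1, g=9.81):
    """First-order form x' = f(x) for one mass on a damped spring in 2D.

    All parameter values are illustrative assumptions, not identified values.
    """
    px, py, vx, vy = state
    length = math.hypot(px, py)           # distance from the anchor at (0, 0)
    stretch = length - rest               # positive when the spring is extended
    fx = -k * stretch * px / length - c * vx
    fy = -k * stretch * py / length - c * vy - m * g
    return [vx, vy, fx / m, fy / m]

def rk4_step(f, state, dt):
    """Advance the state by one classic fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * d for s, d in zip(state, k1)])
    k3 = f([s + 0.5 * dt * d for s, d in zip(state, k2)])
    k4 = f([s + dt * d for s, d in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c_ + d)
            for s, a, b, c_, d in zip(state, k1, k2, k3, k4)]
```

Calling `rk4_step(derivative, state, dt)` repeatedly simulates the mass over time. A cloth mesh would stack many such nodes into one longer state vector, with one spring term per mesh edge.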
To reduce uncertainty in the generated model, a filtering algorithm can be used. Both the extended Kalman filter (EKF) and the cubature Kalman filter (CKF) are evaluated as candidates. To run these filters in real time, a reduced-order model is developed. This allows fewer mass nodes in the model and therefore faster compute times. Additionally, to reduce visual error, an optimal node placement algorithm is used. This ensures that the surface generated by the mass-spring mesh closely matches the real, curved surface of the system. The EKF and CKF algorithms are applied to a hanging cloth system perturbed by an oscillating fan. A parameter identification technique is used to create a model that accurately represents this hanging cloth system. Additionally, the noise parameters of the EKF and CKF are adjusted to compensate for modeling errors and sensor noise. Finally, the mean squared errors of the EKF and CKF algorithms are compared to evaluate their effectiveness. Both algorithms provide satisfactory results for use in spatial augmented reality applications. However, in all cases tested, the CKF is shown to have significantly lower error values.
Although the CKF algorithm is shown to be more accurate than its EKF counterpart, its computation time is much longer. However, the computation time is still low enough to perform real-time estimation at up to 100 Hz. Furthermore, because of the way the CKF is constructed, it can be run as a multi-threaded workload to significantly reduce computation time.
Therefore, a CKF algorithm can be used to accurately estimate the positions of a measured surface for use in spatial augmented reality.
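As a rough illustration of why the CKF lends itself to parallel execution, the sketch below shows a generic CKF prediction step using the standard third-degree spherical-radial cubature rule: 2n equally weighted points placed at the mean plus or minus sqrt(n) times the columns of a square root of the covariance. It is not the thesis's implementation; the function names and the small pure-Python Cholesky helper are assumptions made to keep the example self-contained.

```python
import math

def cholesky(P):
    """Lower-triangular L with L L^T = P, for a small positive-definite P."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(P[i][i] - s)
            else:
                L[i][j] = (P[i][j] - s) / L[j][j]
    return L

def cubature_points(mean, P):
    """Return the 2n cubature points mean +/- sqrt(n) * (column j of L)."""
    n = len(mean)
    L = cholesky(P)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for j in range(n):
            points.append([mean[i] + sign * scale * L[i][j] for i in range(n)])
    return points

def ckf_predict(f, mean, P, Q):
    """Propagate each point through f, then rebuild the mean and covariance."""
    propagated = [f(p) for p in cubature_points(mean, P)]  # independent calls
    count, dim = len(propagated), len(propagated[0])
    m = [sum(p[i] for p in propagated) / count for i in range(dim)]
    cov = [[sum((p[i] - m[i]) * (p[j] - m[j]) for p in propagated) / count
            + Q[i][j] for j in range(dim)] for i in range(dim)]
    return m, cov
```

Each call to `f` inside `ckf_predict` depends on one cubature point only, so the 2n evaluations could be distributed across threads. For a linear `f`, the step reproduces the ordinary Kalman prediction, which gives a convenient sanity check.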
Rekonstruktion, Analyse und Editierung dynamisch deformierter 3D-Oberflächen (Reconstruction, Analysis, and Editing of Dynamically Deformed 3D Surfaces)
Dynamically deforming 3D surfaces play a major role in computer graphics. However, producing time-varying dynamic geometry at ever increasing detail is a time-consuming and costly process, and so a recent trend is to capture geometry data directly from the real world. In the first part of this thesis, I propose novel approaches for this research area. These approaches capture dense dynamic 3D surfaces from multi-camera systems in a particularly robust and accurate way. This provides highly realistic dynamic surface models for phenomena like moving garments and bulging muscles.
However, re-using, editing, or otherwise analyzing dynamic 3D surface data is not yet conveniently possible. To close this gap, the second part of this dissertation develops novel data-driven modeling and animation approaches. I first show a supervised data-driven approach for modeling human muscle deformations that scales to huge datasets and provides fine-scale, anatomically realistic deformations at high quality not attainable by previous methods. I then extend data-driven modeling to the unsupervised case, providing editing tools for a wider set of input data ranging from facial performance capture and full-body motion to muscle and cloth deformation. To this end, I introduce the concepts of sparsity and locality within a mathematical optimization framework. I also explore these concepts for constructing shape-aware functions that are useful for static geometry processing, registration, and localized editing.
Dynamically deformable 3D surfaces play a central role in computer graphics. However, creating the high-resolution, time-varying surface geometry that computer graphics applications require is extremely labor-intensive. This problem has given rise to the trend of capturing surface data directly from recordings of the real world.
The 3D reconstruction methods this requires are developed in the first part of the thesis.
The novel methods presented capture dynamic 3D surfaces from multi-camera recordings with high reliability and precision.
In this way, detailed surface models of phenomena such as moving clothing or tensing muscles can be captured.
However, re-using, editing, and analyzing 3D surface data obtained in this way is currently not yet possible in a simple manner.
To close this gap, the second part of the thesis deals with data-driven modeling and animation.
First, an approach for the supervised learning of human muscle deformations is presented. This novel method enables data-driven modeling with particularly large datasets and delivers anatomically realistic deformation effects. It thereby exceeds the accuracy of earlier methods.
The next part of the dissertation deals with unsupervised learning from 3D surface data. Novel tools are presented that can process a wide range of input data, from captured facial animations and full-body motion to muscle and clothing deformations.
To achieve this breadth of application, the thesis relies on the general concepts of sparsity and locality and embeds them in a mathematical optimization approach.
Finally, the thesis shows how these concepts can also be transferred to the construction of surface-adaptive basis functions. This opens up applications for the processing, registration, and editing of static surface models.
End-to-end Projector Photometric Compensation
Projector photometric compensation aims to modify a projector input image
such that it can compensate for disturbance from the appearance of the projection
surface. In this paper, for the first time, we formulate the compensation
problem as an end-to-end learning problem and propose a convolutional neural
network, named CompenNet, to implicitly learn the complex compensation
function. CompenNet consists of a UNet-like backbone network and an autoencoder
subnet. Such architecture encourages rich multi-level interactions between the
camera-captured projection surface image and the input image, and thus captures
both photometric and environment information of the projection surface. In
addition, the visual details and interaction information are carried to deeper
layers along the multi-level skip convolution layers. The architecture is of
particular importance for the projector compensation task, for which only a
small training dataset is allowed in practice. Another contribution we make is
a novel evaluation benchmark, which is independent of system setup and thus
quantitatively verifiable. To the best of our knowledge, no such benchmark was
previously available, because conventional evaluation requires the
hardware system to actually project the final results. Our key idea, motivated
by our end-to-end problem formulation, is to use a reasonable surrogate to
avoid this projection process and thus be setup-independent. Our method is
evaluated carefully on the benchmark, and the results show that our end-to-end
learning solution outperforms state-of-the-art methods both qualitatively and
quantitatively by a significant margin.
Comment: To appear in the 2019 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). Source code and dataset are available at
https://github.com/BingyaoHuang/compenne
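To make the goal of photometric compensation concrete, here is a deliberately simplistic sketch. It is not CompenNet, which learns the compensation function with a neural network. Instead it assumes a hypothetical per-pixel linear model in which the camera sees `reflectance * projector_input + ambient`, and it inverts that model for a desired appearance. The function names and the linear model itself are assumptions for illustration only.

```python
def simulate_capture(projector_input, reflectance, ambient):
    """Toy forward model (assumed): what the camera sees at each pixel."""
    return [r * p + a for p, r, a in zip(projector_input, reflectance, ambient)]

def compensate(target, reflectance, ambient):
    """Invert the toy model, clipping to the projector's [0, 1] input range."""
    compensated = []
    for t, r, a in zip(target, reflectance, ambient):
        p = (t - a) / r                  # input that would reproduce the target
        compensated.append(min(max(p, 0.0), 1.0))
    return compensated
```

Where the required input falls outside the projector's range, clipping means the target cannot be reached exactly. Real surfaces also violate the per-pixel linear assumption, which is part of the motivation for learning the mapping instead.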
Doctor of Philosophy
Dissertation
The statistical study of anatomy is one of the primary focuses of medical image analysis. It is well-established that the appropriate mathematical settings for such analyses are Riemannian manifolds and Lie group actions. Statistically defined atlases, in which a mean anatomical image is computed from a collection of static three-dimensional (3D) scans, have become commonplace. Within the past few decades, these efforts, which constitute the field of computational anatomy, have seen great success in enabling quantitative analysis. However, most of the analysis within computational anatomy has focused on collections of static images in population studies. The recent emergence of large-scale longitudinal imaging studies and four-dimensional (4D) imaging technology presents new opportunities for studying dynamic anatomical processes such as motion, growth, and degeneration. In order to make use of this new data, it is imperative that computational anatomy be extended with methods for the statistical analysis of longitudinal and dynamic medical imaging. In this dissertation, the deformable template framework is used for the development of 4D statistical shape analysis, with applications in motion analysis for individualized medicine and the study of growth and disease progression. A new method for estimating organ motion directly from raw imaging data is introduced and tested extensively. Polynomial regression, the staple of curve regression in Euclidean spaces, is extended to the setting of Riemannian manifolds. This polynomial regression framework enables rigorous statistical analysis of longitudinal imaging data. Finally, a new diffeomorphic model of irrotational shape change is presented. This new model presents striking practical advantages over standard diffeomorphic methods, while the study of this new space promises to illuminate aspects of the structure of the diffeomorphism group.
CompenNet++: End-to-end Full Projector Compensation
Full projector compensation aims to modify a projector input image such that
it can compensate for both geometric and photometric disturbance of the
projection surface. Traditional methods usually solve the two parts separately,
although they are known to correlate with each other. In this paper, we propose
the first end-to-end solution, named CompenNet++, to solve the two problems
jointly. Our work non-trivially extends CompenNet, which was recently proposed
for photometric compensation with promising performance. First, we propose a
novel geometric correction subnet, which is designed with a cascaded
coarse-to-fine structure to learn the sampling grid directly from photometric
sampling images. Second, by concatenating the geometric correction subnet with
CompenNet, CompenNet++ accomplishes full projector compensation and is
end-to-end trainable. Third, after training, we significantly simplify both the
geometric and photometric compensation parts, and hence largely improve the
running-time efficiency. Moreover, we construct the first setup-independent
full compensation benchmark to facilitate the study on this topic. In our
thorough experiments, our method shows clear advantages over prior art, with
promising compensation quality while remaining practically convenient.
Comment: To appear in ICCV 2019. High-res supplementary material:
https://www3.cs.stonybrook.edu/~hling/publication/CompenNet++_sup-high-res.pdf.
Code: https://github.com/BingyaoHuang/CompenNet-plusplu
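The role a sampling grid plays in geometric correction can be sketched as follows. This is a generic illustration, not the CompenNet++ subnet: it assumes a hypothetical grid that stores, for every output pixel, the (x, y) source location to read from, and it resamples a small grayscale image with bilinear interpolation. In the paper the grid is learned; here it is simply given.

```python
def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at a fractional (x, y)."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)        # clamp to the image bounds
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1.0 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1.0 - fx) + img[y1][x1] * fx
    return top * (1.0 - fy) + bottom * fy

def warp(img, grid):
    """Build the output image by reading img at each grid location."""
    return [[bilinear_sample(img, gx, gy) for (gx, gy) in row] for row in grid]
```

An identity grid returns the input unchanged, while a grid of fractional coordinates blends neighboring pixels. Because the grid fully describes the warp, it can be computed once and reused for every frame.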
CylinderTag: An Accurate and Flexible Marker for Cylinder-Shape Objects Pose Estimation Based on Projective Invariants
High-precision pose estimation based on visual markers has been a thriving
research topic in the field of computer vision. However, the suitability of
traditional flat markers on curved objects is limited due to the diverse shapes
of curved surfaces, which hinders the development of high-precision pose
estimation for curved objects. Therefore, this paper proposes a novel visual
marker called CylinderTag, which is designed for developable curved surfaces
such as cylindrical surfaces. CylinderTag is a cyclic marker that can be firmly
attached to objects with a cylindrical shape. Leveraging the manifold
assumption, the cross-ratio in projective invariance is utilized for encoding
in the direction of zero curvature on the surface. Additionally, to facilitate
the usage of CylinderTag, we propose a heuristic search-based marker generator
and a high-performance recognizer as well. Moreover, an all-encompassing
evaluation of CylinderTag properties is conducted by means of extensive
experimentation, covering detection rate, detection speed, dictionary size,
localization jitter, and pose estimation accuracy. CylinderTag showcases
superior detection performance from varying view angles in comparison to
traditional visual markers, accompanied by higher localization accuracy.
Furthermore, CylinderTag boasts real-time detection capability and an extensive
marker dictionary, offering enhanced versatility and practicality in a wide
range of applications. Experimental results demonstrate that the CylinderTag is
a highly promising visual marker for use on cylindrical-like surfaces, thus
offering important guidance for future research on high-precision visual
localization of cylinder-shaped objects. The code is available at:
https://github.com/wsakobe/CylinderTag.
Comment: 15 pages, 22 figures. This work has been submitted to the IEEE for
possible publication. Copyright may be transferred without notice, after
which this version may no longer be accessible.
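The projective invariant that the abstract relies on, the cross-ratio of four collinear points, can be checked numerically. The sketch below is not CylinderTag's encoder or recognizer. It assumes the points are given as scalar positions along a line and models a change of viewpoint as a 1D projective map t -> (p*t + q) / (r*t + s) with hypothetical coefficients. Exact rational arithmetic is used so the invariance shows up as exact equality.

```python
from fractions import Fraction

def cross_ratio(a, b, c, d):
    """Cross-ratio of four collinear points given by positions on the line."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def projective_map(t, p, q, r, s):
    """1D projective transformation; requires p*s - q*r != 0."""
    return (p * t + q) / (r * t + s)

# Four marker positions along the zero-curvature direction (illustrative).
points = [Fraction(0), Fraction(1), Fraction(3), Fraction(7)]
# The same points after an arbitrary projective change of view (assumed).
viewed = [projective_map(t, 2, 1, 1, 3) for t in points]

before = cross_ratio(*points)
after = cross_ratio(*viewed)
```

Here `before` and `after` are both 9/7 even though the individual positions and spacings change, which is what makes the cross-ratio usable as a viewpoint-independent code.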