384 research outputs found
Temporally coherent 4D reconstruction of complex dynamic scenes
This paper presents an approach for reconstruction of 4D temporally coherent
models of complex dynamic scenes. No prior knowledge is required of scene
structure or camera calibration allowing reconstruction from multiple moving
cameras. Sparse-to-dense temporal correspondence is integrated with joint
multi-view segmentation and reconstruction to obtain a complete 4D
representation of static and dynamic objects. Temporal coherence is exploited
to overcome visual ambiguities resulting in improved reconstruction of complex
scenes. Robust joint segmentation and reconstruction of dynamic objects is
achieved by introducing a geodesic star convexity constraint. Comparative
evaluation is performed on a variety of unstructured indoor and outdoor dynamic
scenes with hand-held cameras and multiple people. This demonstrates
reconstruction of complete temporally coherent 4D scene models with improved
nonrigid object segmentation and shape reconstruction.

Comment: To appear in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016. Video available at: https://www.youtube.com/watch?v=bm_P13_-Ds
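The geodesic star-convexity constraint mentioned in the abstract can be illustrated with a toy sketch (function and variable names are hypothetical, not from the paper): given a precomputed geodesic tree rooted at a star centre, a labeling is star-convex when every foreground pixel's path to the centre is entirely foreground, and an arbitrary binary labeling can be projected onto that set:

```python
def enforce_star_convexity(labels, parent):
    """Project a binary labeling onto the set of star-convex labelings:
    a pixel may stay foreground only if every pixel on its (geodesic)
    path to the star centre is also foreground.

    labels: dict pixel -> bool, parent: dict pixel -> parent pixel
    (the star centre has parent None). A simplified illustration,
    not the paper's energy-minimisation formulation."""
    out = {}

    def fg(p):
        if p in out:
            return out[p]
        if parent[p] is None:                 # star centre: keep its label
            out[p] = labels[p]
        else:                                 # foreground only if ancestors are
            out[p] = labels[p] and fg(parent[p])
        return out[p]

    for p in labels:
        fg(p)
    return out
```

For a chain b -> a -> centre c, a foreground pixel b whose ancestor a is background is demoted, restoring star convexity.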
Unsupervised Discovery of Extreme Weather Events Using Universal Representations of Emergent Organization
Spontaneous self-organization is ubiquitous in systems far from thermodynamic
equilibrium. While organized structures that emerge dominate transport
properties, universal representations that identify and describe these key
objects remain elusive. Here, we introduce a theoretically-grounded framework
for describing emergent organization that, via data-driven algorithms, is
constructive in practice. Its building blocks are spacetime lightcones that
embody how information propagates across a system through local interactions.
We show that predictive equivalence classes of lightcones -- local causal
states -- capture organized behaviors and coherent structures in complex
spatiotemporal systems. Employing an unsupervised physics-informed machine
learning algorithm and a high-performance computing implementation, we
demonstrate automatically discovering coherent structures in two real world
domain science problems. We show that local causal states identify vortices and
track their power-law decay behavior in two-dimensional fluid turbulence. We
then show how to detect and track familiar extreme weather events -- hurricanes
and atmospheric rivers -- and discover other novel coherent structures
associated with precipitation extremes in high-resolution climate data at the
grid-cell level.
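The local causal states described above can be sketched on a small discrete field (a much-simplified illustration of the idea, not the paper's physics-informed algorithm; all names are hypothetical): past lightcones that lead to the same distribution of future lightcones are grouped into one causal state.

```python
from collections import defaultdict

def lightcone(field, t, x, depth, direction):
    """Collect field values in a speed-1 cone of the given depth.
    direction=-1 gathers the past cone, +1 the future cone.
    Space wraps around (periodic boundary)."""
    X = len(field[0])
    cone = []
    for d in range(1, depth + 1):
        tt = t + direction * d
        for dx in range(-d, d + 1):
            cone.append(field[tt][(x + dx) % X])
    return tuple(cone)

def local_causal_states(field, depth):
    """Group past lightcones by the multiset of futures they lead to;
    pasts with the same future distribution share a causal state."""
    futures = defaultdict(list)
    T = len(field)
    for t in range(depth, T - depth):
        for x in range(len(field[0])):
            past = lightcone(field, t, x, depth, -1)
            fut = lightcone(field, t, x, depth, +1)
            futures[past].append(fut)
    # identical (sorted) future multisets -> same causal-state label
    state_of, labels = {}, {}
    for past, futs in futures.items():
        key = tuple(sorted(futs))
        labels.setdefault(key, len(labels))
        state_of[past] = labels[key]
    return state_of
```

In practice the paper estimates the future distributions statistically and clusters them at scale; this exact-match grouping only conveys the equivalence-class construction.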
Spatiotemporal oriented energies for spacetime stereo
This paper presents a novel approach to recovering temporally coherent estimates of 3D structure of a dynamic scene from a sequence of binocular stereo images. The approach is based on matching spatiotemporal orientation distributions between left and right temporal image streams, which encapsulates both local spatial and temporal structure for disparity estimation. By capturing spatial and temporal structure in this unified fashion, both sources of information combine to yield disparity estimates that are naturally temporally coherent, while helping to resolve matches that might be ambiguous when either source is considered alone. Further, by allowing subsets of the orientation measurements to support different disparity estimates, an approach to recovering multilayer disparity from spacetime stereo is realized. The approach has been implemented with real-time performance on commodity GPUs. Empirical evaluation shows that the approach yields qualitatively and quantitatively superior disparity estimates in comparison to various alternative approaches, including the ability to provide accurate multilayer estimates in the presence of (semi)transparent and specular surfaces.
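The core idea of spacetime stereo, matching over a window that extends in time as well as space, can be sketched as follows (a deliberately simplified stand-in using raw space-time patch differences rather than the paper's oriented-energy distributions; names are hypothetical):

```python
def spacetime_disparity(left, right, x, t, max_disp, win=1, twin=1):
    """Pick the disparity minimising a sum-of-squared-differences cost
    over a small space-time window. Extending the window in time is
    what makes the estimate temporally coherent; the paper replaces
    this raw SSD with spatiotemporal oriented-energy matching.

    left, right: sequences of scanlines indexed [time][x]."""
    def cost(d):
        c = 0.0
        for dt in range(-twin, twin + 1):
            for dx in range(-win, win + 1):
                l = left[t + dt][x + dx]
                r = right[t + dt][x + dx - d]   # shift right view by candidate d
                c += (l - r) ** 2
        return c
    return min(range(max_disp + 1), key=cost)
```

A winner-take-all search like this cannot represent the paper's multilayer disparities; that requires letting subsets of the orientation measurements vote for different layers.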
Towards Unsupervised Segmentation of Extreme Weather Events
Extreme weather is one of the main mechanisms through which climate change
will directly impact human society. Coping with such change as a global
community requires markedly improved understanding of how global warming drives
extreme weather events. While alternative climate scenarios can be simulated
using sophisticated models, identifying extreme weather events in these
simulations requires automation due to the vast amounts of complex
high-dimensional data produced. Atmospheric dynamics, and hydrodynamic flows
more generally, are highly structured and largely organize around a lower
dimensional skeleton of coherent structures. Indeed, extreme weather events are
a special case of more general hydrodynamic coherent structures. We present a
scalable physics-based representation learning method that decomposes
spatiotemporal systems into their structurally relevant components, which are
captured by latent variables known as local causal states. For complex fluid
flows we show our method is capable of capturing known coherent structures, and
with promising segmentation results on CAM5.1 water vapor data we outline the
path to extreme weather identification from unlabeled climate model simulation
data.
The ngEHT's Role in Measuring Supermassive Black Hole Spins
While supermassive black hole masses have been cataloged across cosmic time,
only a few dozen of them have robust spin measurements. By extending and
improving the existing Event Horizon Telescope (EHT) array, the next-generation
Event Horizon Telescope (ngEHT) will enable multifrequency, polarimetric movies
on event horizon scales, which will place new constraints on the space-time and
accretion flow. By combining this information, it is anticipated that the ngEHT
may be able to measure tens of supermassive black hole masses and spins. In
this white paper, we discuss existing spin measurements and many proposed
techniques with which the ngEHT could potentially measure spins of target
supermassive black holes. Spins measured by the ngEHT would represent a
completely new sample of sources that, unlike pre-existing samples, would not
be biased towards objects with high accretion rates. Such a sample would
provide new insights into the accretion, feedback, and cosmic assembly of
supermassive black holes.

Comment: Submitted for Galaxies Special Issue "From Vision to Instrument: Creating a Next-Generation Event Horizon Telescope for a New Era of Black Hole Science".
Temporally Coherent General Dynamic Scene Reconstruction
Existing techniques for dynamic scene reconstruction from multiple
wide-baseline cameras primarily focus on reconstruction in controlled
environments, with fixed calibrated cameras and strong prior constraints. This
paper introduces a general approach to obtain a 4D representation of complex
dynamic scenes from multi-view wide-baseline static or moving cameras without
prior knowledge of the scene structure, appearance, or illumination.
Contributions of the work are: An automatic method for initial coarse
reconstruction to initialize joint estimation; Sparse-to-dense temporal
correspondence integrated with joint multi-view segmentation and reconstruction
to introduce temporal coherence; and a general robust approach for joint
segmentation refinement and dense reconstruction of dynamic scenes by
introducing a shape constraint. Comparison with state-of-the-art approaches on a
variety of complex indoor and outdoor scenes demonstrates improved accuracy in
both multi-view segmentation and dense reconstruction. This paper demonstrates
unsupervised reconstruction of complete temporally coherent 4D scene models
with improved non-rigid object segmentation and shape reconstruction and its
application to free-viewpoint rendering and virtual reality.

Comment: Submitted to IJCV 2019. arXiv admin note: substantial text overlap with arXiv:1603.0338
Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video
In this tech report, we present the current state of our ongoing work on reconstructing Neural Radiance Fields (NeRF) of general non-rigid scenes via ray bending. Non-rigid NeRF (NR-NeRF) takes RGB images of a deforming object (e.g., from a monocular video) as input and then learns a geometry and appearance representation that not only allows reconstructing the input sequence but also re-rendering any time step into novel camera views with high fidelity. In particular, we show that a consumer-grade camera is sufficient to synthesize convincing bullet-time videos of short and simple scenes. In addition, the resulting representation enables correspondence estimation across views and time, and provides rigidity scores for each point in the scene. We urge the reader to watch the supplemental videos for qualitative results. We will release our code.
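The ray-bending idea can be sketched in a few lines (a toy illustration under stated assumptions: `bend_field` stands in for the paper's learned per-time-step deformation network and `canonical_nerf` for the static radiance field; both names are hypothetical):

```python
def bend_and_query(points, bend_field, canonical_nerf, t):
    """Apply a per-time-step bending (offset) field to ray sample points,
    then query the canonical radiance field. The deformation lives
    entirely in the bending field, so one static scene representation
    serves every time step."""
    out = []
    for p in points:
        offset = bend_field(p, t)                      # learned MLP in the paper
        q = tuple(pi + oi for pi, oi in zip(p, offset))  # bent sample position
        out.append(canonical_nerf(q))                  # colour/density lookup
    return out
```

Because bending happens per sample point before the canonical lookup, correspondences across time fall out of the offsets themselves, which is what enables the rigidity scores the abstract mentions.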
FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality
We introduce FaceVR, a novel method for gaze-aware facial reenactment in the Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture of an actor who is wearing a head-mounted display (HMD), as well as a new data-driven approach for eye tracking from monocular videos. In addition to these face reconstruction components, FaceVR incorporates photo-realistic re-rendering in real time, thus allowing artificial modifications of face and eye appearances. For instance, we can alter facial expressions, change gaze directions, or remove the VR goggles in realistic re-renderings. In a live setup with a source and a target actor, we apply these newly-introduced algorithmic components. We assume that the source actor is wearing a VR device, and we capture his facial expressions and eye movement in real-time. For the target video, we mimic a similar tracking process; however, we use the source input to drive the animations of the target video, thus enabling gaze-aware facial reenactment. To render the modified target video on a stereo display, we augment our capture and reconstruction process with stereo data. In the end, FaceVR produces compelling results for a variety of applications, such as gaze-aware facial reenactment, reenactment in virtual reality, removal of VR goggles, and re-targeting of somebody's gaze direction in a video conferencing call.
High quality dynamic reflectance and surface reconstruction from video
The creation of high quality animations of real-world human actors has long been a challenging problem in computer graphics. It involves the modeling of the shape of the virtual actors, creating their motion, and the reproduction of very fine dynamic details. In order to render the actor under arbitrary lighting, it is required that reflectance properties are modeled for each point on the surface. These steps, that are usually performed manually by professional modelers, are time consuming and cumbersome.
In this thesis, we show that algorithmic solutions for some of the problems that arise in the creation of high quality animation of real-world people are possible using multi-view video data. First, we present a novel spatio-temporal approach to create a personalized avatar from multi-view video data of a moving person. Thereafter, we propose two enhancements to a method that captures human shape, motion and reflectance properties of a moving human using eight multi-view video streams. Afterwards we extend this work, and in order to add very fine dynamic details to the geometric models, such as wrinkles and folds in the clothing, we make use of the multi-view video recordings and present a statistical method that can passively capture the fine-grain details of time-varying scene geometry. Finally, in order to reconstruct structured shape and animation of the subject from video, we present a dense 3D correspondence finding method that enables spatiotemporally coherent reconstruction of surface animations directly from multi-view video data.
These algorithmic solutions can be combined to constitute a complete animation pipeline for acquisition, reconstruction and rendering of high quality virtual actors from multi-view video data. They can also be used individually in systems that require the solution of a specific algorithmic sub-problem. The results demonstrate that using multi-view video data it is possible to find the model description that enables realistic appearance of animated virtual actors under different lighting conditions and exhibits high quality dynamic details in the geometry.