412 research outputs found
Learning Single-Image Depth from Videos using Quality Assessment Networks
Depth estimation from a single image in the wild remains a challenging
problem. One main obstacle is the lack of high-quality training data for images
in the wild. In this paper we propose a method to automatically generate such
data through Structure-from-Motion (SfM) on Internet videos. The core of this
method is a Quality Assessment Network that identifies high-quality
reconstructions obtained from SfM. Using this method, we collect single-view
depth training data from a large number of YouTube videos and construct a new
dataset called YouTube3D. Experiments show that YouTube3D is useful in training
depth estimation networks and advances the state of the art of single-view
depth estimation in the wild.
Automatic Reconstruction of Parametric, Volumetric Building Models from 3D Point Clouds
Planning, construction, modification, and analysis of buildings require means of representing a building's physical structure and related semantics in a meaningful way. With the rise of novel technologies and increasing requirements in the architecture, engineering and construction (AEC) domain, two general concepts for representing buildings have gained particular attention in recent years. First, the concept of Building Information Modeling (BIM) is increasingly used as a modern means for representing and managing a building's as-planned state digitally, including not only a geometric model but also various additional semantic properties. Second, point cloud measurements are now widely used for capturing a building's as-built condition by means of laser scanning techniques. A particular challenge and topic of current research is the development of methods that combine the strengths of point cloud measurements and Building Information Modeling concepts to quickly obtain accurate building models from measured data. In this thesis, we present our recent approaches to tackle the intermeshed challenges of automated indoor point cloud interpretation using targeted segmentation methods, and the automatic reconstruction of high-level, parametric and volumetric building models as the basis for further usage in BIM scenarios. In contrast to most reconstruction methods available at the time, we fundamentally base our approaches on BIM principles and standards, and overcome critical limitations of previous approaches in order to reconstruct globally plausible, volumetric, and parametric models.
3D Point Capsule Networks
In this paper, we propose 3D point-capsule networks, an auto-encoder designed
to process sparse 3D point clouds while preserving spatial arrangements of the
input data. 3D capsule networks arise as a direct consequence of our novel
unified 3D auto-encoder formulation. Their dynamic routing scheme and the
peculiar 2D latent space deployed by our approach bring in improvements for
several common point cloud-related tasks, such as object classification, object
reconstruction and part segmentation as substantiated by our extensive
evaluations. Moreover, it enables new applications such as part interpolation
and replacement. Comment: As published in CVPR 2019 (camera-ready version), with supplementary
material
The multimodal Ganzfeld-induced altered state of consciousness induces decreased thalamo-cortical coupling
Different pharmacologic agents, such as psychedelic substances, have been used to investigate the neuronal underpinnings of alterations in consciousness states. Special attention has been drawn to the role of thalamic filtering of cortical input. Here, we investigate the neuronal mechanisms underlying an altered state of consciousness (ASC) induced by a non-pharmacological procedure. During fMRI scanning, N=19 human participants were exposed to multimodal Ganzfeld stimulation, a technique of perceptual deprivation where participants are exposed to intense, unstructured, homogeneous visual and auditory stimulation. Compared to pre- and post-resting-state scans, the Ganzfeld data displayed a progressive decoupling of the thalamus from the cortex. Furthermore, the Ganzfeld-induced ASC was characterized by increased eigenvector centrality in core regions of the default mode network (DMN). Together, these findings can be interpreted as an imbalance of sensory bottom-up signaling and internally generated top-down signaling. This imbalance is antithetical to psychedelic-induced ASCs, where increased thalamo-cortical coupling and reduced DMN activity were observed.
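The eigenvector centrality mentioned in this abstract can be illustrated on a toy graph. The sketch below is not the study's fMRI pipeline; it is a minimal power-iteration example on a hypothetical four-node connectivity graph, and the function name, the graph, and the `A + I` shift (used here only to keep the iteration from oscillating) are assumptions made for illustration.

```python
def eigenvector_centrality(adjacency, iterations=1000, tol=1e-12):
    """Approximate eigenvector centrality by power iteration on A + I.

    `adjacency` is a square list of lists with non-negative weights.
    Adding the identity keeps the same eigenvectors as A but avoids
    oscillation on bipartite graphs. Scores are normalized to sum to 1.
    """
    n = len(adjacency)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # One multiplication by (A + I): neighbours' scores plus the node's own score.
        updated = [
            sum(adjacency[i][j] * scores[j] for j in range(n)) + scores[i]
            for i in range(n)
        ]
        total = sum(updated)
        updated = [value / total for value in updated]
        converged = max(abs(a - b) for a, b in zip(updated, scores)) < tol
        scores = updated
        if converged:
            break
    return scores


# Hypothetical toy graph: nodes 0, 1, 2 form a triangle; node 3 hangs off node 0.
toy_graph = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(toy_graph)
```

In this toy graph, node 0 is connected to every other node and therefore receives the highest score, while the pendant node 3 receives the lowest.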
Learning Human Motion Models for Long-term Predictions
We propose a new architecture for the learning of predictive spatio-temporal
motion models from data alone. Our approach, dubbed the Dropout Autoencoder
LSTM, is capable of synthesizing natural looking motion sequences over long
time horizons without catastrophic drift or motion degradation. The model
consists of two components, a 3-layer recurrent neural network to model
temporal aspects and a novel auto-encoder that is trained to implicitly recover
the spatial structure of the human skeleton via randomly removing information
about joints during training time. This Dropout Autoencoder (D-AE) is then used
to filter each predicted pose of the LSTM, reducing accumulation of error and
hence drift over time. Furthermore, we propose new evaluation protocols to
assess the quality of synthetic motion sequences even for which no ground truth
data exists. The proposed protocols can be used to assess generated sequences
of arbitrary length. Finally, we evaluate our proposed method on two of the
largest motion-capture datasets available to date and show that our model
outperforms the state-of-the-art on a variety of actions, including cyclic and
acyclic motion, and that it can produce natural looking sequences over longer
time horizons than previous methods.
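The two ideas described in this abstract, removing joint information at random during training and filtering each predicted pose before it is fed back, can be sketched in a few lines. This is only a structural illustration under stated assumptions: the real D-AE and LSTM are learned networks, whereas `predict_next` and `filter_pose` below are hypothetical stand-in callables, and the pose layout (a list of `(x, y, z)` joint tuples) is assumed.

```python
import random


def drop_joints(pose, drop_prob, rng):
    """Corrupt a pose by zeroing whole joints at random (training-time input noise)."""
    return [
        (0.0, 0.0, 0.0) if rng.random() < drop_prob else joint
        for joint in pose
    ]


def rollout(seed_pose, predict_next, filter_pose, steps):
    """Autoregressive rollout: each prediction is filtered before being fed back."""
    poses = []
    current = seed_pose
    for _ in range(steps):
        current = filter_pose(predict_next(current))
        poses.append(current)
    return poses


# Hypothetical three-joint pose and stand-in models for illustration only.
example_pose = [(0.0, 1.0, 0.0), (0.5, 1.5, 0.0), (-0.5, 1.5, 0.0)]
corrupted = drop_joints(example_pose, drop_prob=0.3, rng=random.Random(0))


def toy_predict(pose):
    # Stand-in for the recurrent predictor: shifts every joint along x.
    return [(x + 0.1, y, z) for x, y, z in pose]


def toy_filter(pose):
    # Stand-in for the auto-encoder filter: returns the pose unchanged.
    return pose


sequence = rollout(example_pose, toy_predict, toy_filter, steps=5)
```

In a trained system, the filter would be the auto-encoder that learned to reconstruct complete poses from corrupted ones, which is what lets it pull drifting predictions back toward plausible skeletons.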
ClusterNet: A Perception-Based Clustering Model for Scattered Data
Visualizations for scattered data are used to make users understand certain
attributes of their data by solving different tasks, e.g. correlation
estimation, outlier detection, cluster separation. In this paper, we focus on
the latter task and develop a technique that is aligned with human perception,
that can be used to understand how human subjects perceive clusterings in
scattered data and possibly optimize for better understanding. Cluster
separation in scatterplots is a task that is typically tackled by widely used
clustering techniques, such as k-means or DBSCAN. However, as
these algorithms are based on non-perceptual metrics, we show in our
experiments that their output does not reflect human cluster perception. We
propose a learning strategy which directly operates on scattered data. To learn
perceptual cluster separation on this data, we crowdsourced a large-scale
dataset, consisting of 7,320 point-wise cluster affiliations for bivariate
data, which has been labeled by 384 human crowd workers. Based on this data, we
trained ClusterNet, a point-based deep learning model designed to
reflect human perception of cluster separability. In order to train ClusterNet
on human-annotated data, we use a PointNet++ architecture enabling inference on
point clouds directly. In this work, we provide details on how we collected our
dataset, report statistics of the resulting annotations, and investigate
perceptual agreement of cluster separation for real-world data. We further
report the training and evaluation protocol of ClusterNet and introduce a novel
metric that measures the accuracy between a clustering technique and a group
of human annotators. Finally, we compare our approach against existing
state-of-the-art clustering techniques and show that ClusterNet is able to
generalize to unseen and out-of-scope data. Comment: Currently, this manuscript is under revision at TVC
- …