Learning Generative Models of Shape Handles
We present a generative model to synthesize 3D shapes as sets of handles --
lightweight proxies that approximate the original 3D shape -- for applications
in interactive editing, shape parsing, and building compact 3D representations.
Our model can generate handle sets with varying cardinality and different types
of handles (Figure 1). Key to our approach is a deep architecture that predicts
both the parameters and existence of shape handles, and a novel similarity
measure that can easily accommodate different types of handles, such as cuboids
or sphere-meshes. We leverage the recent advances in semantic 3D annotation as
well as automatic shape summarizing techniques to supervise our approach. We
show that the resulting shape representations are intuitive and achieve
quality superior to the previous state of the art. Finally, we demonstrate how
our method can be used in applications such as interactive shape editing,
completion, and interpolation, leveraging the latent space learned by our model
to guide these tasks. Project page: http://mgadelha.me/shapehandles
Comment: 11 pages, 11 figures, accepted to CVPR 2020
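For intuition, here is a minimal PyTorch sketch of the kind of decoder head
this abstract describes: a network that predicts both the parameters and the
existence probability of each handle in a fixed-size set. The layer sizes,
handle parameterization (a 6-D cuboid), and module names are illustrative
assumptions, not the authors' architecture.

# Hypothetical sketch: a decoder that predicts a fixed-size set of handle
# parameters plus per-handle existence probabilities. Dimensions are
# illustrative, not the paper's actual architecture.
import torch
import torch.nn as nn

class HandleDecoder(nn.Module):
    def __init__(self, latent_dim=128, max_handles=16, param_dim=6):
        super().__init__()
        # shared trunk over the latent code
        self.trunk = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
        )
        # one head for handle parameters (e.g., cuboid center + scale),
        # one for a per-handle existence logit
        self.params = nn.Linear(256, max_handles * param_dim)
        self.exist = nn.Linear(256, max_handles)
        self.max_handles, self.param_dim = max_handles, param_dim

    def forward(self, z):
        h = self.trunk(z)
        params = self.params(h).view(-1, self.max_handles, self.param_dim)
        exist_prob = torch.sigmoid(self.exist(h))  # in [0, 1] per handle
        return params, exist_prob

z = torch.randn(4, 128)
params, exist = HandleDecoder()(z)
print(params.shape, exist.shape)  # (4, 16, 6) (4, 16)

Thresholding the existence probabilities is what lets a fixed-size output
represent handle sets of varying cardinality.
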
Modeling Waveform Shapes with Random Effects Segmental Hidden Markov Models
In this paper we describe a general probabilistic framework for modeling
waveforms such as heartbeats from ECG data. The model is based on segmental
hidden Markov models (as used in speech recognition) with the addition of
random effects to the generative model. The random effects component of the
model handles shape variability across different waveforms within a general
class of waveforms of similar shape. We show that this probabilistic model
provides a unified framework for learning these models from sets of waveform
data as well as parsing, classification, and prediction of new waveforms. We
derive a computationally efficient EM algorithm to fit the model on multiple
waveforms, and introduce a scoring method that evaluates a test waveform based
on its shape. Results on two real-world data sets demonstrate that the random
effects methodology leads to improved accuracy (compared to alternative
approaches) on classification and segmentation of real-world waveforms.
Comment: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI 2004)
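The following toy NumPy example illustrates only the random-effects component
the abstract emphasizes, stripped of the segmental HMM dynamics: each observed
waveform is a shared template plus a waveform-specific random offset, and EM
alternates between inferring the offsets and re-estimating the template and
variance components. All constants are synthetic assumptions.

# Toy random-effects model: y_i(t) = mu(t) + b_i + noise, b_i ~ N(0, tau^2).
# E-step infers each waveform's offset; M-step updates template and variances.
import numpy as np

rng = np.random.default_rng(0)
T, n = 100, 30
template = np.sin(np.linspace(0, 2 * np.pi, T))
offsets = rng.normal(0, 0.5, size=n)
Y = template + offsets[:, None] + rng.normal(0, 0.1, size=(n, T))

mu, tau2, sig2 = Y.mean(axis=0), 1.0, 1.0
for _ in range(50):
    # E-step: posterior mean/variance of each waveform's random offset
    post_var = 1.0 / (1.0 / tau2 + T / sig2)
    post_mean = post_var * (Y - mu).sum(axis=1) / sig2
    # M-step: update template and variance components
    mu = (Y - post_mean[:, None]).mean(axis=0)
    tau2 = np.mean(post_mean ** 2 + post_var)
    resid = Y - mu - post_mean[:, None]
    sig2 = np.mean(resid ** 2) + post_var  # E[b^2] correction term
print(f"tau^2 ~ {tau2:.3f}, sigma^2 ~ {sig2:.3f}")

The paper's full model replaces the scalar offset with segment-level random
effects tied to HMM states; the E/M structure is the same idea.
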
DeepCloud. The Application of a Data-driven, Generative Model in Design
Generative systems have significant potential to synthesize innovative
design alternatives. Still, most systems commonly adopted in design require
the designer to explicitly specify the procedures and, in some cases, the
design space. In contrast, a generative system
could potentially learn both aspects through processing a database of existing
solutions without the supervision of the designer. To explore this possibility,
we review recent advancements of generative models in machine learning and
current applications of learning techniques in design. Then, we describe the
development of a data-driven generative system titled DeepCloud. It combines an
autoencoder architecture for point clouds with a web-based interface and analog
input devices to provide an intuitive experience for data-driven generation of
design alternatives. We delineate the implementation of two prototypes of
DeepCloud, their contributions, and their potential for generative design.
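A minimal sketch of the kind of point-cloud autoencoder DeepCloud builds on:
a PointNet-style encoder (shared per-point MLP plus max pooling) and an MLP
decoder, trained with a Chamfer distance. Layer sizes and the loss choice are
assumptions for illustration, not the system's actual configuration.

# Point-cloud autoencoder sketch with a symmetric Chamfer distance.
import torch
import torch.nn as nn

class PointAutoencoder(nn.Module):
    def __init__(self, n_points=1024, latent_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv1d(3, 64, 1), nn.ReLU(),
                                 nn.Conv1d(64, 128, 1), nn.ReLU(),
                                 nn.Conv1d(128, latent_dim, 1))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, n_points * 3))
        self.n_points = n_points

    def forward(self, pts):                  # pts: (B, N, 3)
        z = self.enc(pts.transpose(1, 2)).max(dim=2).values  # global code
        out = self.dec(z).view(-1, self.n_points, 3)
        return out, z

def chamfer(a, b):  # symmetric Chamfer distance between point sets
    d = torch.cdist(a, b)                    # (B, N, M) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

pts = torch.rand(2, 1024, 3)
recon, z = PointAutoencoder()(pts)
print(chamfer(recon, pts).item())

The latent code z is what an interface like DeepCloud's would expose to
analog input devices for exploration.
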
Points2Pix: 3D Point-Cloud to Image Translation using conditional Generative Adversarial Networks
We present the first approach for 3D point-cloud to image translation based
on conditional Generative Adversarial Networks (cGAN). The model handles
multi-modal information sources from different domains, i.e. raw point-sets and
images. The generator processes three conditions: the point cloud encoded as
a raw point set, a camera projection, and an image background patch used as a
constraint to bias environmental texturing. A global approximation function
within the generator is applied directly to the point cloud (PointNet).
Hence, the representation learning model incorporates global 3D
characteristics directly into the latent feature space. Conditions are used
to bias the background and the viewpoint of the generated image. This opens
up new ways of augmenting or texturing 3D data, aiming at the generation of
fully individual images. We successfully evaluated our method on the KITTI
and SUN RGB-D datasets, obtaining an outstanding object detection inception score.
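A hedged sketch of the conditioning scheme described above: a PointNet-like
global feature of the raw point set is injected into the generator's latent
space, from which an image is decoded. The background-patch and projection
conditions are omitted for brevity, and all sizes are assumptions rather
than the paper's configuration.

# Conditional generator: fuse noise with a global point-set feature,
# then decode a small RGB image with transposed convolutions.
import torch
import torch.nn as nn

class PointConditionedGenerator(nn.Module):
    def __init__(self, noise_dim=64, feat_dim=128):
        super().__init__()
        # global approximation function over the raw point set
        self.point_enc = nn.Sequential(nn.Conv1d(3, 64, 1), nn.ReLU(),
                                       nn.Conv1d(64, feat_dim, 1))
        # decode a 32x32 RGB image from noise + point feature
        self.fc = nn.Linear(noise_dim + feat_dim, 128 * 4 * 4)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh())

    def forward(self, noise, points):        # points: (B, N, 3)
        feat = self.point_enc(points.transpose(1, 2)).max(dim=2).values
        h = self.fc(torch.cat([noise, feat], dim=1)).view(-1, 128, 4, 4)
        return self.up(h)                    # (B, 3, 32, 32)

img = PointConditionedGenerator()(torch.randn(2, 64), torch.rand(2, 512, 3))
print(img.shape)  # torch.Size([2, 3, 32, 32])
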
SilNet : Single- and Multi-View Reconstruction by Learning from Silhouettes
The objective of this paper is 3D shape understanding from single and
multiple images. To this end, we introduce a new deep-learning architecture and
loss function, SilNet, that can handle multiple views in an order-agnostic
manner. The architecture is fully convolutional, and for training we use a
proxy task of silhouette prediction, rather than directly learning a mapping
from 2D images to 3D shape as has been the target in most recent work.
We demonstrate that with the SilNet architecture there is generalisation over
the number of views -- for example, SilNet trained on 2 views can be used with
3 or 4 views at test-time; and performance improves with more views.
We introduce two new synthetic datasets: a blobby object dataset useful for
pre-training, and a challenging and realistic sculpture dataset; and
demonstrate on these datasets that SilNet has indeed learnt 3D shape. Finally,
we show that SilNet exceeds the state of the art on the ShapeNet benchmark
dataset, and use SilNet to generate novel views of the sculpture dataset.
Comment: BMVC 2017; Best Poster
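The key architectural property claimed above, order-agnostic handling of a
variable number of views, can be sketched as a shared per-view CNN followed
by a permutation-invariant max pooling over views. The silhouette-prediction
decoder is omitted here, and the layer sizes are assumptions.

# Order-agnostic multi-view fusion: encode each view with a shared CNN,
# then max-pool across views so view count and order do not matter.
import torch
import torch.nn as nn

class MultiViewEncoder(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim))

    def forward(self, views):                # views: (B, V, 1, H, W)
        b, v = views.shape[:2]
        f = self.cnn(views.flatten(0, 1)).view(b, v, -1)
        return f.max(dim=1).values           # invariant to view order

enc = MultiViewEncoder()
two = enc(torch.rand(2, 2, 1, 64, 64))      # works with 2 views...
four = enc(torch.rand(2, 4, 1, 64, 64))     # ...or 4, no retraining
print(two.shape, four.shape)

This is what allows a network trained on 2 views to be evaluated with 3 or 4
views at test time.
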
A Bayesian Tensor Factorization Model via Variational Inference for Link Prediction
Probabilistic approaches for tensor factorization aim to extract meaningful
structure from incomplete data by postulating low rank constraints. Recently,
variational Bayesian (VB) inference techniques have successfully been applied
to large scale models. This paper presents full Bayesian inference via VB on
both single and coupled tensor factorization models. Our method can be run even
for very large models and is easily implemented. It exhibits better prediction
performance than existing approaches based on maximum likelihood on several
real-world datasets on the missing-link prediction problem.
Comment: arXiv admin note: substantial text overlap with arXiv:1409.808
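To make the VB machinery concrete, here is a toy mean-field scheme for
rank-R matrix factorization with missing entries, the two-way special case
of the tensor models discussed. The factor posteriors q(u_i), q(v_j) are
Gaussians with closed-form updates; hyperparameters are fixed rather than
learned, and all shapes and priors are assumptions.

# Mean-field VB for Y ~ U V^T with missing entries (matrix special case).
import numpy as np

rng = np.random.default_rng(1)
I, J, R, alpha, lam = 20, 15, 3, 10.0, 1.0   # noise / prior precisions
Y = rng.normal(size=(I, R)) @ rng.normal(size=(R, J))
mask = rng.random((I, J)) < 0.6              # observed entries

Um, Uc = rng.normal(size=(I, R)), np.tile(np.eye(R), (I, 1, 1))
Vm, Vc = rng.normal(size=(J, R)), np.tile(np.eye(R), (J, 1, 1))

def update(Am, Ac, Bm, Bc, Y, mask):
    # closed-form Gaussian update for each row factor given the other side
    for i in range(Am.shape[0]):
        obs = np.where(mask[i])[0]
        # E[b_j b_j^T] summed over observed columns
        S = (Bc[obs] + Bm[obs, :, None] * Bm[obs, None, :]).sum(axis=0)
        Ac[i] = np.linalg.inv(lam * np.eye(R) + alpha * S)
        Am[i] = Ac[i] @ (alpha * (Y[i, obs] @ Bm[obs]))

for _ in range(30):
    update(Um, Uc, Vm, Vc, Y, mask)
    update(Vm, Vc, Um, Uc, Y.T, mask.T)

rmse = np.sqrt((((Um @ Vm.T) - Y)[~mask] ** 2).mean())
print(f"RMSE on unobserved entries: {rmse:.3f}")

The paper's models add CP/Tucker-style multiway structure and coupling
across tensors, but the alternating closed-form updates have this flavor.
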
PointGrow: Autoregressively Learned Point Cloud Generation with Self-Attention
Generating 3D point clouds is challenging yet highly desired. This work
presents a novel autoregressive model, PointGrow, which can generate diverse
and realistic point cloud samples from scratch or conditioned on semantic
contexts. This model operates recurrently, with each point sampled according to
a conditional distribution given its previously-generated points, allowing
inter-point correlations to be well-exploited and 3D shape generative processes
to be better interpreted. Since point cloud object shapes are typically encoded
by long-range dependencies, we augment our model with dedicated self-attention
modules to capture such relations. Extensive evaluations show that PointGrow
achieves satisfying performance on both unconditional and conditional point
cloud generation tasks, with respect to realism and diversity. Several
important applications, such as unsupervised feature learning and shape
arithmetic operations, are also demonstrated.
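A hedged sketch of the autoregressive generation loop: each new point is
predicted from self-attention over the points generated so far. PointGrow
itself models discretized coordinates with dedicated self-attention modules;
this toy version regresses continuous xyz directly and uses assumed sizes.

# Grow a point cloud one point at a time, attending over previous points.
import torch
import torch.nn as nn

class AutoregressivePointGen(nn.Module):
    def __init__(self, d=64, heads=4):
        super().__init__()
        self.embed = nn.Linear(3, d)
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.head = nn.Linear(d, 3)                      # next point xyz
        self.start = nn.Parameter(torch.zeros(1, 1, d))  # learned seed token

    def forward(self, prev):          # prev: (B, n, 3), points so far
        tok = self.start.expand(prev.shape[0], -1, -1)
        if prev.shape[1] > 0:
            tok = torch.cat([tok, self.embed(prev)], dim=1)
        h, _ = self.attn(tok, tok, tok)      # attend over generated set
        return self.head(h[:, -1])           # predict the next point

model = AutoregressivePointGen()
pts = torch.zeros(2, 0, 3)
with torch.no_grad():
    for _ in range(16):               # grow the cloud point by point
        nxt = model(pts)
        pts = torch.cat([pts, nxt.unsqueeze(1)], dim=1)
print(pts.shape)  # torch.Size([2, 16, 3])

Because every step conditions on all previously generated points, long-range
inter-point dependencies are captured directly, as the abstract argues.
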
Fast Tracking via Spatio-Temporal Context Learning
In this paper, we present a simple yet fast and robust algorithm which
exploits the spatio-temporal context for visual tracking. Our approach
formulates the spatio-temporal relationships between the object of interest and
its local context based on a Bayesian framework, which models the statistical
correlation between the low-level features (i.e., image intensity and position)
from the target and its surrounding regions. The tracking problem is posed as
computing a confidence map and obtaining the best target location by
maximizing an object location likelihood function. The Fast Fourier Transform
is adopted for fast learning and detection in this work. Implemented in MATLAB
without code optimization, the proposed tracker runs at 350 frames per second
on an i7 machine. Extensive experimental results show that the proposed
algorithm performs favorably against state-of-the-art methods in terms of
efficiency, accuracy, and robustness.
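The speed comes from doing both learning and detection in the Fourier
domain. Below is a simplified NumPy sketch of that learn/detect cycle: the
context model is obtained by a deconvolution (a division in frequency
space), and detection is a convolution there too. The confidence map and
context prior are synthetic Gaussians, and the online model update over
time is omitted.

# c = h * p (convolution)  =>  H = F(c) / F(p); detect via c' = F^-1(H.F(p')).
import numpy as np

def learn_context_model(conf, prior, eps=1e-6):
    return np.fft.fft2(conf) / (np.fft.fft2(prior) + eps)

def detect(H, prior_new):
    conf = np.real(np.fft.ifft2(H * np.fft.fft2(prior_new)))
    return np.unravel_index(conf.argmax(), conf.shape)  # target location

# synthetic stand-ins: Gaussian confidence map and context prior
size, yx = 64, (32, 32)
yy, xx = np.mgrid[:size, :size]
conf = np.exp(-((yy - yx[0]) ** 2 + (xx - yx[1]) ** 2) / 200.0)
prior = np.exp(-((yy - yx[0]) ** 2 + (xx - yx[1]) ** 2) / 50.0)

H = learn_context_model(conf, prior)
shifted_prior = np.roll(prior, (3, -2), axis=(0, 1))  # target moved
print(detect(H, shifted_prior))  # argmax follows the shift, ~(35, 30)

Because every step is an FFT on a small window, per-frame cost is
O(n log n) in the window size, which is consistent with the reported
hundreds of frames per second.
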
3D-LMNet: Latent Embedding Matching for Accurate and Diverse 3D Point Cloud Reconstruction from a Single Image
3D reconstruction from single view images is an ill-posed problem. Inferring
the hidden regions from self-occluded images is both challenging and ambiguous.
We propose a two-pronged approach to address these issues. To better
incorporate the data prior and generate meaningful reconstructions, we propose
3D-LMNet, a latent embedding matching approach for 3D reconstruction. We first
train a 3D point cloud auto-encoder and then learn a mapping from the 2D image
to the corresponding learnt embedding. To tackle the issue of uncertainty in
the reconstruction, we predict multiple reconstructions that are consistent
with the input view. This is achieved by learning a probabilistic latent space
with a novel view-specific diversity loss. Thorough quantitative and
qualitative analysis is performed to highlight the significance of the proposed
approach. We outperform state-of-the-art approaches on the task of single-view
3D reconstruction on both real and synthetic datasets while generating multiple
plausible reconstructions, demonstrating the generalizability and utility of
our approach.
Comment: Accepted at BMVC 2018; code is available at https://github.com/val-iisc/3d-lmne
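A sketch of the latent-matching stage the abstract describes: with a
point-cloud autoencoder already trained (and frozen here), an image encoder
is trained so its embedding of a view matches the autoencoder's embedding of
the paired shape. Both encoders below are placeholder networks with assumed
sizes, and the view-specific diversity loss is omitted.

# Stage 2 of a latent-matching pipeline: train image encoder to hit the
# frozen point-cloud autoencoder's latent code for the paired shape.
import torch
import torch.nn as nn

latent_dim = 128

image_encoder = nn.Sequential(               # maps a 64x64 RGB view -> z
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, latent_dim))

point_encoder = nn.Sequential(               # stand-in for pretrained encoder
    nn.Conv1d(3, 64, 1), nn.ReLU(), nn.Conv1d(64, latent_dim, 1))
for p in point_encoder.parameters():
    p.requires_grad = False

opt = torch.optim.Adam(image_encoder.parameters(), lr=1e-3)
images, clouds = torch.rand(8, 3, 64, 64), torch.rand(8, 1024, 3)

for step in range(5):                        # latent matching loop
    with torch.no_grad():
        z_target = point_encoder(clouds.transpose(1, 2)).max(dim=2).values
    z_pred = image_encoder(images)
    loss = nn.functional.mse_loss(z_pred, z_target)  # match embeddings
    opt.zero_grad(); loss.backward(); opt.step()
print(f"latent matching loss: {loss.item():.4f}")

Decoding z_pred with the pretrained decoder then yields the reconstruction,
so the image network never has to learn 3D geometry from scratch.
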
Few-shot Compositional Font Generation with Dual Memory
Generating a new font library is a very labor-intensive and time-consuming
job for glyph-rich scripts. Despite the remarkable success of existing font
generation methods, they have significant drawbacks; they require a large
number of reference images to generate a new font set, or they fail to capture
detailed styles with only a few samples. In this paper, we focus on
compositional scripts, a widely used class of writing systems in which each
glyph can be decomposed into several components. By utilizing the
compositionality of compositional scripts, we propose a novel font generation
framework, named Dual Memory-augmented Font Generation Network (DM-Font), which
enables us to generate a high-quality font library with only a few samples. We
employ memory components and global-context awareness in the generator to take
advantage of the compositionality. In the experiments on Korean-handwriting
fonts and Thai-printing fonts, we observe that our method generates samples
of significantly better quality with faithful stylization than
state-of-the-art generation methods, both quantitatively and qualitatively.
Source code is available at https://github.com/clovaai/dmfont
Comment: ECCV 2020 camera-ready
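A toy sketch of the memory idea behind such compositional few-shot
generation: style features extracted from a few reference glyphs are written
into a component-indexed memory, then read back and combined to synthesize
an unseen glyph built from the same components. The component IDs, encoder,
and compose step are all assumptions, far simpler than DM-Font's dual
memory and generator.

# Write per-component style features from references, then reuse them.
import torch
import torch.nn as nn

feat_dim = 64
style_enc = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, feat_dim))

# write phase: store one style feature per component seen in the references
memory = {}
references = {("g", "a"): torch.rand(1, 1, 32, 32),   # glyph made of
              ("g", "b"): torch.rand(1, 1, 32, 32)}   # components a, b
for (glyph, comp), img in references.items():
    memory[comp] = style_enc(img)

# read phase: an unseen glyph composed of known components reuses their
# stored styles; here "compose" is just a sum fed to a tiny decoder
decoder = nn.Linear(feat_dim, 32 * 32)
target_components = ["a", "b"]
style = torch.stack([memory[c] for c in target_components]).sum(dim=0)
new_glyph = decoder(style).view(1, 1, 32, 32)
print(new_glyph.shape)

Because styles are stored per component rather than per glyph, a handful of
reference glyphs can cover the components needed for a whole font library.
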