Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity
A general framework for solving image inverse problems is introduced in this
paper. The approach is based on Gaussian mixture models, estimated via a
computationally efficient MAP-EM algorithm. A dual mathematical interpretation
of the proposed framework with structured sparse estimation is described, which
shows that the resulting piecewise linear estimate stabilizes the estimation
when compared to traditional sparse inverse problem techniques. This
interpretation also suggests an effective dictionary motivated initialization
for the MAP-EM algorithm. We demonstrate that in a number of image inverse
problems, including inpainting, zooming, and deblurring, the same algorithm
produces results that are equal to, often significantly better than, or only
marginally worse than the best published ones, at a lower computational cost.
Comment: 30 page
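The "piecewise linear" idea in the abstract above can be illustrated with a minimal sketch. It assumes a degradation model y = Ux + w with white Gaussian noise of standard deviation sigma, equal mixture weights, and component selection by the Gaussian likelihood of y; the function name and the selection rule are illustrative assumptions, not the paper's implementation. Each Gaussian component yields one linear (Wiener) filter, and picking the best component per signal makes the overall estimator piecewise linear.

```python
import numpy as np

def piecewise_linear_estimate(y, U, means, covs, sigma):
    """Estimate x from y = U x + w, w ~ N(0, sigma^2 I), under a Gaussian
    mixture prior with equal weights (assumed). Returns (x_hat, k)."""
    best_x, best_k, best_score = None, -1, -np.inf
    noise_cov = sigma ** 2 * np.eye(U.shape[0])
    for k, (mu, cov) in enumerate(zip(means, covs)):
        # Covariance of y if x were drawn from component k.
        S = U @ cov @ U.T + noise_cov
        r = y - U @ mu
        S_inv_r = np.linalg.solve(S, r)
        # Linear (Wiener) estimate for this component.
        x_k = mu + cov @ U.T @ S_inv_r
        # Log-likelihood of y under component k, up to a constant.
        _, logdet = np.linalg.slogdet(S)
        score = -0.5 * (r @ S_inv_r + logdet)
        if score > best_score:
            best_x, best_k, best_score = x_k, k, score
    return best_x, best_k

# Toy inpainting: only the first two of four samples are observed.
U = np.eye(4)[:2]
flat = np.array([1.0, 1.0, 1.0, 1.0]) / 2.0
alt = np.array([1.0, -1.0, 1.0, -1.0]) / 2.0
covs = [100.0 * np.outer(v, v) + 1e-3 * np.eye(4) for v in (flat, alt)]
means = [np.zeros(4), np.zeros(4)]
x_hat, k = piecewise_linear_estimate(np.array([2.0, 2.0]), U, means, covs, 0.1)
```

In this toy case the observed samples match the "flat" component, so that component is selected and its linear filter fills in the two missing samples.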
Improvements to context based self-supervised learning
We develop a set of methods to improve on the results of self-supervised
learning using context. We start with a baseline of patch based arrangement
context learning and go from there. Our methods address some overt problems
such as chromatic aberration as well as other potential problems such as
spatial skew and mid-level feature neglect. We prevent problems with testing
generalization on common self-supervised benchmark tests by using different
datasets during our development. The results of our methods combined yield top
scores on all standard self-supervised benchmarks, including classification and
detection on PASCAL VOC 2007, segmentation on PASCAL VOC 2012, and "linear
tests" on the ImageNet and CSAIL Places datasets. We obtain an improvement over
our baseline method of between 4.0 and 7.1 percentage points on transfer
learning classification tests. We also show results on different standard
network architectures to demonstrate generalization as well as portability. All
data, models and programs are available at:
https://gdo-datasci.llnl.gov/selfsupervised/.
Comment: Accepted paper at CVPR 201
Learning to Dress 3D People in Generative Clothing
Three-dimensional human body models are widely used in the analysis of human
pose and motion. Existing models, however, are learned from minimally-clothed
3D scans and thus do not generalize to the complexity of dressed people in
common images and videos. Additionally, current models lack the expressive
power needed to represent the complex non-linear geometry of pose-dependent
clothing shapes. To address this, we learn a generative 3D mesh model of
clothed people from 3D scans with varying pose and clothing. Specifically, we
train a conditional Mesh-VAE-GAN to learn the clothing deformation from the
SMPL body model, making clothing an additional term in SMPL. Our model is
conditioned on both pose and clothing type, giving the ability to draw samples
of clothing to dress different body shapes in a variety of styles and poses. To
preserve wrinkle detail, our Mesh-VAE-GAN extends patchwise discriminators to
3D meshes. Our model, named CAPE, represents global shape and fine local
structure, effectively extending the SMPL body model to clothing. To our
knowledge, this is the first generative model that directly dresses 3D human
body meshes and generalizes to different poses. The model, code and data are
available for research purposes at https://cape.is.tue.mpg.de.
Comment: CVPR-2020 camera ready.
Application for light field inpainting
Light Field (LF) imaging is a multimedia technology that can provide a more immersive experience, with higher levels of realism, than conventional imaging technologies when visualizing multimedia content. It is especially promising for Virtual Reality (VR): thanks to its 4-dimensional representation, it displays real-world scenes in a way that lets users experience the captured scene from every position and angle. For these reasons, LF is a fast-growing technology with many topics to explore; LF inpainting is the one explored in this dissertation.
Image inpainting is an editing technique that synthesizes alternative content to fill holes in an image. It is commonly used to fill in missing parts of a scene and to restore damaged images so that the modifications are correct and visually realistic. Applying traditional 2D inpainting techniques directly to LFs is very unlikely to produce an inpainting that is consistent across all 4 dimensions. Usually, to inpaint 4D LF content, a 2D inpainting algorithm is used to inpaint a particular point of view, and a 4D inpainting propagation algorithm then propagates the inpainted result to the whole 4D LF data.
Based on this idea of 4D inpainting propagation, several 4D LF inpainting techniques have recently been proposed in the literature. This dissertation therefore proposes to design and implement an LF inpainting application that can be used by anyone who wants to work in this field and/or explore and edit LFs.
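The two-stage pipeline described in this abstract (inpaint one view in 2D, then propagate to the remaining views) can be sketched as follows. This is a deliberately naive illustration, not the dissertation's method: it assumes a grayscale light field stored as an array of shape (U, V, H, W), an already inpainted center view, and a single constant integer disparity inside the hole. The function name and all parameters are hypothetical.

```python
import numpy as np

def propagate_inpainting(lf, mask, inpainted_center, disparity):
    """Copy inpainted center-view pixels into every view of a light field.

    lf:               (U, V, H, W) grayscale light field
    mask:             (H, W) boolean hole mask in the center view
    inpainted_center: (H, W) center view after 2D inpainting
    disparity:        integer pixel shift per unit of view offset,
                      assumed constant over the hole (a strong assumption)
    """
    U, V, H, W = lf.shape
    uc, vc = U // 2, V // 2
    out = lf.copy()
    ys, xs = np.nonzero(mask)
    for u in range(U):
        for v in range(V):
            # Shift hole pixels according to the view offset.
            ty = ys + disparity * (u - uc)
            tx = xs + disparity * (v - vc)
            # Discard pixels that fall outside the image.
            ok = (ty >= 0) & (ty < H) & (tx >= 0) & (tx < W)
            out[u, v, ty[ok], tx[ok]] = inpainted_center[ys[ok], xs[ok]]
    return out

# Toy example: a 3x3 grid of 8x8 views with a one-pixel hole.
lf = np.zeros((3, 3, 8, 8))
mask = np.zeros((8, 8), dtype=bool)
mask[4, 4] = True
center = np.zeros((8, 8))
center[4, 4] = 5.0
result = propagate_inpainting(lf, mask, center, disparity=1)
```

A real propagation step would need per-pixel disparity (depth) estimates and occlusion handling; the constant shift here only shows how a single 2D result is reused consistently across the angular dimensions.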
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field.
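As one concrete illustration of generalizing a neural layer to graphs, the sketch below implements a single graph convolution with symmetric degree normalization and self-loops. This particular layer is a commonly used formulation chosen here for illustration; it is an assumption, not something taken from the abstract, and the function name is hypothetical. The closing check shows the structural property the abstract alludes to: relabeling the nodes permutes the output in the same way.

```python
import numpy as np

def graph_conv(A, X, W):
    """One graph convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    A: (n, n) symmetric adjacency matrix
    X: (n, f_in) node features
    W: (f_in, f_out) weights shared by all nodes
    """
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1)                 # node degrees (>= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)  # aggregate, mix, ReLU

# Path graph on three nodes: 0 - 1 - 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
H = graph_conv(A, X, W)

# Relabeling the nodes permutes the rows of the output identically.
perm = np.array([2, 0, 1])
H_perm = graph_conv(A[perm][:, perm], X[perm], W)
```

Because the weights are shared across nodes and aggregation depends only on the adjacency structure, the layer does not depend on any particular node ordering, which is the graph analogue of the translation structure built into grid convolutions.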