2,610 research outputs found
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.Comment: 10 pages, 19 figure
GASP : Geometric Association with Surface Patches
A fundamental challenge to sensory processing tasks in perception and
robotics is the problem of obtaining data associations across views. We present
a robust solution for ascertaining potentially dense surface patch (superpixel)
associations, requiring just range information. Our approach involves
decomposition of a view into regularized surface patches. We represent them as
sequences expressing geometry invariantly over their superpixel neighborhoods,
as uniquely consistent partial orderings. We match these representations
through an optimal sequence comparison metric based on the Damerau-Levenshtein
distance - enabling robust association with quadratic complexity (in contrast
to hitherto employed joint matching formulations which are NP-complete). The
approach is able to perform under wide baselines, heavy rotations, partial
overlaps, significant occlusions and sensor noise.
The technique does not require any priors -- motion or otherwise, and does
not make restrictive assumptions on scene structure and sensor movement. It
does not require appearance -- is hence more widely applicable than appearance
reliant methods, and invulnerable to related ambiguities such as textureless or
aliased content. We present promising qualitative and quantitative results
under diverse settings, along with comparatives with popular approaches based
on range as well as RGB-D data.Comment: International Conference on 3D Vision, 201
Graph Spectral Image Processing
Recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation
PSACNN: Pulse Sequence Adaptive Fast Whole Brain Segmentation
With the advent of convolutional neural networks~(CNN), supervised learning
methods are increasingly being used for whole brain segmentation. However, a
large, manually annotated training dataset of labeled brain images required to
train such supervised methods is frequently difficult to obtain or create. In
addition, existing training datasets are generally acquired with a homogeneous
magnetic resonance imaging~(MRI) acquisition protocol. CNNs trained on such
datasets are unable to generalize on test data with different acquisition
protocols. Modern neuroimaging studies and clinical trials are necessarily
multi-center initiatives with a wide variety of acquisition protocols. Despite
stringent protocol harmonization practices, it is very difficult to standardize
the gamut of MRI imaging parameters across scanners, field strengths, receive
coils etc., that affect image contrast. In this paper we propose a CNN-based
segmentation algorithm that, in addition to being highly accurate and fast, is
also resilient to variation in the input acquisition. Our approach relies on
building approximate forward models of pulse sequences that produce a typical
test image. For a given pulse sequence, we use its forward model to generate
plausible, synthetic training examples that appear as if they were acquired in
a scanner with that pulse sequence. Sampling over a wide variety of pulse
sequences results in a wide variety of augmented training examples that help
build an image contrast invariant model. Our method trains a single CNN that
can segment input MRI images with acquisition parameters as disparate as
-weighted and -weighted contrasts with only -weighted training
data. The segmentations generated are highly accurate with state-of-the-art
results~(overall Dice overlap), with a fast run time~( 45
seconds), and consistent across a wide range of acquisition protocols.Comment: Typo in author name corrected. Greves -> Grev
Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling
We frame the task of predicting a semantic labeling as a sparse
reconstruction procedure that applies a target-specific learned transfer
function to a generic deep sparse code representation of an image. This
strategy partitions training into two distinct stages. First, in an
unsupervised manner, we learn a set of generic dictionaries optimized for
sparse coding of image patches. We train a multilayer representation via
recursive sparse dictionary learning on pooled codes output by earlier layers.
Second, we encode all training images with the generic dictionaries and learn a
transfer function that optimizes reconstruction of patches extracted from
annotated ground-truth given the sparse codes of their corresponding image
patches. At test time, we encode a novel image using the generic dictionaries
and then reconstruct using the transfer function. The output reconstruction is
a semantic labeling of the test image.
Applying this strategy to the task of contour detection, we demonstrate
performance competitive with state-of-the-art systems. Unlike almost all prior
work, our approach obviates the need for any form of hand-designed features or
filters. To illustrate general applicability, we also show initial results on
semantic part labeling of human faces.
The effectiveness of our approach opens new avenues for research on deep
sparse representations. Our classifiers utilize this representation in a novel
manner. Rather than acting on nodes in the deepest layer, they attach to nodes
along a slice through multiple layers of the network in order to make
predictions about local patches. Our flexible combination of a generatively
learned sparse representation with discriminatively trained transfer
classifiers extends the notion of sparse reconstruction to encompass arbitrary
semantic labeling tasks.Comment: to appear in Asian Conference on Computer Vision (ACCV), 201
Motion Deblurring in the Wild
The task of image deblurring is a very ill-posed problem as both the image
and the blur are unknown. Moreover, when pictures are taken in the wild, this
task becomes even more challenging due to the blur varying spatially and the
occlusions between the object. Due to the complexity of the general image model
we propose a novel convolutional network architecture which directly generates
the sharp image.This network is built in three stages, and exploits the
benefits of pyramid schemes often used in blind deconvolution. One of the main
difficulties in training such a network is to design a suitable dataset. While
useful data can be obtained by synthetically blurring a collection of images,
more realistic data must be collected in the wild. To obtain such data we use a
high frame rate video camera and keep one frame as the sharp image and frame
average as the corresponding blurred image. We show that this realistic dataset
is key in achieving state-of-the-art performance and dealing with occlusions
- …