Probabilistic Label Relation Graphs with Ising Models
We consider classification problems in which the label space has structure. A
common example is hierarchical label spaces, corresponding to the case where
one label subsumes another (e.g., animal subsumes dog). But labels can also be
mutually exclusive (e.g., dog vs cat) or unrelated (e.g., furry, carnivore). To
jointly model hierarchy and exclusion relations, the notion of a HEX (hierarchy
and exclusion) graph was introduced in [7]. This combined a conditional random
field (CRF) with a deep neural network (DNN), resulting in state-of-the-art
results when applied to visual object classification problems where the
training labels were drawn from different levels of the ImageNet hierarchy
(e.g., an image might be labeled with the basic level category "dog", rather
than the more specific label "husky"). In this paper, we extend the HEX model
to allow for soft or probabilistic relations between labels, which is useful
when there is uncertainty about the relationship between two labels (e.g., an
antelope is "sort of" furry, but not to the same degree as a grizzly bear). We
call our new model pHEX, for probabilistic HEX. We show that the pHEX graph can
be converted to an Ising model, which allows us to use existing off-the-shelf
inference methods (in contrast to the HEX method, which needed specialized
inference algorithms). Experimental results show significant improvements in a
number of large-scale visual object classification tasks, outperforming the
previous HEX model.
Comment: International Conference on Computer Vision (2015)
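The contrast between hard and soft label relations can be illustrated with a toy pairwise binary model (equivalent to an Ising model up to a change of variables). This is only a sketch, not the pHEX formulation from the paper: the function name `label_marginals`, the single shared `penalty`, and the brute-force enumeration are all assumptions made for illustration. A very large penalty approximates the hard HEX constraints, while a small one leaves the relations soft.

```python
import itertools
import math


def label_marginals(labels, unary, subsumes, excludes, penalty):
    """Brute-force marginals of a toy pairwise model over binary labels.

    unary:    dict label -> score for switching that label on
    subsumes: list of (parent, child) pairs; child=1 with parent=0 is penalized
    excludes: list of (a, b) pairs; a=1 together with b=1 is penalized
    penalty:  hypothetical shared strength of every soft relation
    """
    index = {name: i for i, name in enumerate(labels)}
    weights = {}
    for state in itertools.product((0, 1), repeat=len(labels)):
        score = sum(unary[name] * state[index[name]] for name in labels)
        for parent, child in subsumes:
            if state[index[child]] == 1 and state[index[parent]] == 0:
                score -= penalty
        for a, b in excludes:
            if state[index[a]] == 1 and state[index[b]] == 1:
                score -= penalty
        weights[state] = math.exp(score)
    total = sum(weights.values())
    return {
        name: sum(w for s, w in weights.items() if s[index[name]] == 1) / total
        for name in labels
    }


# Illustrative scores only; they are not taken from the paper.
labels = ["animal", "dog", "cat"]
unary = {"animal": 0.0, "dog": 1.0, "cat": 0.5}
subsumes = [("animal", "dog"), ("animal", "cat")]
excludes = [("dog", "cat")]
near_hard = label_marginals(labels, unary, subsumes, excludes, penalty=50.0)
soft = label_marginals(labels, unary, subsumes, excludes, penalty=1.0)
```

Enumeration is exponential in the number of labels, so it only serves to make the model concrete; at realistic scale one would substitute an approximate inference method.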
ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans
We introduce ScanComplete, a novel data-driven approach for taking an
incomplete 3D scan of a scene as input and predicting a complete 3D model along
with per-voxel semantic labels. The key contribution of our method is its
ability to handle large scenes with varying spatial extent, managing the cubic
growth in data size as scene size increases. To this end, we devise a
fully-convolutional generative 3D CNN model whose filter kernels are invariant
to the overall scene size. The model can be trained on scene subvolumes but
deployed on arbitrarily large scenes at test time. In addition, we propose a
coarse-to-fine inference strategy in order to produce high-resolution output
while also leveraging large input context sizes. In an extensive series of
experiments, we carefully evaluate different model design choices, considering
both deterministic and probabilistic models for completion and semantic
inference. Our results show that we outperform other methods not only in the
size of the environments handled and processing efficiency, but also with
regard to completion quality and semantic segmentation performance by a
significant margin.
Comment: Video: https://youtu.be/5s5s8iH0NF
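The property that lets a model be trained on subvolumes and deployed on larger scenes can be seen in a one-dimensional toy: a convolution kernel has a fixed size, so it applies unchanged to inputs of any length, and away from the borders the result on a crop matches the result on the full input. The sketch below is an assumption-laden illustration in pure Python (the helper `conv1d_same` is hypothetical), not the 3D network from the paper.

```python
def conv1d_same(signal, kernel):
    """Zero-padded sliding-window filter; the output has the input's length.

    The kernel length must be odd. Strictly this is cross-correlation,
    which is what most deep learning libraries call convolution.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


kernel = [0.25, 0.5, 0.25]                    # same weights for every input size
scene = [float(i % 7) for i in range(64)]     # stand-in for a large scene
crop = scene[16:32]                           # stand-in for a training subvolume

full_result = conv1d_same(scene, kernel)
crop_result = conv1d_same(crop, kernel)
# Interior of the crop agrees with the matching slice of the full result.
interior_matches = crop_result[1:-1] == full_result[17:31]
```

Only the border positions differ, because the crop sees zero padding where the full scene has real data; this is one reason larger input context can help.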
GRASS: Generative Recursive Autoencoders for Shape Structures
We introduce a novel neural network architecture for encoding and synthesis
of 3D shapes, particularly their structures. Our key insight is that 3D shapes
are effectively characterized by their hierarchical organization of parts,
which reflects fundamental intra-shape relationships such as adjacency and
symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a
flat, unlabeled, arbitrary part layout to a compact code. The code effectively
captures hierarchical structures of man-made 3D objects of varying structural
complexities despite being fixed-dimensional: an associated decoder maps a code
back to a full hierarchy. The learned bidirectional mapping is further tuned
using an adversarial setup to yield a generative model of plausible structures,
from which novel structures can be sampled. Finally, our structure synthesis
framework is augmented by a second trained module that produces fine-grained
part geometry, conditioned on global and local structural context, leading to a
full generative pipeline for 3D shapes. We demonstrate that without
supervision, our network learns meaningful structural hierarchies adhering to
perceptual grouping principles, produces compact codes which enable
applications such as shape classification and partial matching, and supports
shape synthesis and interpolation with significant variations in topology and
geometry.
Comment: Corresponding author: Kai Xu ([email protected])
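How a recursive encoder yields a fixed-dimensional code for hierarchies of varying complexity can be sketched as follows. This is a deliberately reduced illustration, not the GRASS architecture: it uses a single untrained merge layer with random weights, whereas the paper describes a trained autoencoder; the names `make_merge_layer` and `encode` and the tuple-based tree encoding are assumptions.

```python
import math
import random


def make_merge_layer(code_size, seed=0):
    """Random (untrained) weights mapping two child codes to one parent code."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.5, 0.5) for _ in range(2 * code_size)]
        for _ in range(code_size)
    ]


def encode(node, layer):
    """Fold a binary tree into one code vector.

    A tuple (left, right) is an internal node; anything else is treated
    as a leaf that already holds a code vector of the right size.
    """
    if isinstance(node, tuple):
        left, right = node
        joined = encode(left, layer) + encode(right, layer)
        return [math.tanh(sum(w * x for w, x in zip(row, joined))) for row in layer]
    return list(node)


layer = make_merge_layer(code_size=4)
part = [0.1, 0.2, 0.3, 0.4]                        # hypothetical part feature
two_parts = (part, part)
five_parts = ((part, part), (part, (part, part)))
code_small = encode(two_parts, layer)
code_large = encode(five_parts, layer)             # same length as code_small
```

Because each merge maps two codes to one code of the same size, the root code length does not depend on how many parts the hierarchy contains.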
Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future
Clark gives a powerful description of the brain as a prediction machine, which offers progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, hierarchical prediction offers progress on a concrete descriptive level for testing and constraining conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models).
Predictive coding: A Possible Explanation of Filling-in at the blind spot
Filling-in at the blind-spot is a perceptual phenomenon in which the visual
system fills the informational void, which arises due to the absence of retinal
input corresponding to the optic disc, with surrounding visual attributes.
Though there is enough evidence to conclude that some kind of neural
computation is involved in filling-in at the blind spot, especially in the early
visual cortex, knowledge of the actual computational mechanism is far from
complete. We have investigated the bar experiments and the associated
filling-in phenomenon in the light of the hierarchical predictive coding
framework, where the blind-spot was represented by the absence of early
feed-forward connection. We recorded the responses of predictive estimator
neurons at the blind-spot region in the V1 area of our three-level (LGN-V1-V2)
model network. These responses agree with the results of earlier
physiological studies, and using the generative model we also showed that these
response profiles indeed represent filling-in completion. These findings demonstrate
that the predictive coding framework could account for the filling-in phenomena
observed in several psychophysical and physiological experiments involving bar
stimuli. These results suggest that the filling-in could naturally arise from
the computational principle of hierarchical predictive coding (HPC) of natural
images.
Comment: 23 pages, 9 figures
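The idea that filling-in can fall out of prediction-error minimization can be shown with a single-layer toy. The sketch below is not the three-level LGN-V1-V2 network from the paper: it assumes one linear generative layer, a hand-set "bar" basis, and a `mask` that zeroes the feed-forward error at the blind-spot pixel; all names are hypothetical.

```python
def infer_causes(weights, image, mask, steps=50, rate=0.1):
    """Estimate hidden causes by gradient steps on masked prediction error.

    weights[p][c] is the generative weight from cause c to pixel p.
    mask[p] is 0 where feed-forward input is absent (the blind spot), else 1.
    Returns the estimated causes and the top-down prediction for every pixel.
    """
    n_pixels = len(weights)
    n_causes = len(weights[0])
    causes = [0.0] * n_causes

    def predict():
        return [
            sum(weights[p][c] * causes[c] for c in range(n_causes))
            for p in range(n_pixels)
        ]

    for _ in range(steps):
        prediction = predict()
        # Error is only signalled where feed-forward input exists.
        error = [mask[p] * (image[p] - prediction[p]) for p in range(n_pixels)]
        for c in range(n_causes):
            causes[c] += rate * sum(weights[p][c] * error[p] for p in range(n_pixels))
    return causes, predict()


# One learned "bar" spanning five pixels; the middle pixel has no input.
bar_basis = [[1.0], [1.0], [1.0], [1.0], [1.0]]
image = [1.0, 1.0, 0.0, 1.0, 1.0]
mask = [1, 1, 0, 1, 1]
causes, prediction = infer_causes(bar_basis, image, mask)
```

Because the surrounding pixels drive the bar estimate and the blind-spot pixel contributes no error, the top-down prediction at the masked position approaches the bar's value, which is the toy analogue of filling-in.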