2,388 research outputs found
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field
Spotlight the Negatives: A Generalized Discriminative Latent Model
Discriminative latent variable models (LVM) are frequently applied to various
visual recognition tasks. In these systems the latent (hidden) variables
provide a formalism for modeling structured variation of visual features.
Conventionally, latent variables are de- fined on the variation of the
foreground (positive) class. In this work we augment LVMs to include negative
latent variables corresponding to the background class. We formalize the
scoring function of such a generalized LVM (GLVM). Then we discuss a framework
for learning a model based on the GLVM scoring function. We theoretically
showcase how some of the current visual recognition methods can benefit from
this generalization. Finally, we experiment on a generalized form of Deformable
Part Models with negative latent variables and show significant improvements on
two different detection tasks.Comment: Published in proceedings of BMVC 201
CNN Features off-the-shelf: an Astounding Baseline for Recognition
Recent results indicate that the generic descriptors extracted from the
convolutional neural networks are very powerful. This paper adds to the
mounting evidence that this is indeed the case. We report on a series of
experiments conducted for different recognition tasks using the publicly
available code and model of the \overfeat network which was trained to perform
object classification on ILSVRC13. We use features extracted from the \overfeat
network as a generic image representation to tackle the diverse range of
recognition tasks of object image classification, scene recognition, fine
grained recognition, attribute detection and image retrieval applied to a
diverse set of datasets. We selected these tasks and datasets as they gradually
move further away from the original task and data the \overfeat network was
trained to solve. Astonishingly, we report consistent superior results compared
to the highly tuned state-of-the-art systems in all the visual classification
tasks on various datasets. For instance retrieval it consistently outperforms
low memory footprint methods except for sculptures dataset. The results are
achieved using a linear SVM classifier (or distance in case of retrieval)
applied to a feature representation of size 4096 extracted from a layer in the
net. The representations are further modified using simple augmentation
techniques e.g. jittering. The results strongly suggest that features obtained
from deep learning with convolutional nets should be the primary candidate in
most visual recognition tasks.Comment: version 3 revisions: 1)Added results using feature processing and
data augmentation 2)Referring to most recent efforts of using CNN for
different visual recognition tasks 3) updated text/captio
Object Detection using Dimensionality Reduction on Image Descriptors
The aim of object detection is to recognize objects in a visual scene. Performing reliable object detection is becoming increasingly important in the fields of computer vision and robotics. Various applications of object detection include video surveillance, traffic monitoring, digital libraries, navigation, human computer interaction, etc. The challenges involved with detecting real world objects include the multitude of colors, textures, sizes, and cluttered or complex backgrounds making objects difficult to detect.
This thesis contributes to the exploration of various dimensionality reduction techniques on descriptors for establishing an object detection system that achieves the best trade-offs between performance and speed. Histogram of Oriented Gradients (HOG) and other histogram-based descriptors were used as an input to a Support Vector Machine (SVM) classifier to achieve good classification performance. Binary descriptors were considered as a computationally efficient alternative to HOG. It was determined that single local binary descriptors in combination with Support Vector Machine (SVM) classifier don\u27t work as well as histograms of features for object detection. Thus, histogram of binary descriptors features were explored as a viable alternative and the results were found to be comparable to those of the popular Histogram of Oriented Gradients descriptor.
Histogram-based descriptors can be high dimensional and working with large amounts of data can be computationally expensive and slow. Thus, various dimensionality reduction techniques were considered, such as principal component analysis (PCA), which is the most widely used technique, random projections, which is data independent and fast to compute, unsupervised locality preserving projections (LPP), and supervised locality preserving projections (SLPP), which incorporate non-linear reduction techniques.
The classification system was tested on eye detection as well as different object classes. The eye database was created using BioID and FERET databases. Additionally, the CalTech-101 data set, which has 101 object categories, was used to evaluate the system. The results showed that the reduced-dimensionality descriptors based on SLPP gave improved classification performance with fewer computations
Learning the dynamics and time-recursive boundary detection of deformable objects
We propose a principled framework for recursively segmenting deformable objects across a sequence
of frames. We demonstrate the usefulness of this method on left ventricular segmentation across a cardiac
cycle. The approach involves a technique for learning the system dynamics together with methods of
particle-based smoothing as well as non-parametric belief propagation on a loopy graphical model capturing
the temporal periodicity of the heart. The dynamic system state is a low-dimensional representation
of the boundary, and the boundary estimation involves incorporating curve evolution into recursive state
estimation. By formulating the problem as one of state estimation, the segmentation at each particular
time is based not only on the data observed at that instant, but also on predictions based on past and future
boundary estimates. Although the paper focuses on left ventricle segmentation, the method generalizes
to temporally segmenting any deformable object
- …