2,073 research outputs found
Log-Euclidean Bag of Words for Human Action Recognition
Representing videos by densely extracted local space-time features has
recently become a popular approach for analysing actions. In this paper, we
tackle the problem of categorising human actions by devising Bag of Words (BoW)
models based on covariance matrices of spatio-temporal features, with the
features formed from histograms of optical flow. Since covariance matrices form
a special type of Riemannian manifold, the space of Symmetric Positive Definite
(SPD) matrices, non-Euclidean geometry should be taken into account while
discriminating between covariance matrices. To this end, we propose to embed
SPD manifolds to Euclidean spaces via a diffeomorphism and extend the BoW
approach to its Riemannian version. The proposed BoW approach takes into
account the manifold geometry of SPD matrices during the generation of the
codebook and histograms. Experiments on challenging human action datasets show
that the proposed method obtains notable improvements in discrimination
accuracy, in comparison to several state-of-the-art methods
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field
Geometric deep learning
The goal of these course notes is to describe the main mathematical ideas behind geometric deep learning and to provide implementation details for several applications in shape analysis and synthesis, computer vision and computer graphics. The text in the course materials is primarily based on previously published work. With these notes we gather and provide a clear picture of the key concepts and techniques that fall under the umbrella of geometric deep learning, and illustrate the applications they enable. We also aim to provide practical implementation details for the methods presented in these works, as well as suggest further readings and extensions of these ideas
Learning multi-view neighborhood preserving projections
We address the problem of metric learning for multi-view data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful within-view as well as between-view similarity. Our motivation stems from the problem of cross-media retrieval tasks, where the availability of a joint Euclidean distance function is a prerequisite to allow fast, in particular hashing-based, nearest neighbor queries. We formulate an objective function that expresses the intuitive concept that matching samples are mapped closely together in the output space, whereas non-matching samples are pushed apart, no matter in which view they are available. The resulting optimization problem is not convex, but it can be decomposed explicitly into a convex and a concave part, thereby allowing efficient optimization using the convex-concave procedure. Experiments on an image retrieval task show that nearest-neighbor based cross-view retrieval is indeed possible, and the proposed technique improves the retrieval accuracy over baseline techniques
Fast -NNG construction with GPU-based quick multi-select
In this paper we describe a new brute force algorithm for building the
-Nearest Neighbor Graph (-NNG). The -NNG algorithm has many
applications in areas such as machine learning, bio-informatics, and clustering
analysis. While there are very efficient algorithms for data of low dimensions,
for high dimensional data the brute force search is the best algorithm. There
are two main parts to the algorithm: the first part is finding the distances
between the input vectors which may be formulated as a matrix multiplication
problem. The second is the selection of the -NNs for each of the query
vectors. For the second part, we describe a novel graphics processing unit
(GPU) -based multi-select algorithm based on quick sort. Our optimization makes
clever use of warp voting functions available on the latest GPUs along with
use-controlled cache. Benchmarks show significant improvement over
state-of-the-art implementations of the -NN search on GPUs
- …