Histogram of gradients of Time-Frequency Representations for Audio scene detection
This paper addresses the problem of audio scene classification and
contributes to the state of the art by proposing a novel feature. We build this
feature by considering the histogram of gradients (HOG) of a time-frequency
representation of an audio scene. We hypothesize that, unlike classical audio
features such as MFCC, histograms of gradients can encode relevant
information in a time-frequency representation: namely, the
local direction of variation (in time and frequency) of the signal spectral
power. In addition, to gain more invariance and robustness, histograms
of gradients are locally pooled. We have evaluated the relevance of the novel
feature by comparing its performance with state-of-the-art competitors on
several datasets, including a novel one that we provide as part of our
contribution. This dataset, which we make publicly available, involves
classes and contains about minutes of audio scene recording. We thus
believe that it may be the next standard dataset for evaluating audio scene
classification algorithms. Our comparison results clearly show that our
HOG-based features outperform their competitors.
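The feature described in this abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the log-power STFT spectrogram, the frame length, the 8x8 cell size, the 8 unsigned orientation bins, and the average-over-time pooling are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-power spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10).T  # shape: (frequency bins, time frames)

def hog_features(tf_image, n_bins=8, cell=(8, 8)):
    """Per-cell orientation histograms, weighted by gradient magnitude."""
    d_freq, d_time = np.gradient(tf_image)  # derivatives along axis 0 and axis 1
    magnitude = np.hypot(d_freq, d_time)
    # Unsigned orientation in [0, pi): local direction of variation of spectral power.
    angle = np.mod(np.arctan2(d_freq, d_time), np.pi)
    bin_idx = np.minimum((angle / np.pi * n_bins).astype(int), n_bins - 1)
    n_f, n_t = tf_image.shape[0] // cell[0], tf_image.shape[1] // cell[1]
    hists = np.zeros((n_f, n_t, n_bins))
    for i in range(n_f):
        for j in range(n_t):
            sl = (slice(i * cell[0], (i + 1) * cell[0]),
                  slice(j * cell[1], (j + 1) * cell[1]))
            hists[i, j] = np.bincount(bin_idx[sl].ravel(),
                                      weights=magnitude[sl].ravel(),
                                      minlength=n_bins)
    return hists

def pool_over_time(hists):
    """Assumed pooling: average the cell histograms over time, then L2-normalize."""
    pooled = hists.mean(axis=1).ravel()
    return pooled / (np.linalg.norm(pooled) + 1e-12)

if __name__ == "__main__":
    t = np.arange(8000) / 8000.0
    chirp = np.sin(2 * np.pi * (200 + 1500 * t) * t)  # synthetic rising tone
    feature = pool_over_time(hog_features(spectrogram(chirp)))
    print(feature.shape)
```

Averaging over time discards when an event occurs while keeping which frequency bands show which directions of variation; other pooling schemes would trade invariance against detail differently.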
Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling
We frame the task of predicting a semantic labeling as a sparse
reconstruction procedure that applies a target-specific learned transfer
function to a generic deep sparse code representation of an image. This
strategy partitions training into two distinct stages. First, in an
unsupervised manner, we learn a set of generic dictionaries optimized for
sparse coding of image patches. We train a multilayer representation via
recursive sparse dictionary learning on pooled codes output by earlier layers.
Second, we encode all training images with the generic dictionaries and learn a
transfer function that optimizes reconstruction of patches extracted from
annotated ground-truth given the sparse codes of their corresponding image
patches. At test time, we encode a novel image using the generic dictionaries
and then reconstruct using the transfer function. The output reconstruction is
a semantic labeling of the test image.
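The two-stage procedure above can be illustrated with a deliberately small sketch. It makes several assumptions that are not from the abstract: a single-layer dictionary `D` that is already given (the paper learns a multilayer one), orthogonal matching pursuit as the sparse coder, and a ridge-regularized linear map standing in for the learned transfer function. All function names are hypothetical.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x with at most k columns of D."""
    residual, support = x.copy(), []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:  # no new atom improves the fit
            break
        support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def encode(D, patches, k=3):
    """Sparse-code each image patch (one per row) with the generic dictionary."""
    return np.stack([omp(D, p, k) for p in patches])

def fit_transfer(codes, label_patches, ridge=1e-3):
    """Assumed transfer function: linear map W with codes @ W ~ label patches."""
    A = codes.T @ codes + ridge * np.eye(codes.shape[1])
    return np.linalg.solve(A, codes.T @ label_patches)

def predict(D, W, patches, k=3):
    """Test time: encode with the generic dictionary, reconstruct labels with W."""
    return encode(D, patches, k) @ W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((16, 32))
    D /= np.linalg.norm(D, axis=0)              # unit-norm atoms
    image_patches = rng.standard_normal((40, 16))
    label_patches = rng.standard_normal((40, 4))  # stand-in for ground-truth patches
    W = fit_transfer(encode(D, image_patches), label_patches)
    print(predict(D, W, image_patches).shape)
```

The point of the split is that `D` never sees the labels: only `W` is task-specific, so the same codes could in principle be reused with a different transfer map for another labeling task.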
Applying this strategy to the task of contour detection, we demonstrate
performance competitive with state-of-the-art systems. Unlike almost all prior
work, our approach obviates the need for any form of hand-designed features or
filters. To illustrate general applicability, we also show initial results on
semantic part labeling of human faces.
The effectiveness of our approach opens new avenues for research on deep
sparse representations. Our classifiers utilize this representation in a novel
manner. Rather than acting on nodes in the deepest layer, they attach to nodes
along a slice through multiple layers of the network in order to make
predictions about local patches. Our flexible combination of a generatively
learned sparse representation with discriminatively trained transfer
classifiers extends the notion of sparse reconstruction to encompass arbitrary
semantic labeling tasks.
Comment: to appear in Asian Conference on Computer Vision (ACCV), 201
Morphing Ensemble Kalman Filters
A new type of ensemble filter is proposed, which combines an ensemble Kalman
filter (EnKF) with the ideas of morphing and registration from image
processing. This results in filters suitable for nonlinear problems whose
solutions exhibit moving coherent features, such as thin interfaces in wildfire
modeling. The ensemble members are represented as the composition of one common
state with a spatial transformation, called registration mapping, plus a
residual. A fully automatic registration method is used that requires only
gridded data, so the features in the model state do not need to be identified
by the user. The morphing EnKF operates on a transformed state consisting of
the registration mapping and the residual. Essentially, the morphing EnKF uses
intermediate states obtained by morphing instead of linear combinations of the
states.
Comment: 17 pages, 7 figures. Added DDDAS references to the introduction.
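A one-dimensional sketch may help show why morphing differs from a linear combination of states. It assumes the convention that a state is written as (u0 + r) ∘ (I + T), with common state u0, registration mapping T, and residual r, and that intermediate states are (u0 + λr) ∘ (I + λT); here T is supplied by hand rather than found by automatic registration, and the function name `morph` is hypothetical.

```python
import numpy as np

def morph(x, u0, T, r, lam):
    """Intermediate state (u0 + lam*r) o (I + lam*T) on a 1-D grid.

    The composition is evaluated by linear interpolation of the grid values.
    """
    return np.interp(x + lam * T, x, u0 + lam * r)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 201)
    u0 = np.exp(-((x - 0.3) / 0.03) ** 2)   # thin feature centered at 0.3
    T = np.full_like(x, -0.3)               # mapping that moves the feature to 0.6
    r = np.zeros_like(x)                    # pure displacement, no amplitude change
    u1 = morph(x, u0, T, r, 1.0)
    halfway = morph(x, u0, T, r, 0.5)
    average = 0.5 * (u0 + u1)
    print("morphed peak position:", x[np.argmax(halfway)])
    print("morphed peak height:", halfway.max())
    print("linear-average peak height:", average.max())
```

In this toy case the morphed state keeps a single full-height feature partway between the two positions, whereas the linear average produces two half-height features, which is the failure mode for thin moving interfaces that the abstract alludes to.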