6,452 research outputs found
Histogram of gradients of Time-Frequency Representations for Audio scene detection
This paper addresses the problem of audio scenes classification and
contributes to the state of the art by proposing a novel feature. We build this
feature by considering histogram of gradients (HOG) of time-frequency
representation of an audio scene. Contrarily to classical audio features like
MFCC, we make the hypothesis that histogram of gradients are able to encode
some relevant informations in a time-frequency {representation:} namely, the
local direction of variation (in time and frequency) of the signal spectral
power. In addition, in order to gain more invariance and robustness, histogram
of gradients are locally pooled. We have evaluated the relevance of {the novel
feature} by comparing its performances with state-of-the-art competitors, on
several datasets, including a novel one that we provide, as part of our
contribution. This dataset, that we make publicly available, involves
classes and contains about minutes of audio scene recording. We thus
believe that it may be the next standard dataset for evaluating audio scene
classification algorithms. Our comparison results clearly show that our
HOG-based features outperform its competitor
Perturbation of the Eigenvectors of the Graph Laplacian: Application to Image Denoising
The original contributions of this paper are twofold: a new understanding of
the influence of noise on the eigenvectors of the graph Laplacian of a set of
image patches, and an algorithm to estimate a denoised set of patches from a
noisy image. The algorithm relies on the following two observations: (1) the
low-index eigenvectors of the diffusion, or graph Laplacian, operators are very
robust to random perturbations of the weights and random changes in the
connections of the patch-graph; and (2) patches extracted from smooth regions
of the image are organized along smooth low-dimensional structures in the
patch-set, and therefore can be reconstructed with few eigenvectors.
Experiments demonstrate that our denoising algorithm outperforms the denoising
gold-standards
Rotation-invariant features for multi-oriented text detection in natural images.
Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes
Shape from periodic texture using the eigenvectors of local affine distortion
This paper shows how the local slant and tilt angles of regularly textured curved surfaces can be estimated directly, without the need for iterative numerical optimization, We work in the frequency domain and measure texture distortion using the affine distortion of the pattern of spectral peaks. The key theoretical contribution is to show that the directions of the eigenvectors of the affine distortion matrices can be used to estimate local slant and tilt angles of tangent planes to curved surfaces. In particular, the leading eigenvector points in the tilt direction. Although not as geometrically transparent, the direction of the second eigenvector can be used to estimate the slant direction. The required affine distortion matrices are computed using the correspondences between spectral peaks, established on the basis of their energy ordering. We apply the method to a variety of real-world and synthetic imagery
AGE ESTIMATION USING NEURAL NETWORKS BASED ON FACE IMAGES WITH STUDY OF DIFFERENT FEATURE EXTRACTION METHODS
 Facial age estimation recently becomes active research topic in pattern recognition. As there are vast potential application in age specific human computer interaction security control and surveillance monitoring. Insufficient and incomplete training data, uncontrollable environment, facial expression are the most prominent challenges in facial age estimation. Degree of accuracy for age estimation is obtained by forming appropriate feature vector of a facial image. Feature vectors are constructed from facial features. Therefore comparative study of feature extraction from facial image by bio inspired feature (BIF), histogram of gradient (HOG), Gabor filter, wavelet transform and scattering transform is done. The propose approach exploits scattering transform gives more information about features of the facial images. Well organized system consist scattering transform that disperse gabber coefficients pulling with smooth gaussian process in number of layers which isused to calculate for facial feature representation. These extracted features are classified using support vector machine and artificial neural network
- …