A quick search method for audio signals based on a piecewise linear representation of feature trajectories
This paper presents a new method for a quick similarity-based search through
long unlabeled audio streams to detect and locate audio clips provided by
users. The method involves feature-dimension reduction based on a piecewise
linear representation of a sequential feature trajectory extracted from a long
audio stream. Two techniques enable us to obtain a piecewise linear
representation: the dynamic segmentation of feature trajectories and the
segment-based Karhunen-Loève (KL) transform. In principle, the proposed search
method guarantees the same search results as a search performed without the
feature-dimension reduction. Experimental results indicate
significant improvements in search speed. For example, the proposed method
reduced the total search time to approximately 1/12 that of previous methods
and detected queries in approximately 0.3 seconds from a 200-hour audio
database. Comment: 20 pages, to appear in IEEE Transactions on Audio, Speech and
Language Processing
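To make the idea of a piecewise linear representation concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a 2-D feature trajectory and caller-supplied segment boundaries (the abstract describes a dynamic segmentation, which is not reproduced here). Each segment is reduced to one coefficient per frame by projecting onto its principal axis, which is a KL transform of that segment alone. The function names `kl_axis_2d`, `reduce_segment`, `reconstruct` and `reduce_trajectory` are hypothetical.

```python
import math


def kl_axis_2d(points):
    """Return (mean, unit principal axis) of a list of 2-D points.

    For a 2x2 covariance matrix the principal direction has the closed form
    theta = 0.5 * atan2(2 * sxy, sxx - syy).
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def reduce_segment(points):
    """Project one segment onto its own principal axis (2-D -> 1-D)."""
    mean, axis = kl_axis_2d(points)
    coeffs = [(p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1]
              for p in points]
    return mean, axis, coeffs


def reconstruct(mean, axis, coeffs):
    """Map 1-D coefficients back to approximate 2-D feature vectors."""
    return [(mean[0] + c * axis[0], mean[1] + c * axis[1]) for c in coeffs]


def reduce_trajectory(trajectory, boundaries):
    """Apply the per-segment reduction; boundaries are assumed given."""
    return [reduce_segment(trajectory[start:end])
            for start, end in zip(boundaries, boundaries[1:])]


# Two straight pieces: each is captured exactly by its own 1-D subspace.
trajectory = [(float(t), 2.0 * t) for t in range(5)] + \
             [(5.0 + t, 10.0 - t) for t in range(5)]
reduced = reduce_trajectory(trajectory, [0, 5, 10])
```

A single global projection could not represent both pieces without error, which is the motivation for fitting a separate subspace per segment.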
Segmentation and semantic labelling of RGBD data with convolutional neural networks and surface fitting
We present an approach for segmentation and semantic labelling of RGBD data that jointly exploits geometrical cues and deep learning techniques. An initial over-segmentation is performed using spectral clustering, and a set of non-uniform rational B-spline surfaces is fitted to the extracted segments. A convolutional neural network (CNN) then takes as input colour and geometry data together with the surface fitting parameters. The network is made of nine convolutional stages followed by a softmax classifier and produces a vector of descriptors for each sample. In the next step, an iterative merging algorithm recombines the output of the over-segmentation into larger regions matching the various elements of the scene. Pairs of adjacent segments with higher similarity according to the CNN features are candidates for merging, and the surface fitting accuracy is used to detect which pairs of segments belong to the same surface. Finally, a set of labelled segments is obtained by combining the segmentation output with the descriptors from the CNN. Experimental results show that the proposed approach outperforms state-of-the-art methods and provides an accurate segmentation and labelling.
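The iterative merging step can be illustrated with a small Python sketch. This is an assumption-laden illustration, not the authors' algorithm: cosine similarity of averaged descriptors stands in for the CNN-based similarity, the threshold `min_similarity` is invented, and `accept_merge` is a hypothetical callback standing in for the surface-fitting accuracy check.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean_descriptor(members, descriptors):
    """Average the descriptors of the original segments in a region."""
    dim = len(next(iter(descriptors.values())))
    return [sum(descriptors[m][i] for m in members) / len(members)
            for i in range(dim)]


def merge_segments(descriptors, adjacent_pairs, accept_merge,
                   min_similarity=0.9):
    """Greedily merge the most similar adjacent regions.

    accept_merge(region_a, region_b) is a hypothetical stand-in for the
    surface-fitting test; it receives two sets of original segment ids.
    """
    regions = {s: {s} for s in descriptors}   # region id -> member segments
    owner = {s: s for s in descriptors}       # segment -> current region id
    rejected = set()                          # region pairs that failed the test
    while True:
        best = None
        for a, b in adjacent_pairs:
            ra, rb = owner[a], owner[b]
            if ra == rb or frozenset((ra, rb)) in rejected:
                continue
            sim = cosine(mean_descriptor(regions[ra], descriptors),
                         mean_descriptor(regions[rb], descriptors))
            if sim >= min_similarity and (best is None or sim > best[0]):
                best = (sim, ra, rb)
        if best is None:
            break
        _, ra, rb = best
        if accept_merge(regions[ra], regions[rb]):
            regions[ra] |= regions.pop(rb)
            for s in regions[ra]:
                owner[s] = ra
            # The merged region changed, so earlier rejections are stale.
            rejected = {k for k in rejected if ra not in k and rb not in k}
        else:
            rejected.add(frozenset((ra, rb)))
    return sorted(sorted(r) for r in regions.values())


# Toy data: segments 0 and 1 look alike, as do 2 and 3.
toy_descriptors = {0: [1.0, 0.0], 1: [0.99, 0.1], 2: [0.0, 1.0], 3: [0.05, 1.0]}
toy_pairs = [(0, 1), (1, 2), (2, 3)]
merged = merge_segments(toy_descriptors, toy_pairs, lambda a, b: True)
```

Separating the ranking (similarity) from the acceptance test (the callback) mirrors the two roles the abstract assigns to the CNN features and the surface fitting accuracy.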
Deep Learning for Vanishing Point Detection Using an Inverse Gnomonic Projection
We present a novel approach for vanishing point detection from uncalibrated
monocular images. In contrast to state-of-the-art methods, we make no a priori
assumptions about the observed scene. Our method is based on a convolutional
neural network (CNN) which does not use natural images, but a Gaussian sphere
representation arising from an inverse gnomonic projection of lines detected in
an image. This allows us to rely on synthetic data for training, eliminating
the need for labelled images. Our method achieves competitive performance on
three horizon estimation benchmark datasets. We further highlight some
additional use cases for which our vanishing point detection algorithm can be
used. Comment: Accepted for publication at German Conference on Pattern Recognition
(GCPR) 2017. This research was supported by German Research Foundation DFG
within Priority Research Programme 1894 "Volunteered Geographic Information:
Interpretation, Visualisation and Social Computing"
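The geometry behind the Gaussian sphere representation can be sketched in a few lines of Python. This is only the classical projective construction, not the paper's CNN: an image line, lifted with an assumed focal length `focal` (a placeholder, since the paper targets uncalibrated images), spans a plane through the projection centre and so maps to a great circle on the unit sphere, identified here by its unit normal. Great circles of lines that share a vanishing point intersect in that point's direction. The function names are hypothetical.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normalize(v):
    """Scale a non-zero 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def line_to_great_circle(p, q, focal=1.0):
    """Unit normal of the great circle for the image line through p and q.

    p and q must be distinct image points; focal is an assumed value.
    """
    return normalize(cross((p[0], p[1], focal), (q[0], q[1], focal)))


def vanishing_point(n1, n2, focal=1.0, eps=1e-12):
    """Intersect two great circles and project back to the image plane.

    Returns None when the direction is parallel to the image plane,
    i.e. the vanishing point lies at infinity.
    """
    d = cross(n1, n2)
    if abs(d[2]) < eps:
        return None
    return (focal * d[0] / d[2], focal * d[1] / d[2])


# Two image lines that both pass through the point (2, 3).
circle_a = line_to_great_circle((0.0, 0.0), (1.0, 1.5))
circle_b = line_to_great_circle((0.0, 3.0), (1.0, 3.0))
vp = vanishing_point(circle_a, circle_b)
```

Working on the sphere keeps vanishing points at infinity representable as ordinary directions, which the bounded image plane cannot do.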
Component-wise modeling of articulated objects
We introduce a novel framework for modeling articulated objects based on the aspects of their components. By decomposing the object into components, we divide the problem into smaller modeling tasks. After obtaining 3D models for each component aspect by employing a shape deformation paradigm, we merge them together, forming the object components. The final model is obtained by assembling the components using an optimization scheme which fits the respective 3D models to the corresponding apparent contours in a reference pose. The results suggest that our approach can produce realistic 3D models of articulated objects in a reasonable time.