Non-negative matrix factorization with sparseness constraints
Non-negative matrix factorization (NMF) is a recently developed technique for
finding parts-based, linear representations of non-negative data. Although it
has successfully been applied in several applications, it does not always
result in parts-based representations. In this paper, we show how explicitly
incorporating the notion of `sparseness' improves the found decompositions.
Additionally, we provide complete MATLAB code both for standard NMF and for our
extension. Our hope is that this will further the application of these methods
to solving novel data-analysis problems.
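
As a rough illustration of the idea in this abstract, the following Python sketch factors a non-negative matrix with standard multiplicative updates and measures how sparse a vector is. It is not the authors' MATLAB code. The function names `nmf` and `sparseness`, the iteration count, and the choice of a normalized L1/L2 ratio as the sparseness measure are assumptions made for this example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (m x n) as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius error.
    `eps` guards against division by zero (an implementation choice).
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def sparseness(x):
    """Normalized L1/L2 sparseness: 1 for a single non-zero entry,
    0 when all entries are equal. Undefined for the zero vector."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum())
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V = rng.random((20, 3)) @ rng.random((3, 30))  # synthetic non-negative data
    W, H = nmf(V, rank=3)
    print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
    print("mean column sparseness of W:",
          np.mean([sparseness(W[:, k]) for k in range(W.shape[1])]))
```

Plain NMF gives no control over the sparseness values printed at the end; a sparseness-constrained variant would additionally force each column of `W` or row of `H` toward a chosen target value after every update.
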
Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning
This paper proposes an effective modelling of sound event spectra with a
hidden data-size imbalance, for improved Acoustic Event Detection (AED). The
proposed method models each event as an aggregated representation of a few
latent factors, while conventional approaches try to find acoustic elements
directly from the event spectra. In the method, all the latent factors across
all events are assigned comparable importance and complexity to overcome the
hidden imbalance of data sizes in event spectra. To extract latent factors in
each event, the proposed method employs clustering, applies non-negative
matrix factorization to each latent factor, and learns its acoustic elements as
a sub-dictionary. Separate sub-dictionary learning effectively models the
acoustic elements with limited data sizes and avoids over-fitting due to hidden
imbalances in training data. For the task of polyphonic sound event detection
from the DCASE 2013 challenge, an AED system based on the proposed modelling
achieves a detection F-measure of 46.5%, a significant improvement of more than
19% compared with the existing state-of-the-art methods.
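
The cluster-then-factorize pipeline described in this abstract can be sketched roughly as follows. This is a minimal sketch and not the paper's implementation: it assumes plain k-means for the clustering step, multiplicative-update NMF for each cluster, and the same number of atoms per cluster as one simple way of giving every latent factor comparable complexity. All function and parameter names are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Cluster the columns of X (features x frames); returns one label per column."""
    rng = np.random.default_rng(seed)
    centers = X[:, rng.choice(X.shape[1], size=k, replace=False)].copy()
    labels = np.zeros(X.shape[1], dtype=int)
    for _ in range(n_iter):
        # Squared distance from every frame to every center: shape (k, frames).
        d = ((X[:, None, :] - centers[:, :, None]) ** 2).sum(axis=0)
        labels = d.argmin(axis=0)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[:, j] = X[:, labels == j].mean(axis=1)
    return labels

def nmf_basis(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Return the non-negative basis W of a multiplicative-update NMF of V."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W

def learn_sub_dictionaries(event_spectra, n_factors, atoms_per_factor):
    """Cluster the frames of one event, learn one sub-dictionary per cluster,
    and concatenate the sub-dictionaries column-wise."""
    labels = kmeans(event_spectra, n_factors)
    subs = []
    for j in range(n_factors):
        frames = event_spectra[:, labels == j]
        if frames.shape[1] == 0:
            continue  # skip empty clusters
        # Same atom count per cluster, regardless of how many frames it holds.
        subs.append(nmf_basis(frames, atoms_per_factor))
    return np.hstack(subs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spectra = np.abs(rng.normal(size=(16, 60)))  # stand-in for magnitude spectra
    D = learn_sub_dictionaries(spectra, n_factors=3, atoms_per_factor=4)
    print(D.shape)
```

Because every cluster receives the same number of atoms, a factor with few frames is not crowded out by one with many, which is the intuition behind treating the latent factors as comparably important.
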
Structure from Articulated Motion: Accurate and Stable Monocular 3D Reconstruction without Training Data
Recovery of articulated 3D structure from 2D observations is a challenging
computer vision problem with many applications. Current learning-based
approaches achieve state-of-the-art accuracy on public benchmarks but are
restricted to specific types of objects and motions covered by the training
datasets. Model-based approaches do not rely on training data but show lower
accuracy on these datasets. In this paper, we introduce a model-based method
called Structure from Articulated Motion (SfAM), which can recover multiple
object and motion types without training on extensive data collections. At the
same time, it performs on par with learning-based state-of-the-art approaches
on public benchmarks and outperforms previous non-rigid structure from motion
(NRSfM) methods. SfAM is built upon a general-purpose NRSfM technique while
integrating a soft spatio-temporal constraint on the bone lengths. We use an
alternating optimization strategy to recover optimal geometry (i.e., bone
proportions) together with 3D joint positions by enforcing bone-length
consistency over a series of frames. SfAM is highly robust to noisy 2D
annotations, generalizes to arbitrary objects and does not rely on training
data, which is shown in extensive experiments on public benchmarks and real
video sequences. We believe that it brings a new perspective on the domain of
monocular 3D recovery of articulated structures, including human motion
capture.
Comment: 21 pages, 8 figures, 2 tables
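
To make the bone-length consistency idea concrete, the sketch below computes per-frame bone lengths from 3D joint positions and a simple penalty that is zero when every bone keeps the same length in all frames. The array layout, the function names, and the squared-deviation form of the penalty are assumptions for illustration; the paper's actual energy term may differ.

```python
import numpy as np

def bone_lengths(joints, bones):
    """joints: array (T, J, 3) of 3D joint positions over T frames.
    bones: list of (parent, child) joint-index pairs.
    Returns an array (T, B) of bone lengths per frame."""
    bones = np.asarray(bones)
    diff = joints[:, bones[:, 1]] - joints[:, bones[:, 0]]
    return np.linalg.norm(diff, axis=-1)

def bone_length_penalty(joints, bones):
    """Sum of squared deviations of each bone's length from its mean over time."""
    lengths = bone_lengths(joints, bones)
    return float(((lengths - lengths.mean(axis=0)) ** 2).sum())

if __name__ == "__main__":
    pose = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]], dtype=float)
    bones = [(0, 1), (1, 2), (2, 3)]
    rigid = np.stack([pose + t for t in range(5)])  # pure translation per frame
    print(bone_length_penalty(rigid, bones))        # bone lengths never change
    stretched = rigid.copy()
    stretched[2, 3, 2] += 1.0                       # lengthen the last bone in frame 2
    print(bone_length_penalty(stretched, bones))
```

In an alternating scheme, a term like this would be minimized together with a 2D reprojection error, switching between updating the 3D joints and updating the shared bone proportions.
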
Structure Preserving Large Imagery Reconstruction
With the explosive growth of web-based cameras and mobile devices, billions
of photographs are uploaded to the internet. We can trivially collect a huge
number of photo streams for various goals, such as image clustering, 3D scene
reconstruction, and other big data applications. However, such tasks are not
easy because the retrieved photos can vary widely in their view
perspectives, resolutions, lighting, noise, and distortions.
Furthermore, with occlusion by unexpected objects such as people and vehicles,
it is even more challenging to find feature correspondences and reconstruct
realistic scenes. In this paper, we propose a structure-based image completion
algorithm for object removal that produces visually plausible content with
consistent structure and scene texture. We use an edge matching technique to
infer the potential structure of the unknown region. Driven by the estimated
structure, texture synthesis is performed automatically along the estimated
curves. We evaluate the proposed method on different types of images, from
highly structured indoor environments to natural scenes. Our experimental
results demonstrate satisfactory performance that can be potentially used for
subsequent big data processing, such as image localization, object retrieval,
and scene reconstruction. Our experiments show that this approach achieves
favorable results that outperform existing state-of-the-art techniques.