Semi-Supervised First-Person Activity Recognition in Body-Worn Video
Body-worn cameras are now commonly used for logging daily life, sports, and
law enforcement activities, creating a large volume of archived footage. This
paper studies the problem of classifying frames of footage according to the
activity of the camera-wearer with an emphasis on application to real-world
police body-worn video. Real-world datasets pose a different set of challenges
from existing egocentric vision datasets: the amount of footage of different
activities is unbalanced, the data contains personally identifiable
information, and in practice it is difficult to provide substantial training
footage for a supervised approach. We address these challenges by extracting
features based exclusively on motion information and then segmenting the video
footage using a semi-supervised classification algorithm. On publicly available
datasets, our method achieves results comparable to, if not better than,
supervised and/or deep learning methods using a fraction of the training data.
It also shows promising results on real-world police body-worn video.
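As a rough illustration of semi-supervised frame classification from a handful of labeled frames, the sketch below uses plain label propagation on a Gaussian similarity graph. This is a stand-in technique, not the algorithm from the paper, and the function name `propagate_labels`, the `sigma` parameter, and the idea that each frame is already summarized by a small motion-feature vector are all assumptions made for the example.

```python
import numpy as np

def propagate_labels(features, labels, sigma=1.0, n_iter=100):
    """Label propagation sketch. `labels` holds class ids, with -1 for unlabeled frames.

    Hypothetical helper: a generic graph-based method, not the paper's classifier.
    """
    n = len(features)
    labeled = labels >= 0
    classes = np.unique(labels[labeled])

    # Gaussian similarity between every pair of per-frame feature vectors.
    d2 = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    T = W / W.sum(axis=1, keepdims=True)  # row-normalized transition matrix

    # One-hot scores for the labeled frames, zeros elsewhere.
    Y0 = np.zeros((n, len(classes)))
    Y0[labeled, np.searchsorted(classes, labels[labeled])] = 1.0

    F = Y0.copy()
    for _ in range(n_iter):
        F = T @ F                  # diffuse scores along the graph
        F[labeled] = Y0[labeled]   # clamp the known labels
    return classes[np.argmax(F, axis=1)]

# Toy usage: six "frames" with 1-D motion features, one label per activity.
features = np.array([[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]])
labels = np.array([0, -1, -1, 1, -1, -1])
predicted = propagate_labels(features, labels)
```

Only two of the six frames carry labels here; the remaining labels are inferred from similarity alone, which is the general idea behind using a fraction of the training data.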
A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design
Audio fingerprinting, also known as audio hashing, is well known as a
powerful technique for audio identification and synchronization. It
basically involves two major steps: fingerprint (voice pattern) design and
matching search. While the first step concerns the derivation of a robust and
compact audio signature, the second step usually requires knowledge about
database and quick-search algorithms. Though this technique offers a wide range
of real-world applications, to the best of the authors' knowledge, a
comprehensive survey of existing algorithms appeared more than eight years ago.
Thus, in this paper, we present a more up-to-date review and, to emphasize
the audio signal processing aspect, we focus our state-of-the-art survey on
the fingerprint design step for which various audio features and their
tractable statistical models are discussed.
Comment: http://www.iaria.org/conferences2015/PATTERNS15.html ; Seventh
International Conferences on Pervasive Patterns and Applications (PATTERNS
2015), Mar 2015, Nice, France
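The two steps named in this abstract, fingerprint design and matching search, can be sketched in a few lines. The example below is only one simple possibility, assuming a binary fingerprint built from the sign of band-energy differences and a brute-force search by bit error rate; the function names, frame sizes, and band count are illustrative choices, not taken from the survey.

```python
import numpy as np

def fingerprint(signal, frame_len=256, hop=128, n_bands=9):
    """Step 1 (design): one row of bits per frame transition.

    Assumed scheme: sign of band-energy differences across frequency and time.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    energies = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        energies[i] = [band.sum() for band in np.array_split(power, n_bands)]
    band_diff = energies[:, :-1] - energies[:, 1:]   # adjacent bands
    return (band_diff[1:] - band_diff[:-1]) > 0      # adjacent frames

def bit_error_rate(fp_a, fp_b):
    """Fraction of fingerprint bits that differ."""
    return float(np.mean(fp_a != fp_b))

def best_offset(query_fp, ref_fp):
    """Step 2 (matching): exhaustive search for the best-aligned frame offset."""
    n = len(query_fp)
    scores = [bit_error_rate(query_fp, ref_fp[o : o + n])
              for o in range(len(ref_fp) - n + 1)]
    best = int(np.argmin(scores))
    return best, scores[best]

# Toy usage: the query is an excerpt of the reference starting at frame 10.
rng = np.random.default_rng(0)
reference = rng.standard_normal(8000)
query = reference[128 * 10 : 128 * 10 + 2560]
offset, ber = best_offset(fingerprint(query), fingerprint(reference))
```

A real system would replace the exhaustive scan with an indexed lookup, which is the "database and quick-search" side that the abstract sets aside.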
Matching Natural Language Sentences with Hierarchical Sentence Factorization
Semantic matching of natural language sentences or identifying the
relationship between two sentences is a core research problem underlying many
natural language tasks. Depending on whether training data is available, prior
research has proposed both unsupervised distance-based schemes and supervised
deep learning schemes for sentence matching. However, previous approaches
either omit or fail to fully utilize the ordered, hierarchical, and flexible
structures of language objects, as well as the interactions between them. In
this paper, we propose Hierarchical Sentence Factorization, a technique to
factorize a sentence into a hierarchical representation, with the components at
each different scale reordered into a "predicate-argument" form. The proposed
sentence factorization technique leads to the invention of: 1) a new
unsupervised distance metric which calculates the semantic distance between a
pair of text snippets by solving a penalized optimal transport problem while
preserving the logical relationship of words in the reordered sentences, and 2)
new multi-scale deep learning models for supervised semantic training, based on
factorized sentence hierarchies. We apply our techniques to text-pair
similarity estimation and text-pair relationship classification tasks on
multiple datasets, including STSbenchmark, the Microsoft Research paraphrase
identification (MSRP) dataset, and the SICK dataset. Extensive experiments
show that the proposed hierarchical sentence factorization can be used to
significantly improve the performance of existing unsupervised distance-based
metrics as well as multiple supervised deep learning models based on the
convolutional neural network (CNN) and long short-term memory (LSTM).
Comment: Accepted by WWW 2018, 10 pages
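To make the idea of an optimal-transport distance between two text snippets concrete, here is a minimal sketch using entropy-regularized transport solved with Sinkhorn iterations. It is a generic substitute, not the penalized formulation or the sentence reordering proposed in the paper, and the word vectors, the uniform word weights, and the `reg` value are assumptions for illustration.

```python
import numpy as np

def sinkhorn_distance(X, Y, reg=0.1, n_iter=200):
    """Entropy-regularized transport cost between two sets of word vectors.

    Assumptions: uniform weight per word, Euclidean ground cost.
    Returns the cost and the transport plan.
    """
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    # Pairwise Euclidean distance between every word in X and every word in Y.
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)   # rescale to match the column marginals
        u = a / (K @ v)     # rescale to match the row marginals
    P = u[:, None] * K * v[None, :]
    return float(np.sum(P * C)), P

# Toy usage with made-up 3-D "word embeddings" for two short snippets.
rng = np.random.default_rng(0)
sentence_a = rng.normal(size=(4, 3))
sentence_b = sentence_a + 1.0          # every word shifted away from sentence_a
d_same, _ = sinkhorn_distance(sentence_a, sentence_a)
d_shifted, plan = sinkhorn_distance(sentence_a, sentence_b)
```

A snippet compared with itself should score far lower than a snippet compared with a shifted copy, which is the behavior a distance-based matching scheme relies on.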
Robust Principal Component Analysis on Graphs
Principal Component Analysis (PCA) is the most widely used tool for linear
dimensionality reduction and clustering. Still, it is highly sensitive to
outliers and does not scale well with respect to the number of data samples.
Robust PCA solves the first issue with a sparse penalty term. The second issue
can be handled with the matrix factorization model, which is, however,
non-convex. Besides, PCA-based clustering can also be enhanced by using a graph
of data similarity. In this article, we introduce a new model called "Robust
PCA on Graphs" which incorporates spectral graph regularization into the Robust
PCA framework. Our proposed model benefits from 1) the robustness of principal
components to occlusions and missing values, 2) enhanced low-rank recovery, 3)
improved clustering property due to the graph smoothness assumption on the
low-rank matrix, and 4) convexity of the resulting optimization problem.
Extensive experiments on 8 benchmark, 3 video and 2 artificial datasets with
corruptions clearly reveal that our model outperforms 10 other state-of-the-art
models in clustering and low-rank recovery tasks.
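The ingredients named in this abstract (a low-rank term, a sparse penalty, and a graph smoothness regularizer) can be sketched as follows. The objective written here, nuclear norm of L plus lam times the l1 norm of S plus gamma times tr(L Phi L^T), is an assumed form pieced together from the abstract's description, and the function names and the symbols `lam`, `gamma`, and `Phi` are illustrative rather than confirmed by the source.

```python
import numpy as np

def soft_threshold(M, tau):
    """Proximal operator of tau * l1 norm (promotes a sparse S)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def singular_value_threshold(M, tau):
    """Proximal operator of tau * nuclear norm (promotes a low-rank L)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def graph_laplacian(W):
    """Combinatorial Laplacian Phi = D - W of a symmetric sample-similarity graph."""
    return np.diag(W.sum(axis=1)) - W

def graph_smoothness(L, Phi):
    """tr(L Phi L^T): small when connected samples (columns of L) are similar."""
    return float(np.trace(L @ Phi @ L.T))

def objective(L, S, Phi, lam, gamma):
    """Assumed objective: ||L||_* + lam * ||S||_1 + gamma * tr(L Phi L^T)."""
    nuclear = np.linalg.svd(L, compute_uv=False).sum()
    return float(nuclear + lam * np.abs(S).sum() + gamma * graph_smoothness(L, Phi))

# Toy usage: two samples (columns) joined by one graph edge.
Phi = graph_laplacian(np.array([[0.0, 1.0], [1.0, 0.0]]))
L_smooth = np.array([[2.0, 2.0], [5.0, 5.0]])   # identical columns
smooth_penalty = graph_smoothness(L_smooth, Phi)
value = objective(L_smooth, np.zeros((2, 2)), Phi, lam=1.0, gamma=1.0)
```

Each term in this assumed objective is convex (the Laplacian is positive semidefinite), which is consistent with the convexity the abstract claims. The two thresholding functions are the standard building blocks that proximal solvers use for the sparse and low-rank terms.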