Video Data Visualization System: Semantic Classification And Personalization
We present in this paper an intelligent video data visualization tool, based
on semantic classification, for retrieving and exploring a large scale corpus
of videos. Our work is based on semantic classification resulting from semantic
analysis of video. The obtained classes will be projected in the visualization
space. The graph is represented by nodes and edges: the nodes are the keyframes
of video documents, and the edges are the relations between documents and the
classes of documents. Finally, we construct the user's profile, based on the
interaction with the system, to render the system more adequate to the user's
preferences.
Comment: graphic
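The graph and profile described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `VideoGraph`, the method names, the sample labels, and the idea of building the profile by counting clicks per class are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


class VideoGraph:
    """Toy graph: keyframe nodes linked to semantic-class nodes (hypothetical)."""

    def __init__(self):
        self.edges = defaultdict(set)  # class label -> ids of linked keyframes
        self.profile = Counter()       # class label -> number of user interactions

    def add_document(self, keyframe_id, classes):
        # One edge per (document keyframe, class) relation.
        for label in classes:
            self.edges[label].add(keyframe_id)

    def classes_of(self, keyframe_id):
        return sorted(c for c, docs in self.edges.items() if keyframe_id in docs)

    def record_click(self, keyframe_id):
        # Assumed profile update: each click counts toward the document's classes.
        for label in self.classes_of(keyframe_id):
            self.profile[label] += 1

    def ranked_classes(self):
        # Most-interacted classes first; ties are broken alphabetically.
        return sorted(self.edges, key=lambda c: (-self.profile[c], c))


graph = VideoGraph()
graph.add_document("video1_kf", ["sports", "outdoor"])
graph.add_document("video2_kf", ["news"])
graph.add_document("video3_kf", ["sports"])
graph.record_click("video3_kf")
graph.record_click("video1_kf")
ranking = graph.ranked_classes()
```

A visualization layer could use `ranking` to decide which classes to place most prominently, which is one simple way to read "personalization" here.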
A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design
Audio fingerprinting, also known as audio hashing, is well known as a
powerful technique for audio identification and synchronization. It
basically involves two major steps: fingerprint (voice pattern) design and
matching search. While the first step concerns the derivation of a robust and
compact audio signature, the second step usually requires knowledge about
database and quick-search algorithms. Though this technique offers a wide range
of real-world applications, to the best of the authors' knowledge, a
comprehensive survey of existing algorithms appeared more than eight years ago.
Thus, in this paper, we present a more up-to-date review and, to emphasize
the audio signal processing aspect, we focus our state-of-the-art survey on
the fingerprint design step, for which various audio features and their
tractable statistical models are discussed.
Comment: http://www.iaria.org/conferences2015/PATTERNS15.html ; Seventh
International Conferences on Pervasive Patterns and Applications (PATTERNS
2015), Mar 2015, Nice, France
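The two steps named in this abstract, fingerprint design and matching search, can be sketched with a deliberately simple toy. This is not one of the surveyed algorithms: the signature (the sign of the energy change between consecutive frames), the Hamming-distance search, and all function names are assumptions chosen to keep the example short.

```python
def fingerprint(samples, frame_size=4):
    """Toy signature: one bit per frame pair, set when the frame energy rises."""
    energies = [
        sum(x * x for x in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(query, database):
    """Linear search for the stored signature closest to the query."""
    return min(database, key=lambda name: hamming(query, database[name]))


rising = list(range(16))
falling = list(reversed(rising))
database = {"rising": fingerprint(rising), "falling": fingerprint(falling)}

# A louder copy of the rising signal: the gain changes, the energy trend does not.
query = fingerprint([2 * x for x in rising])
match = best_match(query, database)
```

Because the bits depend only on whether energy goes up or down, the signature survives the gain change, which hints at why robustness is a design goal for the first step. A real system would replace the linear search with an indexed lookup.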
On orthogonal projections for dimension reduction and applications in augmented target loss functions for learning problems
The use of orthogonal projections on high-dimensional input and target data
in learning frameworks is studied. First, we investigate the relations between
two standard objectives in dimension reduction, preservation of variance and of
pairwise relative distances. Investigations of their asymptotic correlation as
well as numerical experiments show that a projection usually does not satisfy
both objectives at once. In a standard classification problem we determine
projections on the input data that balance the objectives and compare
subsequent results. Next, we extend our application of orthogonal projections
to deep learning tasks and introduce a general framework of augmented target
loss functions. These loss functions integrate additional information via
transformations and projections of the target data. In two supervised learning
problems, clinical image segmentation and music information classification, the
application of our proposed augmented target loss functions increases the
accuracy
- …
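The two dimension-reduction objectives mentioned in this abstract, preservation of variance and of pairwise relative distances, can be made concrete with a small sketch. The specific measures used below (fraction of total variance retained, and the mean ratio of projected to original pairwise distance), the function names, and the four sample points are assumptions for illustration and may differ from the paper's definitions.

```python
import math
from itertools import combinations


def project(point, unit):
    """Coordinate of `point` along `unit`: orthogonal projection onto a line."""
    return sum(p * u for p, u in zip(point, unit))


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def retained_variance(points, unit):
    """Fraction of the total (per-coordinate) variance kept by the projection."""
    dims = range(len(points[0]))
    total = sum(variance([p[d] for p in points]) for d in dims)
    return variance([project(p, unit) for p in points]) / total


def mean_distance_ratio(points, unit):
    """Average of projected distance / original distance over all point pairs."""
    ratios = [
        abs(project(a, unit) - project(b, unit)) / math.dist(a, b)
        for a, b in combinations(points, 2)
    ]
    return sum(ratios) / len(ratios)


# Corners of a 2 x 1 rectangle, projected onto each coordinate axis.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
x_axis, y_axis = (1.0, 0.0), (0.0, 1.0)

var_x = retained_variance(points, x_axis)
var_y = retained_variance(points, y_axis)
dist_x = mean_distance_ratio(points, x_axis)
dist_y = mean_distance_ratio(points, y_axis)
```

Comparing the variance score and the distance score across candidate directions is one simple way to explore how the two objectives relate on a given data set.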