847 research outputs found
Recommended from our members
The Variable Markov Oracle: Algorithms for Human Gesture Applications
This article introduces the Variable Markov Oracle (VMO) data structure for multivariate time series indexing. VMO can identify repetitive fragments and find sequential similarities between observations. VMO can also be viewed as a combination of online clustering algorithms with variable-order Markov constraints. The authors use VMO for gesture query-by-content and gesture following. A probabilistic interpretation of the VMO query-matching algorithm is proposed to find an analogy to the inference problem in a hidden Markov model (HMM). This probabilistic interpretation extends VMO to be not only a data structure but also a model for time series. Query-by-content experiments were conducted on a gesture database that was recorded using a Kinect 3D camera, showing state-of-the-art performance. The query-by-content experiments' results are compared to previous works using HMM and dynamic time warping. Gesture following is described in the context of an interactive dance environment that aims to integrate human movements with computer-generated graphics to create an augmented reality performance
A Comprehensive Trainable Error Model for Sung Music Queries
We propose a model for errors in sung queries, a variant of the hidden Markov
model (HMM). This is a solution to the problem of identifying the degree of
similarity between a (typically error-laden) sung query and a potential target
in a database of musical works, an important problem in the field of music
information retrieval. Similarity metrics are a critical component of
query-by-humming (QBH) applications which search audio and multimedia databases
for strong matches to oral queries. Our model comprehensively expresses the
types of error or variation between target and query: cumulative and
non-cumulative local errors, transposition, tempo and tempo changes,
insertions, deletions and modulation. The model is not only expressive, but
automatically trainable, or able to learn and generalize from query examples.
We present results of simulations, designed to assess the discriminatory
potential of the model, and tests with real sung queries, to demonstrate
relevance to real-world applications
On the Application of Generic Summarization Algorithms to Music
Several generic summarization algorithms were developed in the past and
successfully applied in fields such as text and speech summarization. In this
paper, we review and apply these algorithms to music. To evaluate this
summarization's performance, we adopt an extrinsic approach: we compare a Fado
Genre Classifier's performance using truncated contiguous clips against the
summaries extracted with those algorithms on 2 different datasets. We show that
Maximal Marginal Relevance (MMR), LexRank and Latent Semantic Analysis (LSA)
all improve classification performance in both datasets used for testing.Comment: 12 pages, 1 table; Submitted to IEEE Signal Processing Letter
Using Generic Summarization to Improve Music Information Retrieval Tasks
In order to satisfy processing time constraints, many MIR tasks process only
a segment of the whole music signal. This practice may lead to decreasing
performance, since the most important information for the tasks may not be in
those processed segments. In this paper, we leverage generic summarization
algorithms, previously applied to text and speech summarization, to summarize
items in music datasets. These algorithms build summaries, that are both
concise and diverse, by selecting appropriate segments from the input signal
which makes them good candidates to summarize music as well. We evaluate the
summarization process on binary and multiclass music genre classification
tasks, by comparing the performance obtained using summarized datasets against
the performances obtained using continuous segments (which is the traditional
method used for addressing the previously mentioned time constraints) and full
songs of the same original dataset. We show that GRASSHOPPER, LexRank, LSA,
MMR, and a Support Sets-based Centrality model improve classification
performance when compared to selected 30-second baselines. We also show that
summarized datasets lead to a classification performance whose difference is
not statistically significant from using full songs. Furthermore, we make an
argument stating the advantages of sharing summarized datasets for future MIR
research.Comment: 24 pages, 10 tables; Submitted to IEEE/ACM Transactions on Audio,
Speech and Language Processin
IDENTIFICATION OF COVER SONGS USING INFORMATION THEORETIC MEASURES OF SIMILARITY
13 pages, 5 figures, 4 tables. v3: Accepted version13 pages, 5 figures, 4 tables. v3: Accepted version13 pages, 5 figures, 4 tables. v3: Accepted versio
- …