51,457 research outputs found
Music ranking techniques evaluated
In a music retrieval system, a user presents a piece of music as a query and the system must identify from a corpus of performances other pieces with a similar melody. Several techniques have been proposed for matching such queries to stored music. In previous work, we found that local alignment, a technique derived from bioinformatics, was more effective than the n-gram methods derived from information retrieval; other researchers have reported success with n-grams, but have not compared against local alignment. In this paper we explore a broader range of n-gram techniques, and test them with both manual queries and queries automatically extracted from MIDI files. Our experiments show that n-gram matching techniques can be as effective as local alignment; one highly effective technique is to simply count the number of n-grams in common between the query and the stored piece of music. N-grams are particularly effective for short queries and manual queries, while local alignment is superior for automatic queries
How Many Topics? Stability Analysis for Topic Models
Topic modeling refers to the task of discovering the underlying thematic
structure in a text corpus, where the output is commonly presented as a report
of the top terms appearing in each topic. Despite the diversity of topic
modeling algorithms that have been proposed, a common challenge in successfully
applying these techniques is the selection of an appropriate number of topics
for a given corpus. Choosing too few topics will produce results that are
overly broad, while choosing too many will result in the "over-clustering" of a
corpus into many small, highly-similar topics. In this paper, we propose a
term-centric stability analysis strategy to address this issue, the idea being
that a model with an appropriate number of topics will be more robust to
perturbations in the data. Using a topic modeling approach based on matrix
factorization, evaluations performed on a range of corpora show that this
strategy can successfully guide the model selection process.Comment: Improve readability of plots. Add minor clarification
An integrated ranking algorithm for efficient information computing in social networks
Social networks have ensured the expanding disproportion between the face of
WWW stored traditionally in search engine repositories and the actual ever
changing face of Web. Exponential growth of web users and the ease with which
they can upload contents on web highlights the need of content controls on
material published on the web. As definition of search is changing,
socially-enhanced interactive search methodologies are the need of the hour.
Ranking is pivotal for efficient web search as the search performance mainly
depends upon the ranking results. In this paper new integrated ranking model
based on fused rank of web object based on popularity factor earned over only
valid interlinks from multiple social forums is proposed. This model identifies
relationships between web objects in separate social networks based on the
object inheritance graph. Experimental study indicates the effectiveness of
proposed Fusion based ranking algorithm in terms of better search results.Comment: 14 pages, International Journal on Web Service Computing (IJWSC),
Vol.3, No.1, March 201
Sequential Complexity as a Descriptor for Musical Similarity
We propose string compressibility as a descriptor of temporal structure in
audio, for the purpose of determining musical similarity. Our descriptors are
based on computing track-wise compression rates of quantised audio features,
using multiple temporal resolutions and quantisation granularities. To verify
that our descriptors capture musically relevant information, we incorporate our
descriptors into similarity rating prediction and song year prediction tasks.
We base our evaluation on a dataset of 15500 track excerpts of Western popular
music, for which we obtain 7800 web-sourced pairwise similarity ratings. To
assess the agreement among similarity ratings, we perform an evaluation under
controlled conditions, obtaining a rank correlation of 0.33 between intersected
sets of ratings. Combined with bag-of-features descriptors, we obtain
performance gains of 31.1% and 10.9% for similarity rating prediction and song
year prediction. For both tasks, analysis of selected descriptors reveals that
representing features at multiple time scales benefits prediction accuracy.Comment: 13 pages, 9 figures, 8 tables. Accepted versio
- …