36,232 research outputs found
Recommended from our members
Factors in human recognition of timbre lexicons generated by data clustering
Since the development of sound recording technologies, the palette of sound timbres available for music creation was extended way beyond traditional musical instruments. The organization and categorization of timbre has been a common endeavor. The availability of large databases of sound clips provides an opportunity for obtaining datadriven timbre categorizations via content-based clustering. In this article we describe an experiment aimed at understanding what factors influence the process of learning a given clustering of sound samples. We clustered a large database of short sound clips, and analyzed the success of participants in assigning sounds to the “correct” clusters after listening to a few examples of each. The results of the experiment suggest a number of relevant factors related both to the strategies followed by users and to the quality measures of the clustering solution, which can guide the design of creative applications based on audio clip clustering
Deep Clustering and Conventional Networks for Music Separation: Stronger Together
Deep clustering is the first method to handle general audio separation
scenarios with multiple sources of the same type and an arbitrary number of
sources, performing impressively in speaker-independent speech separation
tasks. However, little is known about its effectiveness in other challenging
situations such as music source separation. Contrary to conventional networks
that directly estimate the source signals, deep clustering generates an
embedding for each time-frequency bin, and separates sources by clustering the
bins in the embedding space. We show that deep clustering outperforms
conventional networks on a singing voice separation task, in both matched and
mismatched conditions, even though conventional networks have the advantage of
end-to-end training for best signal approximation, presumably because its more
flexible objective engenders better regularization. Since the strengths of deep
clustering and conventional network architectures appear complementary, we
explore combining them in a single hybrid network trained via an approach akin
to multi-task learning. Remarkably, the combination significantly outperforms
either of its components.Comment: Published in ICASSP 201
Graph-RAT: Combining data sources in music recommendation systems
The complexity of music recommendation systems has increased rapidly in recent years, drawing upon different sources of information: content analysis, web-mining, social tagging, etc. Unfortunately, the tools to scientifically evaluate such integrated systems are not readily available; nor are the base algorithms available. This article describes Graph-RAT (Graph-based Relational Analysis Toolkit), an open source toolkit that provides a framework for developing and evaluating novel hybrid systems. While this toolkit is designed for music recommendation, it has applications outside its discipline as well. An experiment—indicative of the sort of procedure that can be configured using the toolkit—is provided to illustrate its usefulness
Microtiming patterns and interactions with musical properties in Samba music
In this study, we focus on the interaction between microtiming patterns and several musical properties: intensity, meter and spectral characteristics. The data-set of 106 musical audio excerpts is processed by means of an auditory model and then divided into several spectral regions and metric levels. The resulting segments are described in terms of their musical properties, over which patterns of peak positions and their intensities are sought. A clustering algorithm is used to systematize the process of pattern detection. The results confirm previously reported anticipations of the third and fourth semiquavers in a beat. We also argue that these patterns of microtiming deviations interact with different profiles of intensities that change according to the metrical structure and spectral characteristics. In particular, we suggest two new findings: (i) a small delay of microtiming positions at the lower end of the spectrum on the first semiquaver of each beat and (ii) systematic forms of accelerando and ritardando at a microtiming level covering two-beat and four-beat phrases. The results demonstrate the importance of multidimensional interactions with timing aspects of music. However, more research is needed in order to find proper representations for rhythm and microtiming aspects in such contexts
A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design
Audio fingerprinting, also named as audio hashing, has been well-known as a
powerful technique to perform audio identification and synchronization. It
basically involves two major steps: fingerprint (voice pattern) design and
matching search. While the first step concerns the derivation of a robust and
compact audio signature, the second step usually requires knowledge about
database and quick-search algorithms. Though this technique offers a wide range
of real-world applications, to the best of the authors' knowledge, a
comprehensive survey of existing algorithms appeared more than eight years ago.
Thus, in this paper, we present a more up-to-date review and, for emphasizing
on the audio signal processing aspect, we focus our state-of-the-art survey on
the fingerprint design step for which various audio features and their
tractable statistical models are discussed.Comment: http://www.iaria.org/conferences2015/PATTERNS15.html ; Seventh
International Conferences on Pervasive Patterns and Applications (PATTERNS
2015), Mar 2015, Nice, Franc
Diffusion map for clustering fMRI spatial maps extracted by independent component analysis
Functional magnetic resonance imaging (fMRI) produces data about activity
inside the brain, from which spatial maps can be extracted by independent
component analysis (ICA). In datasets, there are n spatial maps that contain p
voxels. The number of voxels is very high compared to the number of analyzed
spatial maps. Clustering of the spatial maps is usually based on correlation
matrices. This usually works well, although such a similarity matrix inherently
can explain only a certain amount of the total variance contained in the
high-dimensional data where n is relatively small but p is large. For
high-dimensional space, it is reasonable to perform dimensionality reduction
before clustering. In this research, we used the recently developed diffusion
map for dimensionality reduction in conjunction with spectral clustering. This
research revealed that the diffusion map based clustering worked as well as the
more traditional methods, and produced more compact clusters when needed.Comment: 6 pages. 8 figures. Copyright (c) 2013 IEEE. Published at 2013 IEEE
International Workshop on Machine Learning for Signal Processin
- …