Music Information Retrieval: An Inspirational Guide to Transfer from Related Disciplines
The emerging field of Music Information Retrieval (MIR) has been influenced by neighboring domains in signal processing and machine learning, including automatic speech recognition, image processing, and text information retrieval. In this contribution, we start with concrete examples of methodology transfer between speech and music processing, organized around the building blocks of pattern recognition: preprocessing, feature extraction, and classification/decoding. We then take a higher-level viewpoint when describing sources of mutual inspiration derived from text and image information retrieval. We conclude that dealing with the peculiarities of music in MIR research has contributed to advancing the state of the art in other fields, and that many future challenges in MIR are strikingly similar to those that other research areas have been facing.
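The three building blocks named above (preprocessing, feature extraction, and classification/decoding) can be sketched as a toy pipeline; the framing parameters, band-energy features, and nearest-centroid classifier below are illustrative choices, not taken from the paper:

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Preprocessing: slice a 1-D signal into overlapping frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def extract_features(frames):
    """Feature extraction: log spectral energy in four coarse bands."""
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bands = np.array_split(spec, 4, axis=1)
    return np.log(np.stack([b.sum(axis=1) for b in bands], axis=1) + 1e-10)

def classify(feat, centroids):
    """Classification: nearest centroid over the clip's mean feature vector."""
    v = feat.mean(axis=0)
    return int(np.argmin([np.linalg.norm(v - c) for c in centroids]))

# Toy demo: tell a low sine tone apart from white noise.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 220 * t)
noise = np.random.default_rng(0).standard_normal(sr)

centroids = [extract_features(frame_signal(s)).mean(axis=0) for s in (tone, noise)]
label = classify(extract_features(frame_signal(tone)), centroids)  # 0 = tone
```

The same three-stage shape reappears, with far richer components, in both speech and music systems, which is why methodology transfers so readily between them.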
Automatic music transcription: challenges and future directions
Automatic music transcription is considered by many to be a key enabling technology in music signal processing. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse the limitations of current methods and identify promising directions for future research. Current transcription methods use general-purpose models which are unable to capture the rich diversity found in music signals. One way to overcome the limited performance of transcription systems is to tailor algorithms to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large-scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information from multiple algorithms and different musical aspects.
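The forced alignment of audio to scores mentioned above is commonly implemented with dynamic time warping. A minimal sketch under simplifying assumptions (symbolic pitch sequences stand in for real audio features; the note values and costs are invented for illustration):

```python
import numpy as np

def dtw_path(cost):
    """Accumulate DTW cost over a local-distance matrix, then backtrack
    the optimal alignment path."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

# Score notes (MIDI numbers) vs. pitch estimates from audio frames; the
# performance holds some notes over several frames.
score = np.array([60, 62, 64, 65])
audio = np.array([60, 60, 62, 64, 64, 64, 65])
cost = np.abs(score[:, None] - audio[None, :]).astype(float)
path = dtw_path(cost)  # each pair (i, j) aligns score note i with audio frame j
```

Once each audio frame is paired with a score note, the (note, time) pairs become labelled training data, which is the large-scale opportunity the abstract points to.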
Sequential decision making in artificial musical intelligence
Over the past 60 years, artificial intelligence has grown from a largely academic field of research to a ubiquitous array of tools and approaches used in everyday technology. Despite its many recent successes and growing prevalence, certain meaningful facets of computational intelligence have not been as thoroughly explored. Such additional facets cover a wide array of complex mental tasks which humans carry out easily, yet are difficult for computers to mimic. A prime example of a domain in which human intelligence thrives, but machine understanding is still fairly limited, is music. Over the last decade, many researchers have applied computational tools to carry out tasks such as genre identification, music summarization, music database querying, and melodic segmentation. While these are all useful algorithmic solutions, we are still a long way from constructing complete music agents, able to mimic (at least partially) the complexity with which humans approach music. One key aspect that has not been sufficiently studied is that of sequential decision making in musical intelligence. This thesis strives to answer the following question: Can a sequential decision making perspective guide us in the creation of better music agents, and social agents in general? And if so, how? More specifically, this thesis focuses on two aspects of musical intelligence: music recommendation and human-agent (and more generally agent-agent) interaction in the context of music. The key contributions of this thesis are the design of better music playlist recommendation algorithms; the design of algorithms for tracking user preferences over time; new approaches for modeling people's behavior in situations that involve music; and the design of agents capable of meaningful interaction with humans and other agents in a setting where music plays a role (either directly or indirectly).
Though motivated primarily by music-related tasks, and focusing largely on people's musical preferences, this thesis also establishes that insights from music-specific case studies can be applicable in other concrete social domains, such as different types of content recommendation. Showing the generality of insights from musical data in other contexts serves as evidence for the utility of music domains as testbeds for the development of general artificial intelligence techniques. Ultimately, this thesis demonstrates the overall usefulness of taking a sequential decision making approach in settings previously unexplored from this perspective.
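One concrete way to frame playlist recommendation with drifting preferences as sequential decision making is a discounted epsilon-greedy bandit. This is purely illustrative, not one of the algorithms developed in the thesis; the discounting trick makes old feedback fade so the agent can track a listener whose taste changes mid-session:

```python
import numpy as np

rng = np.random.default_rng(1)

class DiscountedBandit:
    """Epsilon-greedy song selection whose evidence decays over time,
    so value estimates track preferences that drift during a session."""

    def __init__(self, n_songs, eps=0.1, decay=0.95):
        self.values = np.zeros(n_songs)
        self.counts = np.zeros(n_songs)
        self.eps, self.decay = eps, decay

    def pick(self):
        if rng.random() < self.eps:
            return int(rng.integers(len(self.values)))  # explore
        return int(np.argmax(self.values))              # exploit

    def update(self, song, reward):
        # Discount old evidence so recent feedback dominates.
        self.counts *= self.decay
        self.counts[song] += 1
        self.values[song] += (reward - self.values[song]) / self.counts[song]

# Simulated listener who likes song 2 for the first half of the session,
# then switches to song 0; the agent follows the switch.
agent = DiscountedBandit(n_songs=3)
for step in range(2000):
    liked = 2 if step < 1000 else 0
    song = agent.pick()
    agent.update(song, 1.0 if song == liked else 0.0)
```

After the simulated preference switch, the decayed counts let new rewards move the value estimates quickly, so the agent ends up favoring song 0.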
A Functional Taxonomy of Music Generation Systems
Digital advances have transformed the face of automatic music generation
since its beginnings at the dawn of computing. Despite the many breakthroughs,
issues such as the musical tasks targeted by different machines and the degree
to which they succeed remain open questions. We present a functional taxonomy
for music generation systems with reference to existing systems. The taxonomy
organizes systems according to the purposes for which they were designed. It
also reveals the inter-relatedness amongst the systems. This design-centered
approach contrasts with predominant methods-based surveys and facilitates the
identification of grand challenges to set the stage for new breakthroughs.
Machine listening, musicological analysis and the creative process : the production of song-based music influenced by augmented listening techniques
The creative process of making recordings in the popular music sphere is impossible to disconnect from the concept of influence. Whether practitioners are influenced consciously or subconsciously, and push toward or away from their influences, they are shaped by the music they hear. This research-led practice project augments the influencing factors in the creation of an album of song-based music by foregrounding the listening process. This is approached by conducting an in-depth analysis of a set of tracks using a combined methodology that integrates traditional popular music analysis techniques with music information retrieval (MIR) tools. My methodology explores the novel applicability of these computational tools in a musicological context, with one goal being to show the value of machine listening in popular musicological research and in the processes of composition and production. The emerging field of Digital Musicology takes advantage of big data and statistical analysis to allow for large-scale observation and comparison of datasets in a way that would be unrealistic for one person to attempt without the aid of machine listening. Setting aside the intention and reception components of musicology, the focus on musical output suggests that a feature-based approach is well suited to the task of investigating these methods. Utilising the musical ideas generated through the combined analysis of tracks compiled from the Billboard Alternative year-end charts of 2011-2015, the songs written and recordings produced for the album Stay Still | Please Hear are the result of allowing a conscious subversion of my usual creative process through the expansion of my field of musical influence.
This discursive component shows the development of my combined analysis methodology, highlights the points where creative influence occurred in the arrangement and production of Stay Still | Please Hear, emphasises the value of MIR tools in expanding the scope of musicological analysis, and demonstrates a unique approach to the development of artistic practice from the perspective of a creative practitioner
Biomechanical Modelling of Musical Performance: A Case Study of the Guitar
Computer-generated musical performances are often criticised for being unable
to match the expressivity found in performances by humans. Much research
has been conducted in the past two decades in order to create computer
technology able to perform a given piece of music as expressively as humans,
largely without success. Two approaches have been often adopted to research
into modelling expressive music performance on computers. The first focuses
on sound; that is, on modelling patterns of deviations between a recorded
human performance and the music score. The second focuses on modelling the
cognitive processes involved in a musical performance. Both approaches are
valid and can complement each other. In this thesis we propose a third
complementary approach, focusing on the guitar, which concerns the physical
manipulation of the instrument by the performer: a biomechanical approach.
The essence of this thesis is a study on capturing, analyzing and modelling
information about motor and biomechanical processes of guitar performance.
The focus is on speed, precision, and force of a guitarist's left-hand. The
overarching questions behind our study are:
1) Do unintentional actions originating from motor and biomechanical
functions during musical performance contribute a material "human feel"
to the performance?
2) Would it be possible to determine and quantify such unintentional actions?
3) Would it be possible to model and embed such information in a computer
system?
The contributions to knowledge pursued in this thesis include:
a) An unprecedented study of guitar mechanics, ergonomics, and
playability;
b) A detailed study of how the human body performs actions when playing
the guitar;
c) A methodologyt o formally record quantifiable data about such actionsin
performance;
d) An approach to model such information, and
e) A demonstration of how the above knowledge can be embedded in a
system for music performance.
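The speed and precision of a guitarist's left hand, the focus named above, is classically quantified with Fitts' law, which predicts movement time from the distance travelled and the width of the target. A hypothetical numeric illustration (the coefficients and measurements are invented, not taken from the thesis):

```python
import math

def fitts_mt(a, b, distance, width):
    """Fitts' law: movement time grows with the index of difficulty
    log2(2D/W), capturing the speed/precision trade-off."""
    return a + b * math.log2(2 * distance / width)

# Hypothetical left-hand position shift: 90 mm along the neck to land on
# a 20 mm-wide target region, with illustrative coefficients
# a = 0.1 s (reaction overhead) and b = 0.15 s/bit.
mt = fitts_mt(0.1, 0.15, 90.0, 20.0)
```

Fitting such a model to motion-capture data is one way quantifiable biomechanical regularities, as in contribution (c), could be expressed.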
Pitch-Informed Solo and Accompaniment Separation
This thesis addresses the development of a system for pitch-informed solo
and accompaniment separation capable of separating main instruments from
music accompaniment regardless of the musical genre of the track, or type
of music accompaniment. For the solo instrument, only pitched monophonic
instruments were considered in a single-channel scenario where no panning
or spatial location information is available.
In the proposed method, pitch information is used as an initial stage of a
sinusoidal modeling approach that attempts to estimate the spectral
information of the solo instrument from a given audio mixture. Instead of
estimating the solo instrument on a frame by frame basis, the proposed
method gathers information of tone objects to perform separation.
Tone-based processing allowed the inclusion of novel processing stages for
attack refinement, transient interference reduction, common amplitude
modulation (CAM) of tone objects, and for better estimation of non-harmonic
elements that can occur in musical instrument tones. The proposed solo and
accompaniment algorithm is an efficient method suitable for real-world
applications.
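The core idea of pitch-informed spectral masking can be sketched in a single analysis frame. The signal frequencies, mask width, and hard binary mask below are illustrative simplifications (bin-centred frequencies avoid leakage), not the thesis's tone-object model:

```python
import numpy as np

sr, n = 8000, 4096
t = np.arange(n) / sr
# Toy "solo": known f0 = 250 Hz plus one overtone; toy "accompaniment":
# a 625 Hz partial. All frequencies sit exactly on FFT bins.
solo = np.sin(2 * np.pi * 250 * t) + 0.5 * np.sin(2 * np.pi * 500 * t)
accomp = 0.8 * np.sin(2 * np.pi * 625 * t)
mix = solo + accomp

# Pitch-informed mask: keep only bins near harmonics of the known f0.
spec = np.fft.rfft(mix)
freqs = np.fft.rfftfreq(n, 1 / sr)
f0, width = 250.0, 15.0
mask = np.zeros(freqs.shape, dtype=bool)
for h in range(1, int(freqs[-1] // f0) + 1):
    mask |= np.abs(freqs - h * f0) < width

solo_est = np.fft.irfft(np.where(mask, spec, 0), n)     # harmonic bins only
accomp_est = np.fft.irfft(np.where(mask, 0, spec), n)   # everything else
```

Real mixtures overlap in frequency, so the proposed method's tone-object reasoning (attack refinement, CAM, non-harmonic components) replaces this hard per-bin split.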
A study was conducted to better model magnitude, frequency, and phase of
isolated musical instrument tones. As a result of this study, temporal
envelope smoothness, inharmonicity of musical instruments, and phase
expectation were exploited in the proposed separation method. Additionally,
an algorithm for harmonic/percussive separation based on phase expectation
was proposed. The algorithm shows improved perceptual quality with respect
to state-of-the-art methods for harmonic/percussive separation.
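The phase-expectation algorithm itself is not specified in this abstract. As a generic point of comparison, the widely used median-filtering approach to harmonic/percussive separation (Fitzgerald-style, not the method proposed here) exploits the same structural contrast: harmonic energy is smooth across time, percussive energy is smooth across frequency:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_1d(a, k, axis):
    """Median filter along one axis with edge padding (odd k)."""
    pad = [(0, 0)] * a.ndim
    pad[axis] = (k // 2, k // 2)
    padded = np.pad(a, pad, mode="edge")
    return np.median(sliding_window_view(padded, k, axis=axis), axis=-1)

def hpss(spec_mag, k=9):
    """Split a magnitude spectrogram (freq x time) into harmonic and
    percussive parts via median filtering, then binary-mask the original."""
    harm = median_filter_1d(spec_mag, k, axis=1)   # smooth across time
    perc = median_filter_1d(spec_mag, k, axis=0)   # smooth across frequency
    mask = harm >= perc
    return spec_mag * mask, spec_mag * ~mask

# Toy spectrogram: one sustained tone (a horizontal ridge) plus one
# click (a vertical ridge at a single time frame).
S = np.zeros((32, 64))
S[10, :] = 1.0    # harmonic: constant frequency over time
S[:, 20] = 1.0    # percussive: broadband at one instant
H, P = hpss(S)
```

Phase-based methods instead inspect how the phase advances from frame to frame, which the abstract reports gives better perceptual quality than such magnitude-only baselines.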
The proposed solo and accompaniment method obtained perceptual quality
scores comparable to other state-of-the-art algorithms under the SiSEC 2011
and SiSEC 2013 campaigns, and outperformed the comparison algorithm on the
instrumental dataset described in this thesis.
As a use-case of solo and
accompaniment separation, a listening test procedure was conducted to
assess separation quality requirements in the context of music education.
Results from the listening test showed that solo and accompaniment tracks
should be optimized differently to suit quality requirements of music
education. The Songs2See application was presented as commercial music
learning software which includes the proposed solo and accompaniment
separation method.
- …