From Local to Global Analysis of Music Time Series
Local and increasingly global musical structure is analyzed from audio time series by time-series event analysis, with the aims of automatic sheet music production and the comparison of singers. Note events are determined and classified based on local spectra, and bar events are identified by rules based on accentuation events related to local energy. To compare the performances of different singers, global summary measures are defined that characterize the overall performance.
Segmentation of Identical and Simultaneously Played Traditional Music Instruments using Adaptive LMS
Mining musical ensembles has become crucial, since the information inside a musical ensemble is required by musical content services. In this research we take Gamelan, a traditional Indonesian music ensemble, as our research object. To indicate changes in Gamelan features (i.e. tempo and hammer-strike style), segmentation of the Gamelan instruments is required as a music-tagging tool. An adaptive LMS filter is employed to segment identical instruments that are played concurrently. The target is to find how many instruments are played at the same time or separated by a very short interval (≤ 1 ms). The experimental results demonstrate robust detection with 0.02 ms accuracy for segmenting identical, simultaneously played Gamelan instruments. These results are used to indicate changes in Gamelan features such as tempo and hammer-strike style.
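The adaptive LMS idea in this abstract can be illustrated with a minimal sketch: an LMS predictor tracks the signal sample by sample, and a spike in its prediction error marks a likely onset or segment boundary. All names, parameters, and the simple thresholding rule below are hypothetical simplifications, not the paper's actual implementation.

```python
def lms_segment(signal, order=4, mu=0.01, threshold=0.5):
    """Flag candidate segment boundaries where the LMS prediction
    error exceeds a threshold (illustrative sketch only)."""
    w = [0.0] * order            # adaptive filter weights
    boundaries = []
    for n in range(order, len(signal)):
        x = signal[n - order:n]                       # past samples
        y = sum(wi * xi for wi, xi in zip(w, x))      # LMS prediction
        e = signal[n] - y                             # prediction error
        for i in range(order):
            w[i] += mu * e * x[i]                     # LMS weight update
        if abs(e) > threshold:                        # abrupt change => boundary
            boundaries.append(n)
    return boundaries
```

An abrupt change in the signal (for example, a new instrument entering) makes the predictor's error jump, so the first flagged index marks the boundary.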
Tipping points: a saxophone led investigation into three continua
This portfolio contains seven musical projects which explore various permutations of three continua: improvisation to composition, acoustic to technologized, and the ‘in’ to ‘out’. As a saxophonist, composer, and collaborator, I investigate the continua with various techniques and approaches, aiming to create contrasting contemporary jazz albums free of stylistic restrictions.
The albums explore the improvisation-to-composition continuum via pieces that a) are freely improvised; b) contain written material used as a basis for improvisation; and c) transition between improvisation and composition in various ways. Aiming to enhance and augment acoustic saxophone performance, I explore various techniques for working with technology. Additionally, the commentary documents the overall use of technology, including techniques utilised by my collaborators. The ‘in’ to ‘out’ continuum is investigated via a range of melodic, rhythmic, and time-feel approaches. The resulting vocabulary is applied throughout the projects in different ways.
The projects include two Roller Trio albums (Fracture and New Devices) and a film soundtrack (Promise/Threat), two quartet projects (The Earthworm’s Eye View and Ikigai), a saxophone and piano duet (Pebbles), and a collection of solo saxophone and electronics pieces (Solo). The commentary includes context, general techniques, processes, and detailed analyses of three contrasting pieces.
The portfolio demonstrates one of the infinite outcomes the continua can inspire and is an example of how a vocabulary made up of this set of techniques can be applied across different contexts.
Analysis and resynthesis of polyphonic music
This thesis examines applications of Digital Signal Processing to the analysis, transformation, and resynthesis of musical audio. First I give an overview of the human perception of music. I then examine in detail the requirements for a system that can analyse, transcribe, process, and resynthesise monaural polyphonic music, and describe and compare the possible hardware and software platforms. After this I describe a prototype hybrid system that attempts to carry out these tasks using a method based on additive synthesis. Next I present results from its application to a variety of musical examples, and critically assess its performance and limitations. I then address these issues in the design of a second system based on Gabor wavelets. I conclude by summarising the research and outlining suggestions for future developments.
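A resynthesis stage based on additive synthesis, as mentioned in this abstract, reconstructs audio as a sum of sinusoids. A minimal sketch follows, with fixed-frequency, fixed-amplitude partials standing in for the time-varying partial tracks a real analysis stage would produce; the function name and parameters are hypothetical.

```python
import math

def additive_resynth(partials, duration, sr=44100):
    """Resynthesize audio as a sum of sinusoidal partials.
    partials: list of (frequency_hz, amplitude) pairs -- a static
    simplification of analysed partial tracks."""
    n = int(duration * sr)
    out = [0.0] * n
    for freq, amp in partials:
        phase_inc = 2 * math.pi * freq / sr   # radians per sample
        for i in range(n):
            out[i] += amp * math.sin(phase_inc * i)
    return out
```

In a full analysis/resynthesis system the frequencies and amplitudes would be interpolated per frame rather than held constant.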
Automatisierte Extraktion rhythmischer Merkmale zur Anwendung in Music Information Retrieval-Systemen
This thesis describes the automated extraction of features for the
description of the rhythmic content of musical audio signals. These
features are selected with respect to their applicability in music
information retrieval (MIR) systems.
While research on the automatic extraction of rhythmic features such as tempo and
time signature has been in progress for some time, current algorithms still seem
a long way from matching human recognition performance. Among the reasons for the
gap between a machine-listening system and a trained listener are human
cognition's use of information on different levels of abstraction and its use of
musical knowledge. The approach described here is guided by these two principles
of cognition.
In order to identify appropriate features and relevant aspects of human
processing of audio signals, the necessary background from musicology,
psychoacoustics and cognitive science is described. Subsequently, the
state-of-the-art review covers known methods for extracting rhythmic features
from musical audio signals. The main part of the thesis presents a collection of
machine-listening methods that evaluate information on different levels of
abstraction, and a compact representation of the metrical structure of musical
audio signals is proposed. The evaluation of low-level features permits only
minimal use of musical knowledge; on the other hand, the processing of
high-level features is prone to errors propagated from the extraction of this
information. This motivates the joint evaluation of low- and high-level
information, weighted by their reliability.
The extraction of rhythmic features from the output of automatically detected
percussive instruments represents a technical advance over the state-of-the-art.
Segmenting the audio signal into characteristic, similar regions (representing,
for example, verse or chorus) is introduced as a valuable pre-processing step,
and the resulting significant improvements in recognition rate are demonstrated
on real-world test data.
The performance of the developed methods is evaluated on a large corpus of test
data, and the applicability of the extracted features in an exemplary MIR system
is examined.
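The reliability-dependent fusion of low- and high-level information described in this abstract can be sketched as a confidence-weighted average of the individual methods' estimates. This is a hypothetical simplification; the function name and the weighting rule are illustrative, not the thesis's actual combination scheme.

```python
def fuse_estimates(estimates):
    """Fuse (value, confidence) estimates from several extraction
    methods into a single result by confidence weighting.
    Returns None when no method reports any confidence."""
    total = sum(conf for _, conf in estimates)
    if total == 0:
        return None
    return sum(val * conf for val, conf in estimates) / total
```

For example, a tempo of 120 BPM estimated with confidence 0.9 and a conflicting 60 BPM estimate with confidence 0.1 fuse to 114 BPM, letting the more reliable method dominate.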
Methodology for the production and delivery of generative music for the personal listener : systems for realtime generative music production
This thesis describes a system for the production of generative music through a
specific methodology, and provides an approach for the delivery of this
material. The system and body of work are targeted specifically at the personal
listening audience, which, as the largest current consumer of music across all
genres, represents the largest and most applicable market for such a system. By
considering how recorded media compare to concert performance, it is possible to
ascertain which attributes of performance may be translated to a generative
medium. In addition, an outline of how fixed media have changed the way people
listen to music is considered. From these concepts, an attempt is made to create
a system that satisfies society's need for music that is commodified and easily
approached, while also closing the qualitative gap between a static delivery
medium and concert-based output. This is approached within the context of
contemporary classical music. Furthermore, by considering the development and
fragmentation of the personal listening audience through technological
developments, a methodology for the delivery of generative media to a range of
devices is investigated. A body of musical work is created that attempts to
realise these goals in a qualitative fashion. These works span the development
of the composition methodology and the algorithmic methods covered. A conclusion
based on the possibilities of each system with regard to its qualitative output
forms the basis for evaluation. As this investigation is situated within the
field of music, the musical output and composition methodology are considered
the primary deciding factors of a system's feasibility. The contribution of this
research to the field is a methodology for the composition and production of
algorithmic music in real time, and a feasible method for the delivery of this
music to a wide audience.
Emergent Rhythmic Structures as Cultural Phenomena Driven by Social Pressure in a Society of Artificial Agents
This thesis studies rhythm from an evolutionary computation perspective. Rhythm is the most fundamental dimension of music and can be used as a ground to describe the evolution of music. More specifically, the main goal of the thesis is to investigate how complex rhythmic structures evolve, subject to the cultural transmission between individuals in a society. The study is developed by means of computer modelling and simulations informed by evolutionary computation and artificial life (A-Life). In this process, self-organisation plays a fundamental role. The evolutionary process is steered by the evaluation of rhythmic complexity and by the exposure to rhythmic material.
In this thesis, composers and musicologists will find the description of a system named A-Rhythm, which explores the emerged behaviours in a community of artificial autonomous agents that interact in a virtual environment. The interaction between the agents takes the form of imitation games.
A set of necessary criteria was established for the construction of a compositional system in which cultural transmission is observed. These criteria allowed the comparison with related work in the field of evolutionary computation and music.
In the development of the system, rhythmic representation is discussed. The proposed representation enabled the development of complexity- and similarity-based measures, and the recombination of rhythms in a creative manner. A-Rhythm produced results in the form of simulation data, which were evaluated in terms of the coherence of the agents' repertoires. The data show how rhythmic sequences are changed and sustained in the population, displaying synchronic and diachronic diversity. Finally, this tool was used as a generative mechanism for composition, and several examples are presented.
Leverhulme Trust
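The similarity-based measures mentioned in this abstract can be illustrated with a minimal sketch that scores two equal-length binary onset patterns by normalised Hamming agreement. This is a hypothetical simplification, not A-Rhythm's actual measure.

```python
def rhythm_similarity(a, b):
    """Return the fraction of positions at which two equal-length
    binary onset patterns agree (1.0 = identical rhythms)."""
    if len(a) != len(b):
        raise ValueError("patterns must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)
```

In an imitation-game setting, such a score could decide whether one agent's rhythm counts as a successful imitation of another's.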
Applications of Discriminative, Generative and Predictive Deep Learning Processes to Solo Saxophone Practice
Modelling of audio data through deep learning provides a means of creating novel sounds, processes, ideas and tools for musical creativity, yet its actual usefulness is relatively under-explored. Only a handful of researcher-practitioners are using AI models in their musical works, and artistic research into applications of deep learning modelling to instrumental practice and improvisation currently occupies an even smaller niche.
The research presented in this thesis and accompanying portfolio is an examination of potential creative applications of statistical modelling of audio data, through deep learning processes, to instrumental music practice; these processes are classification of a live input, generation of raw audio samples and sequential prediction of pitch. The goal of this work is, through the development of processes and creation of musical works, to generate knowledge concerning the practicality of modelling the systematic aspects of an instrumental improvised practice, the creative usefulness of such models to the practitioner, and the
musical and technical ‘behaviours’ of specific classes of deep learning architecture with respect to the data on which the models are trained.
These concerns are addressed through a practice-based research methodology consisting of multiple steps: recording original audio datasets; pre-processing audio data as appropriate to model architecture and task; training statistical models; artistic experimentation and development of software, resulting in novel processes for musical creativity; and creation of artistic outputs, resulting in a portfolio of recordings and notated scores.
This project finds that deep learning can play useful roles in both technical and creative musical processes: classification can not only form the basis of interactive systems for improvisation but also be suggestive of new compositional structures; outputs of generative models of raw audio not only return valuable information about the training data but also generate useful source material for technical instrumental practice, improvisation and composition; and notated outputs from symbolic-domain predictive models can be richly suggestive of compositional ideas and structures for electroacoustic improvisation. This rich diversity of applications positions AI as a creative assistant, a teacher, and a deeply personalised tool for the instrumental practitioner.
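The symbolic-domain sequential pitch prediction mentioned above can be illustrated, in grossly simplified form, by a bigram transition model over MIDI pitch numbers. The deep learning models in the thesis are far more expressive; all names here are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram(pitches):
    """Count pitch-to-pitch transitions in a melodic sequence --
    a minimal stand-in for a learned sequential predictor."""
    model = defaultdict(Counter)
    for a, b in zip(pitches, pitches[1:]):
        model[a][b] += 1
    return model

def predict_next(model, pitch):
    """Return the most frequent successor of `pitch`, or None
    if the pitch was never seen in training."""
    if pitch not in model:
        return None
    return model[pitch].most_common(1)[0][0]
```

Even a toy predictor like this, run over a practitioner's own recorded phrases, hints at how model outputs can feed back into improvisation as suggested material.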
When considering the utility of this work to others, there will be specific variances not covered by this project: appropriate choices of data representations, data-preprocessing techniques, model architectures and their training parameters will vary according to task, instrument, genre and taste, as will of course the character of others’ creative outputs. However, the abundance of affordances and future directions this work uncovers gives confidence of its utility for other instrumental practitioners and researchers.
Given the pace of ongoing development of deep learning methods for modelling audio, and their still-limited adoption by creative practitioners, I hope that this thesis will motivate further exploration of the unique creative potential of these technologies by instrumental practitioners, improvisers and practice-based researchers in the wider field of AI for musical creativity.