Estimation of Guitar Fingering and Plucking Controls based on Multimodal Analysis of Motion, Audio and Musical Score
This work presents a method for the extraction of instrumental controls during guitar performances. The method is based on the analysis of multimodal data combining motion capture, audio analysis and the musical score. High-speed video cameras with marker identification are used to track the position of finger bones and articulations, and audio is recorded with a transducer measuring vibration on the guitar body. The extracted parameters are divided into left-hand controls, i.e. fingering (which string and fret is pressed with which left-hand finger), and right-hand controls, i.e. the plucked string, the plucking finger and the characteristics of the pluck (position, velocity and angles with respect to the string). Controls are estimated from probability functions of low-level features, namely the plucking instants (i.e. note onsets), the pitch and the distances of the fingers of both hands to strings and frets. Note onsets are detected via audio analysis, the pitch is extracted from the score, and distances are computed using 3D Euclidean geometry. Results show that by combining multimodal information it is possible to estimate such a comprehensive set of control features, with especially high performance for the fingering and plucked-string estimation. Accuracy is lower for the plucking finger and the pluck characteristics, but improvements are foreseen, including a hand model and the use of high-speed cameras for calibration and evaluation. A. Perez-Carrillo was supported by a Beatriu de Pinos grant 2010 BP-A 00209 from the Catalan Research Agency (AGAUR) and J. Ll. Arcos was supported by the ICT-2011-8-318770 and 2009-SGR-1434 projects. Peer reviewed.
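The distance computation mentioned above can be made concrete: if each string is treated as a straight segment between two anchor points, then a finger-to-string distance is a point-to-segment distance in 3D. The following is a minimal sketch of that geometry; the function name, coordinates and the 0.648 m scale length are illustrative, not taken from the paper.

```python
import numpy as np

def point_to_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (a string modelled as a
    straight line between its nut and bridge anchor points)."""
    p, a, b = map(np.asarray, (p, a, b))
    ab = b - a
    # Project p onto the line through a and b, clamping to the segment ends.
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

# Toy example: a string along the x-axis, a fingertip marker 5 mm above it.
d = point_to_segment_distance([0.32, 0.005, 0.0],
                              [0.0, 0.0, 0.0], [0.648, 0.0, 0.0])
```

Comparing such distances across strings and frets, weighted by the probability functions mentioned above, is how a fingering hypothesis could be scored.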
Modelling Instrumental Gestures and Techniques: A Case Study of Piano Pedalling
PhD Thesis. In this thesis we propose a bottom-up approach for modelling instrumental gestures and techniques, using piano pedalling as a case study. Pedalling gestures play a vital role in expressive piano performance. They can be categorised into different pedalling techniques. We propose several methods for the indirect acquisition of sustain-pedal techniques using audio signal analyses, complemented by the direct measurement of gestures with sensors. A novel measurement system is first developed to synchronously collect pedalling gestures and piano sound. Recognition of pedalling techniques starts from the gesture data. This yields high accuracy and facilitates the construction of a ground-truth dataset for evaluating the audio-based pedalling detection algorithms. Studies in the audio domain rely on knowledge of piano acoustics and physics. New audio features are designed through the analysis of isolated notes with different pedal effects. The features associated with a measure of sympathetic resonance are used together with a machine learning classifier to detect the presence of legato-pedal onset in recordings from a specific piano. To generalise the detection, deep learning methods are proposed and investigated. Deep Neural Networks are trained for feature learning on a large synthesised dataset obtained through a physical-modelling synthesiser. Trained models serve as feature extractors for frame-wise sustain-pedal detection from acoustic piano recordings in a proposed transfer learning framework. Overall, this thesis demonstrates that sustain-pedal techniques can be recognised to a high degree of accuracy using sensors, and also from audio recordings alone. As the first study to undertake pedalling-technique detection in real-world piano performance, it complements piano transcription methods. Moreover, the underlying relations between pedalling gestures, piano acoustics and audio features are identified.
The varying effectiveness of the presented features and models can also be explained by differences in pedal use between composers and musical eras.
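Frame-wise detection, as described above, ultimately has to be turned back into pedal onsets and offsets. As a rough illustration of that final step (this is not the thesis's actual post-processing; the hop size and minimum-length values are arbitrary), consecutive positive frames can be merged into timed segments:

```python
def frames_to_segments(frame_labels, hop_s=0.1, min_len=2):
    """Merge frame-wise sustain-pedal decisions (1 = pedal down) into
    (onset_time, offset_time) segments, discarding very short runs."""
    segments, start = [], None
    for i, lab in enumerate(frame_labels + [0]):  # sentinel flushes last run
        if lab and start is None:
            start = i                             # a pedal-down run begins
        elif not lab and start is not None:
            if i - start >= min_len:              # keep only plausible runs
                segments.append((start * hop_s, i * hop_s))
            start = None
    return segments

# One 3-frame run survives; the isolated frame at index 5 is dropped.
segments = frames_to_segments([0, 1, 1, 1, 0, 1, 0])
```

A minimum-duration constraint of this kind is a common way to suppress spurious single-frame detections.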
Automatic Transcription of Bass Guitar Tracks applied for Music Genre Classification and Sound Synthesis
Music recordings most often consist of multiple instrument signals, which
overlap in time and frequency. In the field of Music Information Retrieval
(MIR), existing algorithms for the automatic transcription and analysis of
music recordings aim to extract semantic information from mixed audio
signals. In recent years it has frequently been observed that algorithm performance is limited by these signal overlaps and the resulting loss of information. One common approach to this problem is to first
apply source separation algorithms to isolate the present musical
instrument signals before analyzing them individually. The performance of
source separation algorithms strongly depends on the number of instruments
as well as on the amount of spectral overlap. In this thesis, isolated
instrumental tracks are analyzed in order to circumvent the challenges of
source separation. Instead, the focus is on the development of
instrument-centered signal processing algorithms for music transcription,
musical analysis, as well as sound synthesis. The electric bass guitar is
chosen as an example instrument. Its sound production principles are
closely investigated and considered in the algorithmic design. In the first
part of this thesis, an automatic music transcription algorithm for
electric bass guitar recordings will be presented. The audio signal is
interpreted as a sequence of sound events, which are described by various
parameters. In addition to the conventionally used score-level parameters
note onset, duration, loudness, and pitch, instrument-specific parameters
such as the applied instrument playing techniques and the geometric
position on the instrument fretboard will be extracted. Different
evaluation experiments confirmed that the proposed transcription algorithm
outperformed three state-of-the-art bass transcription algorithms for the
transcription of realistic bass guitar recordings. The estimation of the
instrument-level parameters works with high accuracy, in particular for
isolated note samples. In the second part of the thesis, it will be investigated whether the sole analysis of the bassline of a music piece allows its music genre to be classified automatically. Different score-based audio features will be proposed that quantify tonal, rhythmic, and
structural properties of basslines. Based on a novel data set of 520
bassline transcriptions from 13 different music genres, three approaches
for music genre classification were compared. A rule-based classification
system achieved a mean class accuracy of 64.8 % using only features extracted from the bassline of a music piece. The re-synthesis of bass guitar recordings using the previously extracted note parameters will be studied in the third part of this thesis.
Based on the physical modeling of string instruments, a novel sound
synthesis algorithm tailored to the electric bass guitar will be presented.
The algorithm mimics different aspects of the instrument’s sound
production mechanism such as string excitation, string damping, string-fret collision, and the influence of the electromagnetic pickup. Furthermore, a parametric audio coding approach will be discussed that allows bass guitar tracks to be encoded and transmitted at a significantly smaller bit rate than
conventional audio coding algorithms do. The results of different listening
tests confirmed that a higher perceptual quality can be achieved if the
original bass guitar recordings are encoded and re-synthesized using the
proposed parametric audio codec instead of being encoded using conventional
audio codecs at very low bit rate settings.
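The physical-modelling idea underlying such a string synthesiser can be sketched in a few lines using the classic Karplus-Strong algorithm. This is a toy model in the same family as the synthesiser described above, but far simpler than it (no playing techniques, fret collision or pickup model), and all parameter values here are illustrative:

```python
import random

def pluck(freq_hz, dur_s, sr=44100, damping=0.996):
    """Very simplified plucked-string model (Karplus-Strong): a noise burst
    circulates in a delay line whose length sets the pitch; the two-point
    average and damping factor mimic the string's frequency-dependent losses."""
    n = int(sr / freq_hz)                              # delay line ~ one period
    line = [random.uniform(-1, 1) for _ in range(n)]   # pluck excitation
    out = []
    for _ in range(int(dur_s * sr)):
        s = line.pop(0)
        line.append(damping * 0.5 * (s + line[0]))     # loss + lowpass feedback
        out.append(s)
    return out

tone = pluck(41.2, 0.5)  # roughly the low E of a bass guitar, 0.5 s
```

Each extension mentioned in the abstract (fret collision, pickup response) would refine the feedback path of exactly this kind of loop.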
Musical Gesture through the Human Computer Interface: An Investigation using Information Theory
This study applies information theory to investigate the human ability to communicate using continuous control sensors, with a particular focus on informing the design of digital musical instruments. There is an active practice of building and evaluating such instruments, for instance in the New Interfaces for Musical Expression (NIME) conference community. The fidelity of the instruments can depend on the included sensors, and although much anecdotal evidence and craft experience informs the use of these sensors, relatively little is known about the ability of humans to control them accurately. This dissertation addresses this issue and related concerns, including continuous control performance in increasing degrees of freedom, pursuit tracking in comparison with pointing, and the estimates by musical interface designers and researchers of human performance with continuous control sensors. The methodology models the human-computer system as an information channel, applying concepts from information theory to performance data collected in studies of human subjects using sensing devices. These studies not only add to knowledge about human abilities, but also inform issues in musical mappings, ergonomics, and usability.
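The information-channel view can be made concrete: given counts of intended targets versus responses actually produced, the mutual information between the two gives the bits transmitted per trial. A minimal sketch, with an invented count matrix for illustration (not data from the study):

```python
from math import log2

def mutual_information(counts):
    """Mutual information (bits) between intended targets (rows) and
    produced responses (columns), estimated from a count matrix."""
    total = sum(sum(row) for row in counts)
    px = [sum(row) / total for row in counts]          # marginal over targets
    py = [sum(col) / total for col in zip(*counts)]    # marginal over responses
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c:
                pxy = c / total
                mi += pxy * log2(pxy / (px[i] * py[j]))
    return mi

# A noiseless 4-target channel transmits log2(4) = 2 bits per trial.
mi = mutual_information([[5, 0, 0, 0],
                         [0, 5, 0, 0],
                         [0, 0, 5, 0],
                         [0, 0, 0, 5]])
```

Sensor or motor noise shows up as off-diagonal counts, which lower the mutual information below the log2 of the number of targets.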
Magnetoencephalography for the investigation and diagnosis of Mild Traumatic Brain Injury
Mild Traumatic Brain Injury (mTBI), or concussion, is the most common type of brain injury. Despite this, it often goes undiagnosed and can cause long-term disability, most likely caused by the disruption of axonal connections in the brain. Objective methods for diagnosis and prognosis are needed, but clinically available neuroimaging modalities rarely show structural abnormalities, even when patients suffer persisting functional deficits. In the past three decades, powerful new techniques for imaging brain structure and function have shown promise in detecting mTBI-related changes. Magnetoencephalography (MEG), which measures electrical brain activity by detecting the magnetic fields generated outside the head by neural currents, is particularly sensitive and has therefore gained interest from researchers. Numerous studies propose abnormal low-frequency neural oscillations and functional connectivity (the statistical interdependency of signals from separate brain regions) as potential biomarkers for mTBI. However, typically small sample sizes, the lack of replication between groups, the heterogeneity of the cohorts studied, and the lack of longitudinal studies impede the adoption of MEG as a clinical tool in mTBI management. In particular, little is known about the acute phase of mTBI.
In this thesis, some of these gaps will be addressed by analysing MEG data from individuals with mTBI, using novel as well as conventional methods. The potential future of MEG in mTBI research will also be addressed by testing the capabilities of a wearable MEG system based on optically pumped magnetometers (OPMs).
The thesis contains three main experimental studies. In study 1, we investigated the signal dynamics underlying MEG abnormalities, found in a cohort of subjects scanned within three months of an mTBI, using a Hidden Markov Model (HMM), as growing evidence suggests that neural dynamics are (in part) driven by transient bursting events. Applying the HMM to resting-state data, we show that previously reported findings of diminished intrinsic beta amplitude and connectivity in individuals with mTBI (compared to healthy controls) can be explained by a reduction in the beta-band content of pan-spectral bursts and a loss in the temporal coincidence of bursts respectively. Using machine learning, we find the functional connections driving group differences and achieve classification accuracies of 98%. In a motor task, mTBI resulted in reduced burst amplitude, altered modulation of burst probability during movement and decreased connectivity in the motor network.
In study 2, we further test our HMM-based method in a cohort of subjects with mTBI and non-head trauma, scanned within two weeks of injury, to ensure the specificity of any observed effects to mTBI. We replicate our previous findings of reduced connectivity and high classification accuracy, although not the reduction in burst amplitude. Burst statistics were stable across both studies, despite the data being acquired at different sites using different scanners. In the same cohort, we applied a more conventional analysis of delta-band power. Although excess low-frequency power appears to be a promising candidate marker for persistently symptomatic mTBI, insufficient data exist to confirm this pattern in acute mTBI. We found abnormally high delta power to be a sensitive measure for discriminating mTBI subjects from healthy controls; however, similarly elevated delta amplitude was found in the cohort with non-head trauma, suggesting that excess delta may not be specific to mTBI, at least in the acute stage of injury.
Our work highlights the need for longitudinal assessment of mTBI. In addition, there appears to be a need to investigate naturalistic paradigms which can be tailored to induce activity in symptom-relevant brain networks, and which are consequently likely to yield more sensitive biomarkers than the resting-state scans used to date. Wearable OPM-MEG makes naturalistic scanning possible and may offer a cheaper and more accessible alternative to cryogenic MEG; however, before deploying OPMs clinically, or in pitch-side assessment for athletes, for example, the reliability of OPM-derived measures needs to be verified. In the third and final study, we performed a repeatability study using a novel motor task, estimating a series of common MEG measures and quantifying the reliability of both activity and connectivity derived from OPM-MEG data. These initial findings, presently limited to a small sample of healthy controls, demonstrate the utility of OPM-MEG and pave the way for this technology to be deployed on patients with mTBI.
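As a rough illustration of the burst concept used above: the thesis fits a Hidden Markov Model, which learns burst states from the data, but the simpler proxy of thresholding an amplitude envelope conveys the idea. The envelope values and threshold below are invented for illustration:

```python
def detect_bursts(envelope, threshold):
    """Return (start, end) sample-index pairs where the amplitude envelope
    exceeds the threshold -- a crude stand-in for the transient bursts that
    an HMM would identify in a data-driven way."""
    bursts, start = [], None
    for i, v in enumerate(envelope):
        if v > threshold and start is None:
            start = i                       # burst begins
        elif v <= threshold and start is not None:
            bursts.append((start, i))       # burst ends
            start = None
    if start is not None:                   # envelope ends mid-burst
        bursts.append((start, len(envelope)))
    return bursts

env = [0.1, 0.9, 1.2, 0.2, 0.1, 1.5, 1.1, 0.3]
bursts = detect_bursts(env, threshold=0.5)  # two supra-threshold events
```

Burst statistics such as rate, duration and amplitude, of the kind compared between groups above, can then be computed from these intervals.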
Discriminating music performers by timbre: On the relation between instrumental gesture, tone quality and perception in classical cello performance
Classical music performers use instruments to transform the symbolic notation of the score into sound, which is ultimately perceived by a listener. For acoustic instruments, the timbre of the resulting sound is assumed to be strongly linked to the physical and acoustical properties of the instrument itself. However, rather little is known about how much influence the player has over the timbre of the sound: is it possible to discriminate music performers by timbre? This thesis explores player-dependent aspects of timbre, serving as an individual means of musical expression. With a research scope narrowed to the analysis of solo cello recordings, the differences in tone quality of six performers who played the same musical excerpts on the same cello are investigated from three different perspectives: perceptual, acoustical and gestural. In order to understand how the physical actions that a performer exerts on an instrument affect the spectro-temporal features of the sound produced, which can then be perceived as the player's unique tone quality, a series of experiments is conducted, starting with the creation of dedicated multi-modal cello recordings extended with performance gesture information (bowing control parameters). In the first study, selected tone samples of the six cellists are perceptually evaluated across various musical contexts via timbre dissimilarity and verbal attribute ratings. Spectro-temporal analysis follows in the second experiment, with the aim of identifying acoustic features which best describe the varying timbral characteristics of the players. Finally, in the third study, individual combinations of bowing controls are examined in search of bowing patterns which might characterise each cellist regardless of the music being performed. The results show that the different players can be discriminated perceptually, by timbre, and that this perceptual discrimination can be projected back through the acoustical and gestural domains.
By extending current understanding of human-instrument dependencies for qualitative tone production, this research may have further applications in computer-aided musical training and performer-informed instrumental sound synthesis. This work was supported by a UK EPSRC DTA studentship EP/P505054/1 and the EPSRC-funded OMRAS2 project EP/E017614/1.
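One of the simplest spectro-temporal features used in timbre studies of this kind is the spectral centroid, a classic correlate of perceived brightness. A minimal sketch (not claimed to be among the thesis's chosen features; the 3-bin spectrum is a toy input):

```python
def spectral_centroid(magnitudes, sr=44100):
    """Spectral centroid in Hz of one magnitude-spectrum frame spanning
    0 Hz to Nyquist: the amplitude-weighted mean frequency."""
    n = len(magnitudes)
    freqs = [i * sr / (2 * (n - 1)) for i in range(n)]  # bin centre frequencies
    return sum(f * m for f, m in zip(freqs, magnitudes)) / sum(magnitudes)

# All energy in the middle bin of a 3-bin spectrum -> centroid at sr/4.
c = spectral_centroid([0.0, 1.0, 0.0], sr=44100)
```

Tracking such features over a note, per player, is the kind of analysis that could link acoustic measurements back to the perceptual ratings described above.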
Proceedings of the 7th Sound and Music Computing Conference
Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010