63 research outputs found
Progressive Filtering Using Multiresolution Histograms for Query by Humming System
The rising availability of digital music calls for effective categorization
and retrieval methods. Real-world scenarios are characterized by very large music
collections containing both pertinent and non-pertinent songs with respect to the
user input. The primary goal of this research is to counterbalance the
adverse impact of non-relevant songs through Progressive Filtering (PF) for a
Query by Humming (QBH) system. PF is a technique of problem solving through a
reduced search space. This paper presents the concept of PF and its efficient design
based on Multi-Resolution Histograms (MRH) to accomplish searching in
manifolds. Initially the entire music database is searched to obtain a high
recall rate and a narrowed search space. Later steps perform a slower search in
the reduced space and achieve additional accuracy.
Experimentation on a large music database using recursive programming
substantiates the potential of the method. The results of the proposed strategy
indicate that MRH effectively locate the patterns. Distances between MRH at a lower
level are lower bounds of the distances at a higher level, which guarantees
that no false dismissals occur during PF. The proposed method thus helps to
strike a balance between efficiency and effectiveness. The system is scalable
to large music retrieval systems and, as an added advantage, data driven for
performance optimization.
Comment: 12 pages, 6 figures. Full version of the paper published at
ICMCCA-2012 with the same title.
Link: http://link.springer.com/chapter/10.1007/978-81-322-1143-3_2
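The lower-bounding property described in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes histograms whose length is a power of two, a pyramid built by merging adjacent bins pairwise, and the L1 distance; the function names are hypothetical. Under these assumptions the distance at a coarser level never exceeds the distance at a finer level, so a candidate can be discarded as soon as a coarse distance reaches the best fine-level distance found so far.

```python
from typing import Dict, List, Optional, Tuple


def coarsen(hist: List[int]) -> List[int]:
    """Merge adjacent bins pairwise, halving the resolution."""
    return [hist[i] + hist[i + 1] for i in range(0, len(hist), 2)]


def pyramid(hist: List[int]) -> List[List[int]]:
    """Return all resolutions, ordered from coarsest (1 bin) to finest.

    Assumes len(hist) is a power of two (an assumption of this sketch).
    """
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels[::-1]


def l1(a: List[int], b: List[int]) -> int:
    """L1 (city-block) distance between two histograms of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def progressive_search(
    query: List[int], database: Dict[str, List[int]]
) -> Tuple[Optional[str], float]:
    """Find the nearest histogram under the finest-level L1 distance.

    Each candidate is compared level by level, coarsest first. Because the
    coarse distance is a lower bound on the fine distance, pruning at a
    coarse level cannot dismiss the true nearest neighbour.
    """
    q_levels = pyramid(query)
    best_id: Optional[str] = None
    best_dist: float = float("inf")
    for song_id, hist in database.items():
        dist = 0
        pruned = False
        for q_level, s_level in zip(q_levels, pyramid(hist)):
            dist = l1(q_level, s_level)
            if dist >= best_dist:  # lower bound is already no better
                pruned = True
                break
        if not pruned:  # dist now holds the finest-level distance
            best_id, best_dist = song_id, dist
    return best_id, best_dist
```

The bound follows from the triangle inequality: |(a1 + a2) - (b1 + b2)| <= |a1 - b1| + |a2 - b2|, so merging bins can only shrink the L1 distance.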
Making music through real-time voice timbre analysis: machine learning and timbral control
PhD
People can achieve rich musical expression through vocal sound; see for example
human beatboxing, which achieves a wide timbral variety through a range of
extended techniques. Yet the vocal modality is under-exploited as a controller
for music systems. If we can analyse a vocal performance suitably in real time,
then this information could be used to create voice-based interfaces with the
potential for intuitive and fulfilling levels of expressive control.
Conversely, many modern techniques for music synthesis do not imply any
particular interface. Should a given parameter be controlled via a MIDI keyboard,
or a slider/fader, or a rotary dial? Automatic vocal analysis could provide
a fruitful basis for expressive interfaces to such electronic musical instruments.
The principal questions in applying vocal-based control are how to extract
musically meaningful information from the voice signal in real time, and how
to convert that information suitably into control data. In this thesis we address
these questions, with a focus on timbral control, and in particular we
develop approaches that can be used with a wide variety of musical instruments
by applying machine learning techniques to automatically derive the mappings
between expressive audio input and control output. The vocal audio signal is
construed to include a broad range of expression, in particular encompassing
the extended techniques used in human beatboxing.
The central contribution of this work is the application of supervised and
unsupervised machine learning techniques to automatically map vocal timbre
to synthesiser timbre and controls. Component contributions include a delayed
decision-making strategy for low-latency sound classification, a regression-tree
method to learn associations between regions of two unlabelled datasets, a fast
estimator of multidimensional differential entropy, and a qualitative method for
evaluating musical interfaces based on discourse analysis.
The Value of Seizure Semiology in Epilepsy Surgery: Epileptogenic-Zone Localisation in Presurgical Patients using Machine Learning and Semiology Visualisation Tool
Background
Eight million individuals have focal drug resistant epilepsy worldwide. If their epileptogenic focus is identified and resected, they may become seizure-free and experience significant improvements in quality of life. However, seizure-freedom occurs in less than half of surgical resections.
Seizure semiology - the signs and symptoms during a seizure - along with brain imaging and electroencephalography (EEG) are amongst the mainstays of seizure localisation. Although there have been advances in algorithmic identification of abnormalities on EEG and imaging, semiological analysis has remained more subjective.
The primary objective of this research was to investigate the localising value of clinician-identified semiology, and secondarily to improve personalised prognostication for epilepsy surgery.
Methods
I data mined retrospective hospital records to link semiology to outcomes. I trained machine learning models to predict temporal lobe epilepsy (TLE) and determine the value of semiology compared to a benchmark of hippocampal sclerosis (HS).
Due to the hospital dataset being relatively small, we also collected data from a systematic review of the literature to curate an open-access Semio2Brain database. We built the Semiology-to-Brain Visualisation Tool (SVT) on this database and retrospectively validated SVT in two separate groups of randomly selected patients and individuals with frontal lobe epilepsy.
Separately, a systematic review of multimodal prognostic features of epilepsy surgery was undertaken.
The concept of a semiological connectome was devised and compared to structural connectivity to investigate probabilistic propagation and semiology generation.
Results
Although a (non-chronological) list of patients’ semiologies did not improve localisation beyond the initial semiology, the list of semiologies added value when combined with an imaging feature. The absolute added value of semiology in a support vector classifier in diagnosing TLE, compared to HS, was 25%. Semiology was, however, unable to predict postsurgical outcomes. To help future prognostic models, a list of essential multimodal prognostic features for epilepsy surgery was extracted from meta-analyses and a structural causal model was proposed.
Semio2Brain consists of over 13000 semiological datapoints from 4643 patients across 309 studies and uniquely enabled a Bayesian approach to localisation to mitigate TLE publication bias. SVT performed well in a retrospective validation, matching the best expert clinician’s localisation scores and exceeding them for lateralisation, and showed modest value in localisation in individuals with frontal lobe epilepsy (FLE).
There was a significant correlation between the number of connecting fibres between brain regions and the seizure semiologies that can arise from these regions.
Conclusions
Semiology is valuable in localisation, but multimodal concordance is more valuable and highly prognostic. SVT could be suitable for use in multimodal models to predict the seizure focus.
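The abstract above mentions a Bayesian approach to localisation that mitigates publication bias towards TLE. How the thesis implements this is not stated here, so the following is only a sketch of the general idea under stated assumptions: likelihoods P(semiology | region) are estimated from counts within each region, and the region prior is supplied separately rather than taken from the literature counts. The function name, region labels, semiology labels and all numbers are hypothetical.

```python
from typing import Dict


def posterior(
    counts: Dict[str, Dict[str, int]],
    prior: Dict[str, float],
    semiology: str,
) -> Dict[str, float]:
    """Return P(region | semiology) via Bayes' rule.

    counts[region][semiology] is the number of datapoints (hypothetical
    structure). The likelihood is normalised within each region, so a
    region that is over-represented in the literature does not dominate
    merely because more cases of it were published.
    """
    unnormalised: Dict[str, float] = {}
    for region, sem_counts in counts.items():
        total = sum(sem_counts.values())
        likelihood = sem_counts.get(semiology, 0) / total if total else 0.0
        unnormalised[region] = likelihood * prior[region]
    evidence = sum(unnormalised.values())
    if evidence == 0:
        raise ValueError("semiology has zero probability under every region")
    return {region: value / evidence for region, value in unnormalised.items()}
```

With purely illustrative counts such as {"temporal": {"epigastric": 300, "hypermotor": 100}, "frontal": {"epigastric": 10, "hypermotor": 40}}, pooling raw counts would favour the temporal lobe for a hypermotor semiology (100 of 140 datapoints), whereas the per-region likelihood combined with a prior of 0.6 temporal and 0.4 frontal favours the frontal lobe.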
Signal Processing Methods for Music Synchronization, Audio Matching, and Source Separation
The field of music information retrieval (MIR) aims at developing techniques and tools for organizing, understanding, and searching multimodal information in large music collections in a robust, efficient and intelligent manner. In this context, this thesis presents novel, content-based methods for music synchronization, audio matching, and source separation. In general, music synchronization denotes a procedure which, for a given position in one representation of a piece of music, determines the corresponding position within another representation. Here, the thesis presents three complementary synchronization approaches, which improve upon previous methods in terms of robustness, reliability, and accuracy. The first approach employs a late-fusion strategy based on multiple, conceptually different alignment techniques to identify those music passages that allow for reliable alignment results. The second approach is based on the idea of employing musical structure analysis methods in the context of synchronization to derive reliable synchronization results even in the presence of structural differences between the versions to be aligned. Finally, the third approach employs several complementary strategies for increasing the accuracy and time resolution of synchronization results. Given a short query audio clip, the goal of audio matching is to automatically retrieve all musically similar excerpts in different versions and arrangements of the same underlying piece of music. In this context, chroma-based audio features are a well-established tool as they possess a high degree of invariance to variations in timbre. This thesis describes a novel procedure for making chroma features even more robust to changes in timbre while keeping their discriminative power. Here, the idea is to identify and discard timbre-related information using techniques inspired by the well-known MFCC features, which are usually employed in speech processing. 
Given a monaural music recording, the goal of source separation is to extract musically meaningful sound sources corresponding, for example, to a melody, an instrument, or a drum track from the recording. To facilitate this complex task, one can exploit additional information provided by a musical score. Based on this idea, this thesis presents two novel, conceptually different approaches to source separation. Using score information provided by a given MIDI file, the first approach employs a parametric model to describe a given audio recording of a piece of music. The resulting model is then used to extract sound sources as specified by the score. As a computationally less demanding and easier to implement alternative, the second approach employs the additional score information to guide a decomposition based on non-negative matrix factorization (NMF).
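The synchronization task defined in the abstract above (mapping a position in one version of a piece to the corresponding position in another) is classically approached with dynamic time warping (DTW). The sketch below shows plain DTW only; it is not one of the three approaches the thesis proposes. The function name and the pluggable cost function are assumptions of this example, and real systems would compare feature vectors such as chroma frames rather than scalars.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def dtw_path(
    x: Sequence[T], y: Sequence[T], cost: Callable[[T, T], float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two non-empty sequences with classical DTW.

    Returns the total alignment cost and the warping path as a list of
    (index in x, index in y) pairs, from the first frames to the last.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # acc[i][j] = minimal cost of aligning x[:i] with y[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = cost(x[i - 1], y[j - 1]) + step

    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path
```

Given the path, the position in the second version that corresponds to frame i of the first can be read off from the pairs whose first element is i.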
A computational framework for sound segregation in music signals
Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia. Universidade do Porto. 200
The nation thing? ‘Enjoyment and well-being’ in the production of cultural spaces in Zimbabwean literature
A thesis submitted in fulfilment of the requirements for the degree of
Doctor of Philosophy in English Studies
University of the Witwatersrand, Johannesburg 2018
My thesis develops a frame of understanding of the aesthetic, cultural well-being and enjoyment spaces in Zimbabwean literature through a close reading of selected texts (both fiction and autobiographical narratives). I argue that Zimbabwean literature demonstrates the expanse of ‘enjoyment’ beyond the material. Findings from my analysis establish the position that people create and flourish in enjoyable identity conferring spaces of their own choice, and generate an intercultural mosaic in the process. Unlike the standard criticism of Zimbabwean literature which focuses on wars, trauma, memory, interminable gender struggles, binaries of the city and country, dispossessions and repossessions, the “Zimbabwean crisis” and the diaspora, I explore enjoyment and well-being. I establish that the intercultural nature of “enjoyment and well-being” spaces designates a fractured cosmopolitanism in which classificatory variables like gender, race and ethnicity are problematised. I interrogate enjoyment and well-being that is predicated on the pain and suffering of a scripted and choked “Other”, whichever name that individual may be called: stranger, alien, refugee, migrant and settler among some “Othering” concepts. The subject that Zimbabwean literature establishes has the capacity to enjoy in multifarious ways that which fosters intersubjectivity. People from diverse backgrounds, sexes, and ethnicities find joy, happiness and pleasure in various spaces of interaction or “contact zones” which are identity conferring.
The research foregrounds cultural sites in four parts: the land, the body, city spaces and diaspora spaces. Part one considers land as the locus of analysis in the explication of Zimbabwean subjectivities, since land is often deployed as the “discursive threshold” after the Post-2000 land invasions/reforms. I establish the paucity and inadequacy of a conceptualisation of land that derives identities from binaries that designate the Self and Other. I proceed to explore rhythms and textures in nature to demonstrate the richness and inter-subjectivity in the way land, animals, the cosmos and human life are intertwined.
Part two demonstrates that the individual body is at the centre of generating its own data and negotiating meanings as physical sensations are expanded to inter-human sensation, contrary to the nation-state’s concept of fashioning subjects.
Part three considers city spaces as rendering the atmosphere and environment for subject enjoyment, well-being and authenticity. Beyond and above the sites that are bound by territorial borders, I argue that Zimbabwean subjects enjoy transcendental and diaspora spaces.
Part four explores transcendental spaces of enjoyment and well-being at the level of both the individual human mind and nation-spaces. The rise of cellphones, the Ipad, computers and the semiotics of the big and small screens introduce a mind that is able to transcend the exigencies of place through memorialisation, imagi(ni)ng, pictures, ritual and religion. Texts explored in part four demonstrate that people are able to negotiate spaces and places through remediation and travel, both physically and metaphorically, thus breaching the territorial borders of the nation-state.
The study suggests the creation and sustenance of climates for various orders of joy, enjoyment and pleasure by nation-states which should desist from dictating the way people enjoy for them to maintain legitimacy. The research underscores the importance of enjoyment and well-being in the configuration of nation-spaces at any given time. This research foregrounds an African response to the global scholarship on the constitution of nation-spaces through the tropes of enjoyment and well-being.
Models and Analysis of Vocal Emissions for Biomedical Applications
The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the keenly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread into other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.
Dynamical and topological tools for (modern) music analysis
Is it possible to represent the horizontal motions of the melodic strands of a contrapuntal composition, or the main ideas of a jazz standard, as mathematical entities? In this work, we suggest a collection of novel models for the representation of music that are endowed with two main features. First, they originate from a topological and geometrical inspiration; second, their low dimensionality allows us to build simple and informative visualisations.
Here, we tackle the problem of music representation following three non-orthogonal directions. We suggest a formalisation of the concept of voice leading (the assignment of an instrument to each voice in a sequence of chords) suggesting a horizontal viewpoint on music, constituted by the simultaneous motions of superposed melodies. This formalisation naturally leads to the interpretation of counterpoint as a multivariate time series of partial permutation matrices, whose observations are characterised by a degree of complexity. After providing both a static and a dynamic representation of counterpoint, voice leadings are reinterpreted as a special class of partial singular braids (paths in the Euclidean space), and their main features are visualised as geometric configurations of collections of 3-dimensional strands.
Thereafter, we neglect this time-related information, in order to reduce the problem to the study of vertical musical entities. The model we propose is derived from a topological interpretation of the Tonnetz (a graph commonly used in computational musicology) and the deformation of its vertices induced by a harmonic and a consonance-oriented function, respectively. The 3-dimensional shapes derived from these deformations are classified using the formalism of persistent homology. This powerful topological technique allows us to compute a fingerprint of a shape that reflects its persistent geometrical and topological properties. Furthermore, it is possible to compute a distance between these fingerprints and hence study their hierarchical organisation. This particular feature allows us to tackle the problem of automatic classification of music in an innovative way. Thus, this novel representation of music is evaluated on a collection of heterogeneous musical datasets.
Finally, a combination of the two aforementioned approaches is proposed. A model at the crossroads between the signal and symbolic analysis of music uses multiple sequence alignment to provide an encompassing, novel viewpoint on the musical inspiration transfer among compositions belonging to different artists, genres and time. To conclude, we shall represent music as a time series of topological fingerprints, whose metric nature allows us to compare pairs of time-varying shapes in both topological and musical terms. In particular, the dissimilarity scores computed by aligning such sequences shall be applied both to the analysis and classification of music.
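The abstract above interprets counterpoint as a time series of partial permutation matrices. The thesis's exact construction is not given here, so the following is only one plausible reading, offered as a sketch: each chord is reduced to pitch classes, and a matrix entry (p, q) is set to 1 when some voice moves from pitch class p to pitch class q. A partial permutation matrix is a 0/1 matrix with at most one 1 in every row and every column. The function names and the choice of a 12 x 12 pitch-class matrix are assumptions of this example.

```python
from typing import List, Sequence

Matrix = List[List[int]]


def voice_leading_matrix(
    source: Sequence[int], target: Sequence[int], size: int = 12
) -> Matrix:
    """Encode a voice leading between two chords as a size x size 0/1 matrix.

    source[k] and target[k] are the MIDI pitches of voice k before and
    after the move; pitches are reduced modulo `size` (pitch classes).
    """
    if len(source) != len(target):
        raise ValueError("both chords must have the same number of voices")
    matrix = [[0] * size for _ in range(size)]
    for before, after in zip(source, target):
        matrix[before % size][after % size] = 1
    return matrix


def is_partial_permutation(matrix: Matrix) -> bool:
    """Check: entries are 0/1 with at most one 1 per row and per column."""
    binary = all(value in (0, 1) for row in matrix for value in row)
    rows_ok = all(sum(row) <= 1 for row in matrix)
    cols_ok = all(sum(col) <= 1 for col in zip(*matrix))
    return binary and rows_ok and cols_ok
```

Under this encoding, a move from C major (60, 64, 67) to F major (60, 65, 69) yields a partial permutation matrix, whereas two voices converging on the same pitch class do not, which suggests that a fuller model needs a way to handle such cases.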