    Progressive Filtering Using Multiresolution Histograms for Query by Humming System

    The rising availability of digital music calls for effective categorization and retrieval methods. Real-world scenarios involve very large music collections containing songs that are both relevant and irrelevant to the user's input. The primary goal of this work is to counterbalance the harmful impact of irrelevant songs through Progressive Filtering (PF) in a Query by Humming (QBH) system. PF is a problem-solving technique that works on a progressively reduced search space. This paper presents the concept of PF and an efficient design based on Multi-Resolution Histograms (MRH) that performs the search in multiple stages. Initially, the entire music database is searched to obtain a high recall rate and a narrowed search space. Later steps perform a slower search within the reduced space and achieve additional accuracy. Experiments on a large music database using recursive programming substantiate the potential of the method. The results indicate that MRH locate the patterns effectively. Distances between MRH at a lower level are lower bounds of the distances at a higher level, which guarantees that no false dismissals occur during PF. The proposed method thus helps to strike a balance between efficiency and effectiveness. As an added advantage, the system is scalable to large music retrieval systems and data driven for performance optimization. Comment: 12 Pages, 6 Figures, Full version of the paper published at ICMCCA-2012 with the same title, Link: http://link.springer.com/chapter/10.1007/978-81-322-1143-3_2
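
The lower-bound property described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the histogram features or the distance measure, so the L1 distance, the pairwise bin merging, and all names (`coarsen`, `pyramid`, `progressive_filter`, `radius`) are assumptions made for illustration, not the authors' implementation. Merging adjacent bins can only shrink an L1 distance, so a candidate rejected at a coarse level could never have passed at the finest level.

```python
from typing import Dict, List, Sequence


def coarsen(hist: Sequence[float]) -> List[float]:
    """Halve the resolution by summing adjacent bin pairs (even length assumed)."""
    return [hist[i] + hist[i + 1] for i in range(0, len(hist), 2)]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two histograms of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pyramid(hist: Sequence[float], levels: int) -> List[List[float]]:
    """Return the histogram at every resolution, ordered coarsest to finest."""
    out = [list(hist)]
    for _ in range(levels - 1):
        out.append(coarsen(out[-1]))
    return out[::-1]


def progressive_filter(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    radius: float,
    levels: int = 3,
) -> List[str]:
    """Keep only items within `radius` of the query, pruning coarse levels first.

    Because |a1 + a2 - b1 - b2| <= |a1 - b1| + |a2 - b2|, the coarse distance
    never exceeds the fine distance, so pruning causes no false dismissals.
    """
    q_pyr = pyramid(query, levels)
    pyramids = {name: pyramid(h, levels) for name, h in database.items()}
    candidates = list(database)
    for level in range(levels):
        candidates = [
            name
            for name in candidates
            if l1_distance(q_pyr[level], pyramids[name][level]) <= radius
        ]
    return candidates
```

The histogram length must be divisible by `2 ** (levels - 1)`. The result equals a brute-force range query at the finest level; the coarse levels only reduce how many fine-level comparisons are needed.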

    Making music through real-time voice timbre analysis: machine learning and timbral control

    PhD. People can achieve rich musical expression through vocal sound; see, for example, human beatboxing, which achieves a wide timbral variety through a range of extended techniques. Yet the vocal modality is under-exploited as a controller for music systems. If we can analyse a vocal performance suitably in real time, then this information could be used to create voice-based interfaces with the potential for intuitive and fulfilling levels of expressive control. Conversely, many modern techniques for music synthesis do not imply any particular interface. Should a given parameter be controlled via a MIDI keyboard, or a slider/fader, or a rotary dial? Automatic vocal analysis could provide a fruitful basis for expressive interfaces to such electronic musical instruments. The principal questions in applying vocal-based control are how to extract musically meaningful information from the voice signal in real time, and how to convert that information suitably into control data. In this thesis we address these questions, with a focus on timbral control, and in particular we develop approaches that can be used with a wide variety of musical instruments by applying machine learning techniques to automatically derive the mappings between expressive audio input and control output. The vocal audio signal is construed to include a broad range of expression, in particular encompassing the extended techniques used in human beatboxing. The central contribution of this work is the application of supervised and unsupervised machine learning techniques to automatically map vocal timbre to synthesiser timbre and controls. Component contributions include a delayed decision-making strategy for low-latency sound classification, a regression-tree method to learn associations between regions of two unlabelled datasets, a fast estimator of multidimensional differential entropy, and a qualitative method for evaluating musical interfaces based on discourse analysis.

    The Value of Seizure Semiology in Epilepsy Surgery: Epileptogenic-Zone Localisation in Presurgical Patients using Machine Learning and Semiology Visualisation Tool

    Background: Eight million individuals have focal drug-resistant epilepsy worldwide. If their epileptogenic focus is identified and resected, they may become seizure-free and experience significant improvements in quality of life. However, seizure-freedom occurs in less than half of surgical resections. Seizure semiology (the signs and symptoms during a seizure), along with brain imaging and electroencephalography (EEG), is amongst the mainstays of seizure localisation. Although there have been advances in algorithmic identification of abnormalities on EEG and imaging, semiological analysis has remained more subjective. The primary objective of this research was to investigate the localising value of clinician-identified semiology, and secondarily to improve personalised prognostication for epilepsy surgery. Methods: I data-mined retrospective hospital records to link semiology to outcomes. I trained machine learning models to predict temporal lobe epilepsy (TLE) and determine the value of semiology compared to a benchmark of hippocampal sclerosis (HS). Due to the hospital dataset being relatively small, we also collected data from a systematic review of the literature to curate an open-access Semio2Brain database. We built the Semiology-to-Brain Visualisation Tool (SVT) on this database and retrospectively validated SVT in two separate groups of randomly selected patients and individuals with frontal lobe epilepsy. Separately, a systematic review of multimodal prognostic features of epilepsy surgery was undertaken. The concept of a semiological connectome was devised and compared to structural connectivity to investigate probabilistic propagation and semiology generation. Results: Although a (non-chronological) list of patients’ semiologies did not improve localisation beyond the initial semiology, the list of semiologies added value when combined with an imaging feature.
The absolute added value of semiology in a support vector classifier in diagnosing TLE, compared to HS, was 25%. Semiology was, however, unable to predict postsurgical outcomes. To help future prognostic models, a list of essential multimodal prognostic features for epilepsy surgery was extracted from meta-analyses and a structural causal model proposed. Semio2Brain consists of over 13000 semiological datapoints from 4643 patients across 309 studies and uniquely enabled a Bayesian approach to localisation to mitigate TLE publication bias. SVT performed well in a retrospective validation, matching the best expert clinician’s localisation scores and exceeding them for lateralisation, and showed modest value in localisation in individuals with frontal lobe epilepsy (FLE). There was a significant correlation between the number of connecting fibres between brain regions and the seizure semiologies that can arise from these regions. Conclusions: Semiology is valuable in localisation, but multimodal concordance is more valuable and highly prognostic. SVT could be suitable for use in multimodal models to predict the seizure focus.

    Signal Processing Methods for Music Synchronization, Audio Matching, and Source Separation

    The field of music information retrieval (MIR) aims at developing techniques and tools for organizing, understanding, and searching multimodal information in large music collections in a robust, efficient and intelligent manner. In this context, this thesis presents novel, content-based methods for music synchronization, audio matching, and source separation. In general, music synchronization denotes a procedure which, for a given position in one representation of a piece of music, determines the corresponding position within another representation. Here, the thesis presents three complementary synchronization approaches, which improve upon previous methods in terms of robustness, reliability, and accuracy. The first approach employs a late-fusion strategy based on multiple, conceptually different alignment techniques to identify those music passages that allow for reliable alignment results. The second approach is based on the idea of employing musical structure analysis methods in the context of synchronization to derive reliable synchronization results even in the presence of structural differences between the versions to be aligned. Finally, the third approach employs several complementary strategies for increasing the accuracy and time resolution of synchronization results. Given a short query audio clip, the goal of audio matching is to automatically retrieve all musically similar excerpts in different versions and arrangements of the same underlying piece of music. In this context, chroma-based audio features are a well-established tool as they possess a high degree of invariance to variations in timbre. This thesis describes a novel procedure for making chroma features even more robust to changes in timbre while keeping their discriminative power. Here, the idea is to identify and discard timbre-related information using techniques inspired by the well-known MFCC features, which are usually employed in speech processing. 
Given a monaural music recording, the goal of source separation is to extract musically meaningful sound sources corresponding, for example, to a melody, an instrument, or a drum track from the recording. To facilitate this complex task, one can exploit additional information provided by a musical score. Based on this idea, this thesis presents two novel, conceptually different approaches to source separation. Using score information provided by a given MIDI file, the first approach employs a parametric model to describe a given audio recording of a piece of music. The resulting model is then used to extract sound sources as specified by the score. As a computationally less demanding and easier to implement alternative, the second approach employs the additional score information to guide a decomposition based on non-negative matrix factorization (NMF).
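
One common way a score can guide an NMF decomposition is sketched below. The abstract does not say how the guidance is applied, so everything here is an assumption: the activation matrix is initialised to zero wherever a hypothetical `score_mask` says a note is silent, and standard multiplicative updates for the Euclidean cost are used, which keep those zeros at zero. The names `score_informed_nmf`, `score_mask`, and `frobenius_error` are illustrative only.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]
EPS = 1e-9  # guards against division by zero in the updates


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def frobenius_error(v: Matrix, w: Matrix, h: Matrix) -> float:
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def score_informed_nmf(
    v: Matrix, score_mask: List[List[int]], iterations: int = 200, seed: int = 0
) -> Tuple[Matrix, Matrix]:
    """Factorise V (bins x frames) into W (bins x notes) and H (notes x frames).

    score_mask[k][t] is 1 if note k may sound in frame t according to the
    score and 0 otherwise. Masked activations start at zero and stay there,
    because multiplicative updates only rescale the current value.
    """
    rng = random.Random(seed)
    n_bins, n_frames, n_notes = len(v), len(v[0]), len(score_mask)
    w = [[rng.uniform(0.1, 1.0) for _ in range(n_notes)] for _ in range(n_bins)]
    h = [
        [rng.uniform(0.1, 1.0) * score_mask[k][t] for t in range(n_frames)]
        for k in range(n_notes)
    ]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [
            [h[k][t] * num[k][t] / (den[k][t] + EPS) for t in range(n_frames)]
            for k in range(n_notes)
        ]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [
            [w[f][k] * num[f][k] / (den[f][k] + EPS) for k in range(n_notes)]
            for f in range(n_bins)
        ]
    return w, h
```

Each column of `w` then acts as a spectral template for one score note, and the corresponding row of `h` gives its activation over time. A practical system would use a numerical library and a real magnitude spectrogram rather than nested lists.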

    A computational framework for sound segregation in music signals

    Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia. Universidade do Porto. 200

    Essentials of Business Analytics

    The nation thing? ‘Enjoyment and well-being’ in the production of cultural spaces in Zimbabwean literature

    A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in English Studies, University of the Witwatersrand, Johannesburg, 2018. My thesis develops a frame of understanding of the aesthetic, cultural well-being and enjoyment spaces in Zimbabwean literature through a close reading of selected texts (both fiction and autobiographical narratives). I argue that Zimbabwean literature demonstrates the expanse of ‘enjoyment’ beyond the material. Findings from my analysis establish the position that people create and flourish in enjoyable identity-conferring spaces of their own choice, and generate an intercultural mosaic in the process. Unlike the standard criticism of Zimbabwean literature which focuses on wars, trauma, memory, interminable gender struggles, binaries of the city and country, dispossessions and repossessions, the “Zimbabwean crisis” and the diaspora, I explore enjoyment and well-being. I establish that the intercultural nature of “enjoyment and well-being” spaces designates a fractured cosmopolitanism in which classificatory variables like gender, race and ethnicity are problematised. I interrogate enjoyment and well-being that is predicated on the pain and suffering of a scripted and choked “Other”, whichever name that individual may be called: stranger, alien, refugee, migrant and settler among some “Othering” concepts. The subject that Zimbabwean literature establishes has the capacity to enjoy in multifarious ways that which fosters intersubjectivity. People from diverse backgrounds, sexes, and ethnicities find joy, happiness and pleasure in various spaces of interaction or “contact zones” which are identity-conferring. The research foregrounds cultural sites in four parts: the land, the body, city spaces and diaspora spaces.
Part one considers land as the locus of analysis in the explication of Zimbabwean subjectivities, since land is often deployed as the “discursive threshold” after the post-2000 land invasions/reforms. I establish the paucity and inadequacy of a conceptualisation of land that derives identities from binaries that designate the Self and Other. I proceed to explore rhythms and textures in nature to demonstrate the richness and inter-subjectivity in the way land, animals, the cosmos and human life are intertwined. Part two demonstrates that the individual body is at the centre of generating its own data and negotiating meanings as physical sensations are expanded to inter-human sensation, contrary to the nation-state’s concept of fashioning subjects. Part three considers city spaces as rendering the atmosphere and environment for subject enjoyment, well-being and authenticity. Beyond and above the sites that are bound by territorial borders, I argue that Zimbabwean subjects enjoy transcendental and diaspora spaces. Part four explores transcendental spaces of enjoyment and well-being at the level of both the individual human mind and nation-spaces. The rise of cellphones, the iPad, computers and the semiotics of the big and small screens introduce a mind that is able to transcend the exigencies of place through memorialisation, imagi(ni)ng, pictures, ritual and religion. Texts explored in part four demonstrate that people are able to negotiate spaces and places through remediation and travel, both physically and metaphorically, thus breaching the territorial borders of the nation-state. The study suggests that nation-states, to maintain legitimacy, should create and sustain climates for various orders of joy, enjoyment and pleasure, and should desist from dictating the way people enjoy. The research underscores the importance of enjoyment and well-being in the configuration of nation-spaces at any given time.
This research foregrounds an African response to the global scholarship on the constitution of nation-spaces through the tropes of enjoyment and well-being. XL201

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.

    Dynamical and topological tools for (modern) music analysis

    Is it possible to represent the horizontal motions of the melodic strands of a contrapuntal composition, or the main ideas of a jazz standard, as mathematical entities? In this work, we suggest a collection of novel models for the representation of music that are endowed with two main features. First, they originate from a topological and geometrical inspiration; second, their low dimensionality makes it possible to build simple and informative visualisations. Here, we tackle the problem of music representation following three non-orthogonal directions. We suggest a formalisation of the concept of voice leading (the assignment of an instrument to each voice in a sequence of chords), suggesting a horizontal viewpoint on music, constituted by the simultaneous motions of superposed melodies. This formalisation naturally leads to the interpretation of counterpoint as a multivariate time series of partial permutation matrices, whose observations are characterised by a degree of complexity. After providing both a static and a dynamic representation of counterpoint, voice leadings are reinterpreted as a special class of partial singular braids (paths in the Euclidean space), and their main features are visualised as geometric configurations of collections of 3-dimensional strands. Thereafter, we neglect this time-related information in order to reduce the problem to the study of vertical musical entities. The model we propose is derived from a topological interpretation of the Tonnetz (a graph commonly used in computational musicology) and the deformation of its vertices induced by a harmonic and a consonance-oriented function, respectively. The 3-dimensional shapes derived from these deformations are classified using the formalism of persistent homology. This powerful topological technique makes it possible to compute a fingerprint of a shape that reflects its persistent geometrical and topological properties.
Furthermore, it is possible to compute a distance between these fingerprints and hence study their hierarchical organisation. This particular feature allows us to tackle the problem of automatic classification of music in an innovative way. Thus, this novel representation of music is evaluated on a collection of heterogeneous musical datasets. Finally, a combination of the two aforementioned approaches is proposed. A model at the crossroads between the signal and symbolic analysis of music uses multiple sequence alignment to provide an encompassing, novel viewpoint on the transfer of musical inspiration among compositions belonging to different artists, genres and times. To conclude, we represent music as a time series of topological fingerprints, whose metric nature makes it possible to compare pairs of time-varying shapes in both topological and musical terms. In particular, the dissimilarity scores computed by aligning such sequences are applied both to the analysis and to the classification of music.
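
The idea of scoring dissimilarity by aligning two time series of fingerprints can be sketched with dynamic time warping (DTW). This is a swapped-in, generic alignment technique chosen for illustration; the abstract does not name the alignment algorithm beyond multiple sequence alignment, and it does not specify the fingerprint distance. The function name `dtw_dissimilarity` and the `dist` callback are hypothetical; `dist` stands in for whatever metric is defined between two fingerprints.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def dtw_dissimilarity(
    a: Sequence[T], b: Sequence[T], dist: Callable[[T, T], float]
) -> float:
    """Minimal total cost of aligning sequence a to sequence b.

    cost[i][j] holds the cheapest alignment of the first i items of a with
    the first j items of b; each step may advance in a, in b, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

With scalar placeholders for fingerprints, a call might look like `dtw_dissimilarity([1, 2, 3], [1, 2, 2, 3], lambda x, y: abs(x - y))`. Pairwise scores of this kind could then feed a hierarchical clustering or a nearest-neighbour classifier.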