110 research outputs found

    Extracting expressive performance information from recorded music

    Thesis (M.S.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1995. Includes bibliographical references (leaves 55-56). By Eric David Scheirer. M.S.

    Automatic Music Transcription as We Know it Today


    Audio source separation for music in low-latency and high-latency scenarios

    This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context. We compare it to existing techniques in pitch estimation and tracking tasks, which are crucial steps in many separation methods. We then use and evaluate the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals.
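As a rough illustration of the Tikhonov-regularized spectrum decomposition mentioned in the abstract above, the sketch below solves for the gains of a fixed set of spectral templates in closed form. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function name `tikhonov_gains`, the random toy `basis`, and the value of `lam` are hypothetical and not taken from the thesis.

```python
import numpy as np

def tikhonov_gains(basis, spectrum, lam=0.1):
    """Solve min_g ||basis @ g - spectrum||^2 + lam * ||g||^2 in closed form.

    basis:    (n_bins, n_components) matrix of spectral templates (assumed given)
    spectrum: (n_bins,) observed magnitude spectrum of one frame
    lam:      regularization weight (hypothetical value)
    """
    n_components = basis.shape[1]
    # Regularized normal equations: (B^T B + lam * I) g = B^T x
    gram = basis.T @ basis + lam * np.eye(n_components)
    return np.linalg.solve(gram, basis.T @ spectrum)

# Toy data: four random nonnegative templates, two of them active.
rng = np.random.default_rng(0)
basis = np.abs(rng.normal(size=(64, 4)))
true_gains = np.array([0.0, 1.5, 0.0, 0.7])
spectrum = basis @ true_gains

# A very small lam keeps the estimate close to the least-squares solution.
gains = tikhonov_gains(basis, spectrum, lam=1e-6)
```

Because the system matrix depends only on the templates and `lam`, it could be factorized once and reused for every frame, which is one plausible reason a closed-form solve suits a low-latency setting.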

    A review of automatic drum transcription

    In Western popular music, drums and percussion are an important means to emphasize and shape the rhythm, often defining the musical style. If computers were able to analyze the drum part in recorded music, it would enable a variety of rhythm-related music processing tasks. In particular, the detection and classification of drum sound events by computational methods is considered an important and challenging research problem in the broader field of Music Information Retrieval. Over the last two decades, several authors have attempted to tackle this problem under the umbrella term Automatic Drum Transcription (ADT). This paper presents a comprehensive review of ADT research, including a thorough discussion of the task-specific challenges, categorization of existing techniques, and evaluation of several state-of-the-art systems. To provide more insight into the practice of ADT systems, we focus on two families of ADT techniques, namely methods based on Nonnegative Matrix Factorization and Recurrent Neural Networks. We explain the methods' technical details and drum-specific variations and evaluate these approaches on publicly available datasets with a consistent experimental setup. Finally, the open issues and under-explored areas in ADT research are identified and discussed, providing future directions in this field.
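To make the Nonnegative Matrix Factorization family named in the abstract above more concrete, here is a minimal sketch of one common variant: drum templates are held fixed and only the activations are estimated with multiplicative updates for the Euclidean cost. It is not the method of any system evaluated in the paper. The template matrix `W`, the toy hit pattern, and the function name `nmf_activations` are hypothetical.

```python
import numpy as np

def nmf_activations(V, W, n_iter=500, eps=1e-9):
    """Estimate nonnegative activations H so that V is approximately W @ H.

    V: (n_bins, n_frames) magnitude spectrogram
    W: (n_bins, n_drums) fixed spectral templates (assumed known in advance)
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update for the Euclidean cost; H stays nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Hypothetical templates over 6 frequency bins: kick (low), snare (mid), hi-hat (high).
W = np.array([
    [1.0, 0.1, 0.0],
    [0.6, 0.4, 0.0],
    [0.1, 1.0, 0.1],
    [0.0, 0.5, 0.4],
    [0.0, 0.2, 1.0],
    [0.0, 0.1, 0.8],
])

# Toy pattern over 8 frames: kick on frames 0 and 4, snare on 2 and 6, hi-hat on every frame.
H_true = np.zeros((3, 8))
H_true[0, [0, 4]] = 1.0
H_true[1, [2, 6]] = 0.8
H_true[2, :] = 0.5
V = W @ H_true

H = nmf_activations(V, W)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In a transcription setting, each row of `H` would then be peak-picked to obtain onset times for the corresponding drum. That step is omitted here.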

    Triple Synthesis

    This thesis investigates the result of merging three musical approaches (jazz fusion, breakbeat/IDM and Electronic Dance Music) and their respective methodologies as applied to music composition. It is presented in a progressive manner. Chapters 2 to 4 identify and discuss each of the three styles separately in terms of the research undertaken in the preparation of this thesis. Chapter 2 discusses, through a close examination of selected compositions and recordings, both Weather Report and Herbie Hancock as representing source material for research and compositional study in terms of melody, harmony and orchestration from the 1970s jazz-fusion genre. Chapter 3 examines breakbeat and Intelligent Dance Music (IDM) drum rhythm programming through both technique and musical application. Chapter 4 presents an examination of selected contemporary Electronic Dance Music (EDM) techniques and discusses their importance in current electronic music styles. Chapters 5, 6 and 7 each present an original composition based on the application and synthesis of the styles and techniques explored in the previous three chapters, with each composition defined by proportions of influence from each of the three styles as in the Venn diagram shown in the introduction. Since the musical context of the original compositions is software oriented, diagrams and computer screenshots are used in addition to conventional score notation in order to highlight details of musical examples and techniques. The final chapter discusses the conclusions drawn from the thesis research and the result of this “synthesis” style of composition.

    Towards the automated analysis of simple polyphonic music : a knowledge-based approach

    PhD. Music understanding is a process closely related to the knowledge and experience of the listener. The amount of knowledge required is relative to the complexity of the task in hand. This dissertation is concerned with the problem of automatically decomposing musical signals into a score-like representation. It proposes that, as with humans, an automatic system requires knowledge about the signal and its expected behaviour to correctly analyse music. The proposed system uses the blackboard architecture to combine the use of knowledge with data provided by the bottom-up processing of the signal's information. Methods are proposed for the estimation of pitches, onset times and durations of notes in simple polyphonic music. A method for onset detection is presented. It provides an alternative to conventional energy-based algorithms by using phase information. Statistical analysis is used to create a detection function that evaluates the expected behaviour of the signal regarding onsets. Two methods for multi-pitch estimation are introduced. The first concentrates on the grouping of harmonic information in the frequency domain. Its performance and limitations emphasise the case for the use of high-level knowledge. This knowledge, in the form of the individual waveforms of a single instrument, is used in the second proposed approach. The method is based on a time-domain linear additive model and it presents an alternative to common frequency-domain approaches. Results are presented and discussed for all methods, showing that, if reliably generated, the use of knowledge can significantly improve the quality of the analysis. Joint Information Systems Committee (JISC) in the UK; National Science Foundation (N.S.F.) in the United States; Fundacion Gran Mariscal Ayacucho in Venezuela.

    Electronics, music and computers

    Technical report. Electronic and computer technology has had and will continue to have a marked effect in the field of music. Through the years scientists, engineers, and musicians have applied available technology to new musical instruments, innovative musical sound production, sound analysis, and musicology. At the University of Utah we have designed and are implementing a communication network involving an electronic organ and a small computer to provide a tool to be used in music performance, the learning of music theory, the investigation of music notation, the composition of music, the perception of music, and the printing of music.

    Automated Rhythmic Transformation of Drum Recordings

    Within the creative industries, music information retrieval techniques are now being applied in a variety of music creation and production applications. Audio artists incorporate techniques from music informatics and machine learning (e.g., beat and metre detection) for generative content creation and manipulation systems within the music production setting. Here musicians, desiring a certain sound or aesthetic influenced by the style of artists they admire, may change or replace the rhythmic pattern and sound characteristics (i.e., timbre) of drums in their recordings with those from an idealised recording (e.g., in processes of redrumming and mashup creation). Automated transformation systems for rhythm and timbre can be powerful tools for music producers, allowing them to quickly and easily adjust the different elements of a drum recording to fit the overall style of a song. The aim of this thesis is to develop systems for automated transformation of rhythmic patterns of drum recordings using a subset of techniques from deep learning called deep generative models (DGM) for neural audio synthesis. DGMs such as autoencoders and generative adversarial networks have been shown to be effective for transforming musical signals in a variety of genres as well as for learning the underlying structure of datasets for generation of new audio examples. To this end, modular deep learning-based systems are presented in this thesis with evaluations which measure the extent of the rhythmic modifications generated by different modes of transformation, which include audio style transfer, drum translation and latent space manipulation. The evaluation results underscore both the strengths and constraints of DGMs for transformation of rhythmic patterns as well as neural synthesis of drum sounds within a variety of musical genres. New audio style transfer (AST) functions were specifically designed for mashup-oriented drum recording transformation. 
The designed loss objectives lowered the computational demands of the AST algorithm and offered rhythmic transformation capabilities which adhere to a larger rhythmic structure of the input to generate music that is both creative and realistic. To extend the transformation possibilities of DGMs, systems based on adversarial autoencoders (AAE) were proposed for drum translation and continuous rhythmic transformation of bar-length patterns. The evaluations which investigated the lower dimensional representations of the latent space of the proposed system based on AAEs with a Gaussian mixture prior (AAE-GM) highlighted the importance of the structure of the disentangled latent distributions of AAE-GM. Furthermore, the proposed system demonstrated improved performance, as evidenced by higher reconstruction metrics, when compared to traditional autoencoder models. This implies that the system can more accurately recreate complex drum sounds, ensuring that the produced rhythmic transformation maintains richness of the source material. For music producers, this means heightened fidelity in drum synthesis and the potential for more expressive and varied drum tracks, enhancing the creativity in music production. This work also enhances neural drum synthesis by introducing a new, diverse dataset of kick, snare, and hi-hat drum samples, along with multiple drum loop datasets for model training and evaluation. Overall, the work in this thesis increased the profile of the field and hopefully will attract more attention and resources to the area, which will help drive future research and development of neural rhythmic transformation systems