804 research outputs found

    DDSP-Piano: A Neural Sound Synthesizer Informed by Instrument Knowledge

    Instrument sound synthesis using deep neural networks has received numerous improvements over the last couple of years. Among them, the Differentiable Digital Signal Processing (DDSP) framework has modernized the spectral modeling paradigm by including signal-based synthesizers and effects into fully differentiable architectures. The present work extends the applications of DDSP to the task of polyphonic sound synthesis, with the proposal of a differentiable piano synthesizer conditioned on MIDI inputs. The model architecture is motivated by high-level acoustic modeling knowledge of the instrument, which, along with the sound structure priors inherent to the DDSP components, makes for a lightweight, interpretable, and realistic-sounding piano model. A subjective listening test has revealed that the proposed approach achieves better sound quality than a state-of-the-art neural-based piano synthesizer, but physical-modeling-based models still hold the best quality. Leveraging its interpretability and modularity, a qualitative analysis of the model behavior was also conducted: it highlights where additional modeling knowledge and optimization procedures could be inserted in order to improve the synthesis quality and the manipulation of sound properties. Finally, the proposed differentiable synthesizer can be further used with other deep learning models for alternative musical tasks handling polyphonic audio and symbolic data.
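
    To make the "signal-based synthesizer conditioned on MIDI inputs" idea concrete, the following is a minimal, non-differentiable sketch of polyphonic additive synthesis in NumPy. It is not the DDSP-Piano architecture: the function names, the sample rate, the 1/k partial amplitudes, the exponential decay, and the stiff-string inharmonicity coefficient are all illustrative assumptions, and a real DDSP model would predict such parameters with a neural network.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumed sample rate in Hz, chosen for illustration


def midi_to_hz(note):
    """Convert a MIDI note number to frequency (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def synth_note(note, duration, n_partials=8, inharmonicity=1e-4, decay=3.0):
    """Sum of decaying sinusoidal partials for one note (hypothetical helper).

    Partial frequencies follow the stiff-string formula
    f_k = k * f0 * sqrt(1 + B * k^2); B, the 1/k amplitudes and the
    decay rate are hand-picked assumptions, not learned values.
    """
    t = np.arange(int(duration * SAMPLE_RATE)) / SAMPLE_RATE
    k = np.arange(1, n_partials + 1)
    freqs = k * midi_to_hz(note) * np.sqrt(1.0 + inharmonicity * k**2)
    # Silence partials above the Nyquist frequency to avoid aliasing.
    amps = (1.0 / k) * (freqs < SAMPLE_RATE / 2)
    # Higher partials decay faster in this toy envelope.
    env = np.exp(-decay * k[:, None] * t[None, :])
    partials = amps[:, None] * env * np.sin(2 * np.pi * freqs[:, None] * t[None, :])
    return partials.sum(axis=0)


def synth_chord(notes, duration):
    """Polyphony as a plain sum of per-note signals, peak-normalized."""
    mix = sum(synth_note(n, duration) for n in notes)
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 0 else mix


# Usage: half a second of a C major triad (MIDI notes 60, 64, 67).
audio = synth_chord([60, 64, 67], 0.5)
```

    Because every step here is a smooth array operation, the same structure could in principle be written in an autodiff framework so that the partial amplitudes, decay, and inharmonicity are optimized from audio, which is the property the abstract attributes to DDSP components.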

    A computational framework for aesthetical navigation in musical search space

    Paper presented at 3rd AISB symposium on computational creativity, AISB 2016, 4-6th April, Sheffield. Abstract. This article addresses aspects of an ongoing project in the generation of artificial Persian (-like) music. Liquid Persian Music software (LPM) is a cellular automata based audio generator. In this paper LPM is discussed from the viewpoint of the future potential of algorithmic composition and creativity. Liquid Persian Music is a creative tool, enabling exploration of emergent audio through new dimensions of music composition. Various configurations of the system produce different voices which resemble musical motives in many respects. Aesthetical measurements are determined by Zipf’s law in an evolutionary environment. Arranging these voices together to produce a musical corpus can be considered as a search problem in the LPM output space of musical possibilities. On this account, the issues involved in defining the search space for LPM are studied throughout this paper.
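
    As a rough illustration of how a Zipf's-law measurement could be applied to cellular automaton output, the sketch below runs an elementary (one-dimensional, two-state) automaton, turns each generation into a symbolic event, and fits the slope of the rank-frequency distribution on log-log axes. A slope near -1 corresponds to an ideal Zipfian distribution. This is not the LPM implementation: the choice of elementary rules, the mapping from a row to an event (its live-cell count), and all function names are assumptions made for the example.

```python
from collections import Counter

import numpy as np


def step(cells, rule):
    """Advance an elementary CA one generation (wrap-around boundary)."""
    left = np.roll(cells, 1)    # left[i] == cells[i - 1]
    right = np.roll(cells, -1)  # right[i] == cells[i + 1]
    idx = 4 * left + 2 * cells + right  # neighbourhood as a 3-bit number
    table = np.array([(rule >> i) & 1 for i in range(8)])
    return table[idx]


def run_ca(rule, width=64, steps=256):
    """Evolve from a single live cell; returns an array of shape (steps, width)."""
    cells = np.zeros(width, dtype=int)
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps - 1):
        cells = step(cells, rule)
        rows.append(cells)
    return np.array(rows)


def zipf_slope(events):
    """Least-squares slope of log(frequency) against log(rank)."""
    counts = sorted(Counter(events).values(), reverse=True)
    if len(counts) < 2:
        return float("nan")  # a slope needs at least two distinct events
    ranks = np.arange(1, len(counts) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(counts), 1)
    return slope


# Usage: treat the live-cell count of each generation as one "event"
# (an assumed mapping) and score rule 110 by its distance from slope -1.
rows = run_ca(rule=110)
events = [int(row.sum()) for row in rows]
fitness = -abs(zipf_slope(events) + 1.0)
```

    In an evolutionary setting, a score such as `fitness` could rank candidate configurations, with the search space being the set of rules and initial conditions; how LPM actually defines that space is the subject of the paper, not of this sketch.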

    Foley Music: Learning to Generate Music from Videos

    In this paper, we introduce Foley Music, a system that can synthesize plausible music for a silent video clip about people playing musical instruments. We first identify two key intermediate representations for a successful video-to-music generator: body keypoints from videos and MIDI events from audio recordings. We then formulate music generation from videos as a motion-to-MIDI translation problem. We present a Graph-Transformer framework that can accurately predict MIDI event sequences in accordance with the body movements. The MIDI events can then be converted to realistic music using an off-the-shelf music synthesizer tool. We demonstrate the effectiveness of our models on videos containing a variety of music performances. Experimental results show that our model outperforms several existing systems in generating music that is pleasant to listen to. More importantly, the MIDI representations are fully interpretable and transparent, thus enabling us to perform music editing flexibly. We encourage the readers to watch the demo video with audio turned on to experience the results. Comment: ECCV 2020. Project page: http://foley-music.csail.mit.ed

    Triple Synthesis

    This thesis investigates the result of merging three musical approaches (jazz fusion, breakbeat/IDM and Electronic Dance Music) and their respective methodologies as applied to music composition. It is presented in a progressive manner. Chapters 2 to 4 identify and discuss each of the three styles separately in terms of the research undertaken in the preparation of this thesis. Chapter 2 discusses, through a close examination of selected compositions and recordings, both Weather Report and Herbie Hancock as representing source material for research and compositional study in terms of melody, harmony and orchestration from the 1970s jazz-fusion genre. Chapter 3 examines breakbeat and Intelligent Dance Music (IDM) drum rhythm programming through both technique and musical application. Chapter 4 presents an examination of selected contemporary Electronic Dance Music (EDM) techniques and discusses their importance in current electronic music styles. Chapters 5, 6 and 7 each present an original composition based on the application and synthesis of the styles and techniques explored in the previous three chapters, with each composition defined by proportions of influence from each of the three styles as in the Venn diagram shown in the introduction. Since the musical context of the original compositions is software oriented, diagrams and computer screenshots are used in addition to conventional score notation in order to highlight details of musical examples and techniques. The final chapter discusses the conclusions drawn from the thesis research and the result of this “synthesis” style of composition.

    Perception of interactive vibrotactile cues on the acoustic grand and upright piano

    An experiment has been conducted, measuring pianists’ sensitivity to piano key vibrations at the fingers while playing an upright or a grand Yamaha Disklavier piano. At each trial, which consisted of playing loud and long A notes across the whole keyboard, vibrations were either present or absent through setting the Disklavier pianos to normal or quiet mode. Sound feedback was always provided by a MIDI-controlled piano synthesizer via isolating ear/headphones, which masked the acoustic sound in normal mode. In partial disagreement with the existing literature, our results suggest that significant vibrotactile cues are produced in the lower range of the piano keyboard, with perceptual cut-off around the middle octave. Possible psychophysical mechanisms supporting the existence of such cues are additionally discussed.