3,167 research outputs found
Methodological considerations concerning manual annotation of musical audio in function of algorithm development
In research on musical audio-mining, annotated music databases are needed to develop computational tools that extract from the musical audio stream the kind of high-level content that users can deal with in Music Information Retrieval (MIR) contexts. However, the notion of musical content, and therefore the notion of annotation, is ill-defined in both the syntactic and the semantic sense. As a consequence, annotation has been approached from a variety of perspectives (mainly linguistic-symbolic ones), and a general methodology is lacking. This paper is a step towards a general framework for manual annotation of musical audio in support of a computational approach to musical audio-mining that is based on algorithms that learn from annotated data.
JamBot: Music Theory Aware Chord Based Generation of Polyphonic Music with LSTMs
We propose a novel approach for the generation of polyphonic music based on
LSTMs. We generate music in two steps. First, a chord LSTM predicts a chord
progression based on a chord embedding. A second LSTM then generates polyphonic
music from the predicted chord progression. The generated music sounds pleasing
and harmonic, with only a few dissonant notes. It has a clear long-term structure
that is similar to what a musician would play during a jam session. We show
that our approach is sensible from a music theory perspective by evaluating the
learned chord embeddings. Surprisingly, our simple model managed to extract the
circle of fifths, an important tool in music theory, from the dataset.
Comment: Paper presented at the 29th International Conference on Tools with
Artificial Intelligence, ICTAI 2017, Boston, MA, US
MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation
Pre-trained language models have achieved impressive results in various music
understanding and generation tasks. However, existing pre-training methods for
symbolic melody generation struggle to capture multi-scale, multi-dimensional
structural information in note sequences, due to the domain knowledge
discrepancy between text and music. Moreover, the lack of available large-scale
symbolic melody datasets limits the pre-training improvement. In this paper, we
propose MelodyGLM, a multi-task pre-training framework for generating melodies
with long-term structure. We design the melodic n-gram and long span sampling
strategies to create local and global blank infilling tasks for modeling the
local and global structures in melodies. Specifically, we incorporate pitch
n-grams, rhythm n-grams, and their combined n-grams into the melodic n-gram
blank infilling tasks for modeling the multi-dimensional structures in
melodies. To this end, we have constructed a large-scale symbolic melody
dataset, MelodyNet, containing more than 0.4 million melody pieces. MelodyNet
is utilized for large-scale pre-training and domain-specific n-gram lexicon
construction. Both subjective and objective evaluations demonstrate that
MelodyGLM surpasses the standard and previous pre-training methods. In
particular, subjective evaluations show that, on the melody continuation task,
MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in
consistency, rhythmicity, structure, and overall quality, respectively.
Notably, MelodyGLM nearly matches the quality of human-composed melodies on the
melody inpainting task.
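The n-gram blank infilling idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_ngrams`, the `[MASK]` token, the toy lexicon, and the deterministic longest-match rule are all assumptions made for illustration (the abstract describes sampling strategies, which are not reproduced here). The token stream could stand for pitches, durations, or combined pitch-rhythm symbols.

```python
MASK = "[MASK]"  # hypothetical blank token, not taken from the paper


def mask_ngrams(tokens, lexicon, max_n=4):
    """Blank out every non-overlapping span of `tokens` that appears in `lexicon`.

    Scans left to right and prefers the longest match (up to `max_n` tokens).
    Returns the corrupted sequence and the list of spans a model would have
    to fill back in.
    """
    corrupted, targets = [], []
    i = 0
    while i < len(tokens):
        # Try the longest candidate n-gram first, down to length 2.
        for n in range(min(max_n, len(tokens) - i), 1, -1):
            span = tuple(tokens[i:i + n])
            if span in lexicon:
                corrupted.append(MASK)
                targets.append(list(span))
                i += n
                break
        else:
            # No lexicon entry starts here: keep the token unchanged.
            corrupted.append(tokens[i])
            i += 1
    return corrupted, targets


# Toy example: MIDI pitches with one assumed lexicon entry (C-D-E).
pitches = [60, 62, 64, 65, 67, 69, 60, 62, 64]
lexicon = {(60, 62, 64)}
corrupted, targets = mask_ngrams(pitches, lexicon)
print(corrupted)
print(targets)
```

Running the same function separately on a pitch stream and on a duration stream would give pitch-only and rhythm-only blanks; how the paper combines the two dimensions is not specified in the abstract.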
A fuzzy rule model for high level musical features on automated composition systems
Algorithmic composition systems are now well understood. However, when they are used for specific tasks, such as creating material for part of a piece, it is common to prefer, from all of their possible outputs, those exhibiting specific properties. Even though the number of valid outputs is huge, the selection is often performed manually, using expertise in the algorithmic model, sampling techniques, or sometimes even chance. This process has traditionally been automated using machine learning techniques. However, whether these techniques can really capture the human rationale behind the selection remains largely an open question. The present work discusses a possible approach that combines expert opinion with a fuzzy methodology for rule extraction to model high-level features. An early implementation that explores the universe of outputs of a particular algorithm by means of the extracted rules is discussed. The rules search for objects similar to those having a desired, pre-identified feature. In this sense, the model can be seen as a finder of objects with specific properties.
Peer Reviewed. Postprint (author's final draft)
Automatic accompaniment of vocal melodies in the context of popular music
A piece of popular music is usually defined as a combination of vocal melody and instrumental accompaniment. People often start with the melody when they are trying to compose or reproduce a piece of popular music. However, creating an appropriate instrumental accompaniment for a melody line can be a difficult task for non-musicians. Automating accompaniment generation for vocal melodies can thus be very useful for those who are interested in singing for fun. This thesis therefore presents a software system capable of generating harmonic accompaniment for a given vocal melody. The automatic accompaniment system uses a Hidden Markov Model to assign a chord to each part of the melody, based on knowledge learnt from a bank of vocal tracks of popular music. Compared with other similar systems, our system features a high-resolution key estimation algorithm, which helps to adjust the generated accompaniment to the input vocal. Moreover, we designed a structure analysis subsystem to extract repetition and structure boundaries from the melody. These boundaries are passed to the chord assignment and style player subsystems in order to generate a more dynamic and organized accompaniment. Finally, prototype applications are discussed and the entire system is evaluated.
M.S. Committee Chair: Chordia, Parag; Committee Member: Freeman, Jason; Committee Member: Weinberg, Gi
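As a rough illustration of HMM-based chord assignment, the sketch below decodes the most probable chord sequence for a melody with the Viterbi algorithm. It is a minimal sketch under stated assumptions, not the thesis system: the three-chord vocabulary, the emission and transition probabilities, and the function names are invented for illustration, whereas the thesis learns its model from a bank of vocal tracks.

```python
import math

# Hypothetical toy model: three chords and the pitch classes (0-11) they contain.
CHORDS = ["C", "F", "G"]
CHORD_TONES = {"C": {0, 4, 7}, "F": {5, 9, 0}, "G": {7, 11, 2}}


def emission(chord, pitch_class):
    # Assumed emission model: chord tones are likelier than non-chord tones.
    # 3 chord tones * 0.2 + 9 other pitch classes * (0.4 / 9) sums to 1.
    return 0.2 if pitch_class in CHORD_TONES[chord] else 0.4 / 9


def transition(prev, cur):
    # Assumed transition model: staying on the same chord is preferred.
    return 0.5 if prev == cur else 0.25


def viterbi(melody):
    """Return the most probable chord for each pitch class in `melody`."""
    if not melody:
        return []
    # Work in log space; start from a uniform prior over chords.
    score = {c: math.log(1 / len(CHORDS)) + math.log(emission(c, melody[0]))
             for c in CHORDS}
    back = []  # back[t][cur] = best predecessor of `cur` at step t + 1
    for pc in melody[1:]:
        new_score, pointers = {}, {}
        for cur in CHORDS:
            best_prev = max(
                CHORDS, key=lambda p: score[p] + math.log(transition(p, cur)))
            new_score[cur] = (score[best_prev]
                              + math.log(transition(best_prev, cur))
                              + math.log(emission(cur, pc)))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(CHORDS, key=lambda c: score[c])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Melody as pitch classes: E E A A B B
print(viterbi([4, 4, 9, 9, 11, 11]))
```

In a fuller system the observations would be melody segments rather than single notes, and the structure boundaries mentioned in the abstract could constrain where chord changes are allowed; both are left out of this sketch.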
Music in the first days of life
In adults, specific neural systems with right-hemispheric weighting are necessary to process pitch, melody and harmony, as well as structure and meaning emerging from musical sequences. To what extent does this neural specialization result from exposure to music or from neurobiological predispositions? We used fMRI to measure brain activity in 1- to 3-day-old newborns while they listened to Western tonal music and to the same excerpts altered so as to include tonal violations or dissonance. Music caused predominantly right-hemisphere activations in primary and higher-order auditory cortex. For altered music, activations were seen in the left inferior frontal cortex and limbic structures. Thus, the newborn's brain is able to fully receive music and to detect even small perceptual and structural differences in musical sequences. This neural architecture, present at birth, provides us with the potential to process basic and complex aspects of music, a uniquely human capacity.