19 research outputs found
The development of corpus-based computer assisted composition program and its application for instrumental music composition
Over the last 20 years, the environment for developing music software based on
a corpus of audio data has expanded significantly, driven by synthesis
techniques that produce electronic sounds and by supportive tools for creative
activities. Some software produces a sequence of sounds by
synthesizing chunks of source audio data retrieved from an audio database
according to a rule. Since sources are matched according to descriptive
features extracted by FFT analysis, the quality of the result is significantly
influenced by the outcomes of audio analysis, segmentation, and decomposition.
The synthesis process also often requires a considerable amount of sample data,
which can be an obstacle to building easy-to-use, inexpensive applications
on various kinds of devices. It is therefore crucial to consider how to treat the
data and construct an efficient database for the synthesis. We aim to apply
corpus-based synthesis techniques to develop a Computer Assisted Composition program, and
to investigate the actual application of the program to ensemble pieces. The goal of
this research is to apply the program to instrumental music composition, refine its
functions, and search for new avenues toward innovative compositional methods.
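The feature-based matching described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the single descriptor (spectral centroid), the nearest-neighbour rule, and the names `spectral_centroid` and `select_units` are assumptions chosen for illustration.

```python
import numpy as np

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one Hann-windowed frame."""
    magnitudes = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = magnitudes.sum()
    return float((freqs * magnitudes).sum() / total) if total > 0 else 0.0

def select_units(target_centroids, corpus_units, sample_rate):
    """For each target centroid, return the index of the closest corpus unit."""
    corpus_centroids = np.array(
        [spectral_centroid(unit, sample_rate) for unit in corpus_units]
    )
    return [int(np.argmin(np.abs(corpus_centroids - t))) for t in target_centroids]

# Toy corpus (hypothetical): three sine-tone units at 220, 440 and 880 Hz.
sr = 8000
t = np.arange(1024) / sr
corpus = [np.sin(2 * np.pi * f * t) for f in (220.0, 440.0, 880.0)]

# Target descriptors in Hz; each one is mapped to the nearest corpus unit.
chosen = select_units([900.0, 200.0, 450.0], corpus, sr)
```

A real system would presumably use several descriptors per unit and an indexed database rather than a linear scan, which is where the database-design concern raised in the abstract comes in.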
Shift-Invariant Kernel Additive Modelling for Audio Source Separation
A major goal in blind source separation is to model the inherent
characteristics of the sources in order to identify and separate them. While most state-of-the-art approaches
are supervised methods trained on large datasets, interest in non-data-driven
approaches such as Kernel Additive Modelling (KAM) remains high due to their
interpretability and adaptability. KAM performs the separation of a given
source by applying robust statistics to the time-frequency bins selected by a
source-specific kernel function, commonly the K-NN function. This choice
assumes that the source of interest repeats in both time and frequency. In
practice, this assumption does not always hold. Therefore, we introduce a
shift-invariant kernel function capable of identifying similar spectral content
even under frequency shifts. This way, we can considerably increase the amount
of suitable sound material available to the robust statistics. While this leads
to an increase in separation performance, a basic formulation is
computationally expensive. Therefore, we additionally present acceleration
techniques that lower the overall computational complexity.
Comment: Feedback is welcome
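The idea of a shift-invariant nearest-neighbour kernel combined with a median can be sketched as follows. This is a simplified illustration under stated assumptions, not the formulation from the paper: it works on whole frames, uses Euclidean distance, wraps around at the spectrum edges via `np.roll`, and the function name and parameters are hypothetical. It also uses the brute-force search that the abstract describes as computationally expensive.

```python
import numpy as np

def shift_invariant_knn_median(spec, k=3, max_shift=2):
    """Estimate a repeating source from a magnitude spectrogram (freq x time).

    Each frame is replaced by the element-wise median of its k most similar
    candidate frames. Candidates are all frames under all frequency shifts of
    up to max_shift bins. np.roll wraps around at the edges, a simplification
    that is only reasonable for small shifts.
    """
    n_freq, n_frames = spec.shape
    shifts = range(-max_shift, max_shift + 1)
    # Candidate pool: every frame under every allowed frequency shift.
    pool = np.stack([np.roll(spec, s, axis=0) for s in shifts])  # (S, F, T)
    pool = pool.transpose(0, 2, 1).reshape(-1, n_freq)           # (S*T, F)
    estimate = np.empty_like(spec)
    for t in range(n_frames):
        dist = np.linalg.norm(pool - spec[:, t], axis=1)
        nearest = np.argsort(dist)[:k]
        estimate[:, t] = np.median(pool[nearest], axis=0)
    return estimate

# Toy data (hypothetical): one two-partial template at varying frequency shifts.
template = np.zeros(16)
template[4], template[8] = 1.0, 0.5
clean = np.stack([np.roll(template, s) for s in (0, 1, 0, 2, 1, 0)], axis=1)

# Add a sparse interferer to one bin of frame 3.
spec = clean.copy()
spec[12, 3] += 5.0

estimate = shift_invariant_knn_median(spec, k=3, max_shift=2)
```

Without the shift search, frame 3 would have no similar neighbour in this toy example, because no other frame has the same frequency offset. Allowing shifts makes the other frames usable, and the median then suppresses the interferer.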
Multiple-instrument polyphonic music transcription using a convolutive probabilistic model
(Abstract to follow)
Research on Effective Designs and Evaluation for Speech Interface Systems
System: New ; Report number: Otsu No. 2305 ; Degree type: Doctor of Engineering ; Date conferred: 2011/2/25 ; Waseda University degree certificate number: New 564
Polyphonic music transcription using note onset and offset detection
In this paper, an approach for polyphonic music transcription based on joint multiple-F0 estimation and note onset/offset detection is proposed. For preprocessing, the resonator time-frequency image of the input music signal is extracted and noise suppression is performed. A pitch salience function is extracted for each frame along with tuning and inharmonicity parameters. For onset detection, late fusion is employed by combining a novel spectral flux-based feature which incorporates pitch tuning information and a novel salience function-based descriptor. For each segment defined by two onsets, an overlapping partial treatment procedure is used and a pitch set score function is proposed. A note offset detection procedure is also proposed using HMMs trained on MIDI data. The system was trained on piano chords and tested on classic and jazz recordings from the RWC database. Improved transcription results are reported compared to state-of-the-art approaches.
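For readers unfamiliar with spectral flux, the following is a minimal sketch of the plain, textbook version of the feature with simple peak picking. It does not include the pitch tuning information or the salience-based descriptor that the abstract describes. The frame sizes, threshold, and function names are assumptions.

```python
import numpy as np

def spectral_flux(signal, frame_len=512, hop=256):
    """Half-wave rectified frame-to-frame increase in spectral magnitude."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    mags = np.array([
        np.abs(np.fft.rfft(window * signal[i * hop:i * hop + frame_len]))
        for i in range(n_frames)
    ])
    diff = np.diff(mags, axis=0)
    # Keep only increases in energy; decreases are set to zero.
    return np.maximum(diff, 0.0).sum(axis=1)

def pick_onsets(flux, threshold):
    """Indices of local maxima of the flux curve that exceed the threshold."""
    return [
        i for i in range(1, len(flux) - 1)
        if flux[i] > threshold and flux[i] >= flux[i - 1] and flux[i] > flux[i + 1]
    ]

# Toy signal (hypothetical): 2048 samples of silence, then a 440 Hz tone.
sr = 8000
tone = np.sin(2 * np.pi * 440.0 * np.arange(2048) / sr)
signal = np.concatenate([np.zeros(2048), tone])

flux = spectral_flux(signal)
onsets = pick_onsets(flux, threshold=10.0)
```

Entry `i` of `flux` compares frame `i + 1` with frame `i`, so a detected index corresponds to a time of roughly `(i + 1) * hop / sr` seconds.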
Music Information Retrieval Meets Music Education
This paper addresses the use of Music Information Retrieval (MIR) techniques in music education and their integration in learning software. A general overview of systems that are either commercially available or in research stage is presented. Furthermore, three well-known MIR methods used in music learning systems and their state-of-the-art are described: music transcription, solo and accompaniment track creation, and generation of performance instructions. As a representative example of a music learning system developed within the MIR community, the Songs2See software is outlined. Finally, challenges and directions for future research are described.
A Shift-Invariant Latent Variable Model for Automatic Music Transcription
In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source. Thus, this method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note range. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature, using several error metrics.
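The pitch-wise HMM smoothing step mentioned in this abstract can be sketched as a two-state (off/on) Viterbi decoder applied to one pitch at a time. The state layout, the symmetric transition probability `stay_prob`, the uniform prior, and the use of the activation value directly as an emission likelihood are all assumptions made for illustration. The paper's trained model may differ.

```python
import numpy as np

def viterbi_two_state(activation_prob, stay_prob=0.9):
    """Most likely off (0) / on (1) path for one pitch given per-frame P(on)."""
    eps = 1e-12
    p = np.clip(np.asarray(activation_prob, dtype=float), eps, 1 - eps)
    log_emit = np.log(np.stack([1 - p, p], axis=1))           # shape (T, 2)
    log_trans = np.log(np.array([[stay_prob, 1 - stay_prob],
                                 [1 - stay_prob, stay_prob]]))
    n = len(p)
    score = np.empty((n, 2))
    back = np.zeros((n, 2), dtype=int)
    score[0] = np.log(0.5) + log_emit[0]                      # uniform prior
    for t in range(1, n):
        cand = score[t - 1][:, None] + log_trans              # cand[i, j]: i -> j
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], [0, 1]] + log_emit[t]
    # Backtrack from the best final state.
    path = np.empty(n, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path.tolist()

# Hypothetical activations for one pitch, with a brief dip at frame 3.
activations = [0.1, 0.2, 0.9, 0.4, 0.9, 0.8, 0.1, 0.1]
note_track = viterbi_two_state(activations)
# note_track == [0, 0, 1, 1, 1, 1, 0, 0]
```

Thresholding the same activations at 0.5 would split the note in two at frame 3. The transition penalty bridges the dip and yields one continuous note.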
Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription
In this paper, a method for automatic transcription of music signals based on joint multiple-F0 estimation is proposed. As a time-frequency representation, the constant-Q resonator time-frequency image is employed, while a novel noise suppression technique based on pink noise assumption is applied in a preprocessing step. In the multiple-F0 estimation stage, the optimal tuning and inharmonicity parameters are computed and a salience function is proposed in order to select pitch candidates. For each pitch candidate combination, an overlapping partial treatment procedure is used, which is based on a novel spectral envelope estimation procedure for the log-frequency domain, in order to compute the harmonic envelope of candidate pitches. In order to select the optimal pitch combination for each time frame, a score function is proposed which combines spectral and temporal characteristics of the candidate pitches and also aims to suppress harmonic errors. For postprocessing, hidden Markov models (HMMs) and conditional random fields (CRFs) trained on MIDI data are employed, in order to boost transcription accuracy. The system was trained on isolated piano sounds from the MAPS database and was tested on classic and jazz recordings from the RWC database, as well as on recordings from a Disklavier piano. A comparison with several state-of-the-art systems is provided using a variety of error metrics, where encouraging results are indicated.
Combined audio and video analysis for guitar chord identification
This thesis presents a multi-modal approach to automatically identifying guitar chords using audio and video of the performer. Chord identification is typically performed by analyzing the audio, using a chroma-based feature to extract pitch class information, then identifying the chord with the appropriate label. Even if this method proves perfectly accurate, stringed instruments add extra ambiguity, as a single chord or melody may be played in different positions on the fretboard. Preserving this information is important because it signifies the original fingering and the implied "easiest" way to perform the selection. This chord identification system combines analysis of audio to determine the general chord scale (e.g. A major, G minor) and video of the guitarist to determine chord voicing (e.g. open, barred, inversion) to accurately identify the guitar chord. M.S., Electrical Engineering -- Drexel University, 201
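The audio-only stage described in this abstract, chroma features matched against chord labels, can be sketched with binary triad templates. This is a common baseline and not necessarily the method used in the thesis. The template set (major and minor triads only), the cosine-similarity score, and the function names are assumptions.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_templates():
    """Unit-norm 12-bin pitch-class templates for major and minor triads."""
    templates = {}
    for root, name in enumerate(NOTE_NAMES):
        for quality, third in (("major", 4), ("minor", 3)):
            vec = np.zeros(12)
            # Root, third, and perfect fifth (7 semitones above the root).
            vec[[root, (root + third) % 12, (root + 7) % 12]] = 1.0
            templates[f"{name} {quality}"] = vec / np.linalg.norm(vec)
    return templates

def identify_chord(chroma):
    """Return the label whose template has the highest cosine similarity."""
    chroma = np.asarray(chroma, dtype=float)
    chroma = chroma / (np.linalg.norm(chroma) + 1e-12)
    templates = chord_templates()
    return max(templates, key=lambda label: float(templates[label] @ chroma))

# Hypothetical chroma vector with energy on A, C# and E.
chroma_a_major = np.zeros(12)
chroma_a_major[[9, 1, 4]] = 1.0
label = identify_chord(chroma_a_major)
```

Because chroma folds all octaves into 12 pitch classes, an open and a barred voicing of the same chord produce nearly the same vector. This is the ambiguity that the video analysis in the thesis is meant to resolve.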
Automatic transcription of polyphonic music exploiting temporal evolution
PhD
Automatic music transcription is the process of converting an audio recording
into a symbolic representation using musical notation. It has numerous applications
in music information retrieval, computational musicology, and the
creation of interactive systems. Even for expert musicians, transcribing polyphonic
pieces of music is not a trivial task, and while the problem of automatic
pitch estimation for monophonic signals is considered to be solved, the creation
of an automated system able to transcribe polyphonic music without setting
restrictions on the degree of polyphony and the instrument type still remains
open.
In this thesis, research on automatic transcription is performed by explicitly
incorporating information on the temporal evolution of sounds. First efforts address
the problem by focusing on signal processing techniques and by proposing
audio features utilising temporal characteristics. Techniques for note onset and
offset detection are also utilised for improving transcription performance. Subsequent
approaches propose transcription models based on shift-invariant probabilistic
latent component analysis (SI-PLCA), modeling the temporal evolution
of notes in a multiple-instrument case and supporting frequency modulations in
produced notes. Datasets and annotations for transcription research have also
been created during this work. Proposed systems have been privately as well as
publicly evaluated within the Music Information Retrieval Evaluation eXchange
(MIREX) framework. Proposed systems have been shown to outperform several
state-of-the-art transcription approaches.
Developed techniques have also been employed for other tasks related to music
technology, such as for key modulation detection, temperament estimation,
and automatic piano tutoring. Finally, proposed music transcription models
have also been utilized in a wider context, namely for modeling acoustic scenes.