1,036 research outputs found

    High-resolution sinusoidal analysis for resolving harmonic collisions in music audio signal processing

    Many music signals can largely be considered an additive combination of multiple sources, such as musical instruments or voice. If the musical sources are pitched instruments, the spectra they produce are predominantly harmonic, and are thus well suited to an additive sinusoidal model. However, due to resolution limits inherent in time-frequency analyses, when the harmonics of multiple sources occupy equivalent time-frequency regions, their individual properties are additively combined in the time-frequency representation of the mixed signal. Any such time-frequency point in a mixture where multiple harmonics overlap produces a single observation from which the contributions owed to each of the individual harmonics cannot be trivially deduced. These overlaps are referred to as overlapping partials or harmonic collisions. If one wishes to infer some information about individual sources in music mixtures, the information carried in regions where collided harmonics exist becomes unreliable due to interference from other sources. This interference has ramifications in a variety of music signal processing applications such as multiple fundamental frequency estimation, source separation, and instrumentation identification. This thesis addresses harmonic collisions in music signal processing applications. As a solution to the harmonic collision problem, a class of signal subspace-based high-resolution sinusoidal parameter estimators is explored. Specifically, the direct matrix pencil method, or equivalently, the Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) method, is used with the goal of producing estimates of the salient parameters of individual harmonics that occupy equivalent time-frequency regions. This estimation method is adapted here to be applicable to time-varying signals such as musical audio. 
While high-resolution methods have been previously explored in the context of music signal processing, previous work has not addressed whether such methods truly produce high-resolution sinusoidal parameter estimates in real-world music audio signals. Therefore, this thesis answers the question of whether high-resolution sinusoidal parameter estimators are really high-resolution for real music signals. This work directly explores the capabilities of this form of sinusoidal parameter estimation to resolve collided harmonics. The capabilities of this analysis method are also explored in the context of music signal processing applications. Potential benefits of high-resolution sinusoidal analysis are examined in experiments involving multiple fundamental frequency estimation and audio source separation. This work shows that there are indeed benefits to high-resolution sinusoidal analysis in music signal processing applications, especially when compared to methods that produce sinusoidal parameter estimates based on more traditional time-frequency representations. The benefits of this form of sinusoidal analysis are made most evident in multiple fundamental frequency estimation applications, where substantial performance gains are seen. High-resolution analysis in the context of computational auditory scene analysis-based source separation shows similar performance to existing comparable methods.
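To illustrate the kind of subspace estimator the abstract refers to, the following is a minimal sketch of least-squares ESPRIT applied to a short, noise-free, stationary frame. It is not the thesis's time-varying adaptation; the function name `esprit_frequencies`, the Hankel matrix shape, and the test signal (440 Hz and 446 Hz partials in a 25 ms frame at 8 kHz) are assumptions chosen for illustration. A DFT of the same frame has a bin spacing of 40 Hz, so the two partials would fall into the same bin.

```python
import numpy as np

def esprit_frequencies(x, n_sinusoids, fs):
    """Estimate the frequencies (Hz) of real sinusoids in x with LS-ESPRIT."""
    order = 2 * n_sinusoids              # a real sinusoid is two complex exponentials
    rows = len(x) // 2
    cols = len(x) - rows + 1
    # Hankel data matrix: H[i, j] = x[i + j]
    H = np.array([x[i:i + cols] for i in range(rows)])
    U, _, _ = np.linalg.svd(H, full_matrices=False)
    Us = U[:, :order]                    # basis of the signal subspace
    # Rotational invariance: Us[1:] is approximately Us[:-1] @ Phi
    Phi = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)[0]
    poles = np.linalg.eigvals(Phi)       # eigenvalues are the signal poles
    freqs = np.angle(poles) * fs / (2 * np.pi)
    return np.sort(freqs[freqs > 0])     # keep one of each conjugate pair

# Hypothetical collided partials: 6 Hz apart in a 200-sample frame.
fs = 8000.0
t = np.arange(200) / fs
x = np.cos(2 * np.pi * 440 * t) + 0.8 * np.cos(2 * np.pi * 446 * t + 0.5)
est = esprit_frequencies(x, 2, fs)
```

In this idealised setting `est` should recover both frequencies closely. Whether that resolution survives noise and non-stationarity in real recordings is the question the thesis investigates.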

    Object coding of music using expressive MIDI

    PhD thesis. Structured audio uses a high-level representation of a signal to produce audio output. When it was first introduced in 1998, creating a structured audio representation from an audio signal was beyond the state of the art. Inspired by object coding and structured audio, we present a system to reproduce audio using Expressive MIDI, in which high-level parameters represent pitch expression from an audio signal. This allows a low bit-rate MIDI sketch of the original audio to be produced. We examine optimisation techniques which may be suitable for inferring Expressive MIDI parameters from estimated pitch trajectories, considering the effect of data codings on the difficulty of optimisation. We look at some less common Gray codes and examine their effect on algorithm performance on standard test problems. We build an Expressive MIDI system, estimating parameters from audio and synthesising output from those parameters. When the parameter estimation succeeds, we find that the system produces note pitch trajectories which match source audio to within 10 pitch cents. We consider the quality of the system in terms of both parameter estimation and the final output, finding that improvements to core components (audio segmentation and pitch estimation, both active research fields) would produce a better system. We examine the current state of the art in pitch estimation, and find that some estimators produce high-precision estimates but are prone to harmonic errors, whilst other estimators produce fewer harmonic errors but are less precise. Inspired by this, we produce a novel pitch estimator combining the output of existing estimators.
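For readers unfamiliar with the unit behind the "within 10 pitch cents" figure, here is a small sketch of how a pitch deviation in cents is computed (100 cents per equal-tempered semitone, 1200 per octave). The function names and the example frequencies are hypothetical and are not taken from the thesis.

```python
import math

def cents_between(f_estimated, f_reference):
    """Signed pitch deviation of f_estimated from f_reference, in cents."""
    return 1200.0 * math.log2(f_estimated / f_reference)

def within_tolerance(f_estimated, f_reference, tolerance_cents=10.0):
    """True if the two frequencies differ by no more than tolerance_cents."""
    return abs(cents_between(f_estimated, f_reference)) <= tolerance_cents

octave = cents_between(880.0, 440.0)     # one octave up
close = within_tolerance(442.0, 440.0)   # roughly 8 cents sharp
far = within_tolerance(443.0, 440.0)     # roughly 12 cents sharp
```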

    Feature Extraction for Music Information Retrieval

    Copyright © 2009 Jesper Højvang Jensen, except where otherwise stated

    Computational Tonality Estimation: Signal Processing and Hidden Markov Models

    PhD thesis. This thesis investigates computational musical tonality estimation from an audio signal. We present a hidden Markov model (HMM) in which relationships between chords and keys are expressed as probabilities of emitting observable chords from a hidden key sequence. The model is tested first using symbolic chord annotations as observations, and gives excellent global key recognition rates on a set of Beatles songs. The initial model is extended for audio input by using an existing chord recognition algorithm, which allows it to be tested on a much larger database. We show that a simple model of the upper partials in the signal improves percentage scores. We also present a variant of the HMM which has a continuous observation probability density, but show that the discrete version gives better performance. Then follows a detailed analysis of the effects on key estimation and computation time of changing the low-level signal processing parameters. We find that much of the high-frequency information can be omitted without loss of accuracy, and that significant computational savings can be made by applying a threshold to the transform kernels. Results show that there is no single ideal set of parameters for all music, but that tuning the parameters can make a difference to accuracy. We discuss methods of evaluating more complex tonal changes than a single global key, and compare a metric that measures similarity to a ground truth with metrics that are rooted in music retrieval. We show that the two measures give different results, and so recommend that the choice of evaluation metric be determined by the intended application. Finally, we draw together our conclusions and use them to suggest directions for continuing this research in tonality model development, feature extraction, evaluation methodology, and applications of computational tonality estimation. Funding: Engineering and Physical Sciences Research Council (EPSRC).
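The idea of a hidden key sequence emitting observable chords can be sketched with a toy discrete HMM decoded by the Viterbi algorithm. Everything numeric below is an assumption made for illustration: the two keys, the four-chord vocabulary, and all probabilities are invented and are not the thesis's model or its trained parameters.

```python
import math

KEYS = ["C major", "G major"]
START = {"C major": 0.5, "G major": 0.5}
# Hypothetical transition probabilities: keys tend to persist.
TRANS = {"C major": {"C major": 0.9, "G major": 0.1},
         "G major": {"C major": 0.1, "G major": 0.9}}
# Hypothetical probabilities of each key emitting each chord.
EMIT = {"C major": {"C": 0.40, "F": 0.30, "G": 0.25, "D": 0.05},
        "G major": {"G": 0.40, "C": 0.30, "D": 0.25, "F": 0.05}}

def decode_keys(chords):
    """Return the most probable key sequence for a chord sequence (Viterbi)."""
    # Log-probability of the best path ending in each key.
    best = {k: math.log(START[k]) + math.log(EMIT[k][chords[0]]) for k in KEYS}
    back = []
    for chord in chords[1:]:
        prev, best, ptr = best, {}, {}
        for k in KEYS:
            p = max(KEYS, key=lambda r: prev[r] + math.log(TRANS[r][k]))
            best[k] = prev[p] + math.log(TRANS[p][k]) + math.log(EMIT[k][chord])
            ptr[k] = p
        back.append(ptr)
    # Trace back from the best final key.
    key = max(KEYS, key=lambda k: best[k])
    path = [key]
    for ptr in reversed(back):
        key = ptr[key]
        path.append(key)
    return path[::-1]

path_c = decode_keys(["C", "F", "G", "C"])
path_g = decode_keys(["G", "D", "G", "C"])
```

With these invented numbers the first progression decodes to C major throughout and the second to G major, even though both contain C and G chords. The transition probabilities discourage switching key on the evidence of a single chord.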

    Implementation and optimization of the synthesis of musical instrument tones using frequency modulation

    Frequency modulation (FM) is an efficient method for synthesizing musical sounds and is of great importance in computer music. This thesis analyses, evaluates, and optimizes methods for fundamental frequency estimation and for FM synthesis of musical instrument tones. An FM analysis and synthesis environment was developed in which the methods presented here were implemented. For fundamental frequency estimation in music signals, a new algorithm based on harmonic pattern match (HPM) was designed that offers higher estimation accuracy than previously used methods. After a suitable subset of the spectral data is selected, the autocorrelation is analysed in both the time and frequency domains to identify candidates for the fundamental frequency; an efficient procedure then evaluates the match between each candidate and the harmonic pattern of the music signal. The proposed algorithm was analysed and evaluated against several other fundamental frequency estimation algorithms, and its practical applicability was demonstrated. For the implementation of FM synthesis, a method for approximating a spectrum using genetic algorithms (GA) is presented, including the formulation of the GA task and the search for optimal FM parameters. To optimize FM synthesis, the requirements on the carrier and the modulator were analysed, with the aim of predetermining the parameter space for accurate synthesis results. For data reduction, a piecewise linear approximation of the carrier amplitude envelope was developed. A further optimization incorporates formants into the matching procedure: the formant harmonics are weighted by corresponding coefficients, which yields a considerably more accurate approximation of the timbre of the sound being synthesized. To this end, spectral envelope estimation and formant extraction were analysed and implemented. The test environment developed in this work supports parameter estimation and the analysis and evaluation of the resulting FM synthesis output.
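As a rough illustration of the synthesis model this abstract discusses, the sketch below generates a simple two-operator FM tone, y(t) = a(t) sin(2π f_c t + I sin(2π f_m t)), where the carrier amplitude envelope a(t) is piecewise linear and defined by a few breakpoints. The function name, the parameter values, and the breakpoints are assumptions for illustration. The thesis's GA-based parameter search and formant weighting are not reproduced here.

```python
import numpy as np

def fm_tone(fc, fm, index, duration, env_times, env_amps, fs=44100):
    """Simple FM tone with a piecewise linear carrier amplitude envelope.

    fc, fm     carrier and modulator frequencies in Hz
    index      modulation index I (0 gives a plain sine at fc)
    env_times  increasing breakpoint times in seconds
    env_amps   envelope amplitude at each breakpoint
    """
    t = np.arange(int(duration * fs)) / fs
    # Linear interpolation between breakpoints gives the envelope a(t).
    envelope = np.interp(t, env_times, env_amps)
    phase = 2 * np.pi * fc * t + index * np.sin(2 * np.pi * fm * t)
    return envelope * np.sin(phase)

# Hypothetical envelope: fast attack, decay, release over 0.5 s.
times = [0.0, 0.02, 0.3, 0.5]
amps = [0.0, 1.0, 0.6, 0.0]
tone = fm_tone(fc=440.0, fm=440.0, index=2.0, duration=0.5,
               env_times=times, env_amps=amps)
plain = fm_tone(fc=440.0, fm=440.0, index=0.0, duration=0.5,
                env_times=times, env_amps=amps)
```

Storing four breakpoints instead of a sample-by-sample envelope is the kind of data reduction a piecewise linear approximation offers.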