
3 research outputs found

Intelligent Tools for Multitrack Frequency and Dynamics Processing

Author: Ma Zheng
Publication venue: 'Queen Mary University of London'
Publication date: 23/05/2017

PhD

This research explores the possibility of reproducing the mixing decisions of a skilled audio engineer with minimal human interaction, in order to improve the overall listening experience of musical mixtures, i.e., intelligent mixing. By producing a balanced mix automatically, musicians and mixing engineers can focus on their creativity while the productivity of music production is increased. We focus on the two essential aspects of such a system, frequency and dynamics. This thesis presents an intelligent strategy for multitrack frequency and dynamics processing that exploits the interdependence of input audio features, incorporates best practices in audio engineering, and is driven by perceptual models and subjective criteria. The intelligent frequency processing research begins with a spectral characteristic analysis of commercial recordings, where we discover a consistent leaning towards a target equalization spectrum. A novel approach for automatically equalizing audio signals towards the observed target spectrum is then described and evaluated. We proceed to dynamics processing and introduce an intelligent multitrack dynamic range compression algorithm, in which various audio features are proposed and validated to better describe the transient nature and spectral content of the signals. An experiment investigating human preference in dynamics processing is described to inform our choices of parameter automation. To provide a perceptual basis for the intelligent system, we evaluate existing perceptual models and propose several masking metrics to quantify the masking behaviour within the multitrack mixture. Ultimately, we integrate previous research on auditory masking, frequency processing, and dynamics processing into one intelligent system of mix optimization that replicates the iterative process of human mixing. Within the system, we explore the relationship between equalization and dynamics processing, and propose a general frequency and dynamics processing framework. Various implementations of the intelligent system are explored and evaluated objectively and subjectively through listening experiments.

China Scholarship Council

Queen Mary Research Online

Applications of loudness models in audio engineering

Author: Ward Dominic
Publication date: 17/08/2017

This thesis investigates the application of perceptual models to areas of audio engineering, with a particular focus on music production. The goal was to establish efficient and practical tools for the measurement and control of the perceived loudness of musical sounds. Two types of loudness model were investigated: the single-band model and the multiband excitation pattern (EP) model. The heuristic single-band devices were designed to be simple but sufficiently effective for real-world application, whereas the multiband procedures were developed to give a reasonable account of a large body of psychoacoustic findings according to a functional model of the peripheral hearing system. The research addresses the extent to which current models of loudness generalise to musical instruments, and whether they can be successfully employed in music applications. The domain-specific disparity between the two types of model was first tackled by reducing the computational load of state-of-the-art EP models to allow for fast but low-error auditory signal processing. Two elaborate hearing models were analysed and optimised using musical instruments and speech as test stimuli. It was shown that, after significantly reducing the complexity of both procedures, estimates of global loudness, such as peak loudness, as well as the intermediate auditory representations, can be preserved with high accuracy. Based on the optimisations, two real-time applications were developed: a binaural loudness meter and an automatic multitrack mixer. This second system was designed to work independently of the loudness measurement procedure, and therefore supports both linear and nonlinear models. This allowed a single mixing device to be assessed using different loudness metrics, which was demonstrated by evaluating three configurations through subjective assessment.
Unexpectedly, when asked to rate both the overall quality of a mix and the degree to which instruments were equally loud, listeners preferred mixes generated using heuristic single-band models over those produced using a multiband procedure. A series of more systematic listening tests was conducted to further investigate this finding. Subjective loudness matches of musical instruments commonly found in western popular music were collected to evaluate the performance of five published models. The results were in accord with the application-based assessment, namely that current EP procedures do not generalise well when estimating the relative loudness of musical sounds which have marked differences in spectral content. Model-specific issues were identified relating to the calculation of spectral loudness summation (SLS) and the method used to determine the global-loudness percept of time-varying musical sounds; associated refinements were proposed. It was shown that a new multiband loudness model with a heuristic loudness transformation yields superior performance over existing methods. This supports the idea that a revised model of SLS is needed, and therefore that modification to this stage in existing psychoacoustic procedures is an essential step towards the goal of achieving real-world deployment.

Birmingham City University Open Access Repository

BCU Open Access