73 research outputs found

    Applications of dynamic diffuse signal processing in sound reinforcement and reproduction.

    Get PDF
    Electroacoustic systems are subject to position-dependent frequency responses due to coherent interference between multiple sources and/or early reflections. Diffuse signal processing (DiSP) provides a mechanism for signal decorrelation to potentially alleviate this well-known issue in sound reinforcement and reproduction applications. Previous testing has indicated that DiSP provides reduced low-frequency spatial variance across wide audience areas, but in closed acoustic spaces is less effective due to coherent early reflections. In this paper, dynamic implementation of DiSP is examined, whereby the decorrelation algorithm varies over time, thus allowing for decorrelation between surface reflections and direct sounds. Potential applications of dynamic DiSP are explored in the context of sound reinforcement (subwoofers, stage monitoring) and sound reproduction (small-room low-frequency control, loudspeaker crossovers), with preliminary experimental results presented.N/

    Efficient Algorithms for Immersive Audio Rendering Enhancement

    Get PDF
    Il rendering audio immersivo è il processo di creazione di un’esperienza sonora coinvolgente e realistica nello spazio 3D. Nei sistemi audio immersivi, le funzioni di trasferimento relative alla testa (head-related transfer functions, HRTFs) vengono utilizzate per la sintesi binaurale in cuffia poiché esprimono il modo in cui gli esseri umani localizzano una sorgente sonora. Possono essere introdotti algoritmi di interpolazione delle HRTF per ridurre il numero di punti di misura e per creare un movimento del suono affidabile. La riproduzione binaurale può essere eseguita anche dagli altoparlanti. Tuttavia, il coinvolgimento di due o più gli altoparlanti causa il problema del crosstalk. In questo caso, algoritmi di cancellazione del crosstalk (CTC) sono necessari per eliminare i segnali di interferenza indesiderati. In questa tesi, partendo da un'analisi comparativa di metodi di misura delle HRTF, viene proposto un sistema di rendering binaurale basato sull'interpolazione delle HRTF per applicazioni in tempo reale. Il metodo proposto mostra buone prestazioni rispetto a una tecnica di riferimento. L'algoritmo di interpolazione è anche applicato al rendering audio immersivo tramite altoparlanti, aggiungendo un algoritmo di cancellazione del crosstalk fisso, che considera l'ascoltatore in una posizione fissa. Inoltre, un sistema di cancellazione crosstalk adattivo, che include il tracciamento della testa dell'ascoltatore, è analizzato e implementato in tempo reale. Il CTC adattivo implementa una struttura in sottobande e risultati sperimentali dimostrano che un maggiore numero di bande migliora le prestazioni in termini di errore totale e tasso di convergenza. Il sistema di riproduzione e le caratteristiche dell'ambiente di ascolto possono influenzare le prestazioni a causa della loro risposta in frequenza non ideale. L'equalizzazione viene utilizzata per livellare le varie parti dello spettro di frequenze che compongono un segnale audio al fine di ottenere le caratteristiche sonore desiderate. L'equalizzazione può essere manuale, come nel caso dell'equalizzazione grafica, dove il guadagno di ogni banda di frequenza può essere modificato dall'utente, o automatica, la curva di equalizzazione è calcolata automaticamente dopo la misurazione della risposta impulsiva della stanza. L'equalizzazione della risposta ambientale può essere applicata anche ai sistemi multicanale, che utilizzano due o più altoparlanti e la zona di equalizzazione può essere ampliata misurando le risposte impulsive in diversi punti della zona di ascolto. In questa tesi, GEQ efficienti e un sistema adattativo di equalizzazione d'ambiente. In particolare, sono proposti e approfonditi tre equalizzatori grafici a basso costo computazionale e a fase lineare e quasi lineare. Gli esperimenti confermano l'efficacia degli equalizzatori proposti in termini di accuratezza, complessità computazionale e latenza. Successivamente, una struttura adattativa in sottobande è introdotta per lo sviluppo di un sistema di equalizzazione d'ambiente multicanale. I risultati sperimentali verificano l'efficienza dell'approccio in sottobande rispetto al caso a banda singola. Infine, viene presentata una rete crossover a fase lineare per sistemi multicanale, mostrando ottimi risultati in termini di risposta in ampiezza, bande di transizione, risposta polare e risposta in fase. I sistemi di controllo attivo del rumore (ANC) possono essere progettati per ridurre gli effetti dell'inquinamento acustico e possono essere utilizzati contemporaneamente a un sistema audio immersivo. L'ANC funziona creando un'onda sonora in opposizione di fase rispetto all'onda sonora in arrivo. Il livello sonoro complessivo viene così ridotto grazie all'interferenza distruttiva. Infine, questa tesi presenta un sistema ANC utilizzato per la riduzione del rumore. L’approccio proposto implementa una stima online del percorso secondario e si basa su filtri adattativi in sottobande applicati alla stima del percorso primario che mirano a migliorare le prestazioni dell’intero sistema. La struttura proposta garantisce un tasso di convergenza migliore rispetto all'algoritmo di riferimento.Immersive audio rendering is the process of creating an engaging and realistic sound experience in 3D space. In immersive audio systems, the head-related transfer functions (HRTFs) are used for binaural synthesis over headphones since they express how humans localize a sound source. HRTF interpolation algorithms can be introduced for reducing the number of measurement points and creating a reliable sound movement. Binaural reproduction can be also performed by loudspeakers. However, the involvement of two or more loudspeakers causes the problem of crosstalk. In this case, crosstalk cancellation (CTC) algorithms are needed to delete unwanted interference signals. In this thesis, starting from a comparative analysis of HRTF measurement techniques, a binaural rendering system based on HRTF interpolation is proposed and evaluated for real-time applications. The proposed method shows good performance in comparison with a reference technique. The interpolation algorithm is also applied for immersive audio rendering over loudspeakers, by adding a fixed crosstalk cancellation algorithm, which assumes that the listener is in a fixed position. In addition, an adaptive crosstalk cancellation system, which includes the tracking of the listener's head, is analyzed and a real-time implementation is presented. The adaptive CTC implements a subband structure and experimental results prove that a higher number of bands improves the performance in terms of total error and convergence rate. The reproduction system and the characteristics of the listening room may affect the performance due to their non-ideal frequency response. Audio equalization is used to adjust the balance of different audio frequencies in order to achieve desired sound characteristics. The equalization can be manual, such as in the case of graphic equalization, where the gain of each frequency band can be modified by the user, or automatic, where the equalization curve is automatically calculated after the room impulse response measurement. The room response equalization can be also applied to multichannel systems, which employ two or more loudspeakers, and the equalization zone can be enlarged by measuring the impulse responses in different points of the listening zone. In this thesis, efficient graphic equalizers (GEQs), and an adaptive room response equalization system are presented. In particular, three low-complexity linear- and quasi-linear-phase graphic equalizers are proposed and deeply examined. Experiments confirm the effectiveness of the proposed GEQs in terms of accuracy, computational complexity, and latency. Successively, a subband adaptive structure is introduced for the development of a multichannel and multiple positions room response equalizer. Experimental results verify the effectiveness of the subband approach in comparison with the single-band case. Finally, a linear-phase crossover network is presented for multichannel systems, showing great results in terms of magnitude flatness, cutoff rates, polar diagram, and phase response. Active noise control (ANC) systems can be designed to reduce the effects of noise pollution and can be used simultaneously with an immersive audio system. The ANC works by creating a sound wave that has an opposite phase with respect to the sound wave of the unwanted noise. The additional sound wave creates destructive interference, which reduces the overall sound level. Finally, this thesis presents an ANC system used for noise reduction. The proposed approach implements an online secondary path estimation and is based on cross-update adaptive filters applied to the primary path estimation that aim at improving the performance of the whole system. The proposed structure allows for a better convergence rate in comparison with a reference algorithm

    Investigations into the Perception of Vertical Interchannel Decorrelation in 3D Surround Sound Reproduction

    Get PDF
    The use of three-dimensional (3D) surround sound systems has seen a rapid increase over recent years. In two-dimensional (2D) loudspeaker formats (i.e. two-channel stereophony (stereo) and 5.1 Surround), horizontal interchannel decorrelation is a well-established technique for controlling the horizontal spread of a phantom image. Use of interchannel decorrelation can also be found within established two-to-five channel upmixing methods (stereo to 5.1). More recently, proprietary algorithms have been developed that perform 2D-to-3D upmixing, which presumably make use of interchannel decorrelation as well; however, it is not currently known how interchannel decorrelation is perceived in the vertical domain. From this, it is considered that formal investigations into the perception of vertical interchannel decorrelation are necessary. Findings from such experiments may contribute to the improved control of a sound source within 3D surround systems (i.e. the vertical spread), in addition to aiding the optimisation of 2D-to-3D upmixing algorithms. The current thesis presents a series of experiments that systematically assess vertical interchannel decorrelation under various conditions. Firstly, a comparison is made between horizontal and vertical interchannel decorrelation, where it is found that vertical decorrelation is weaker than horizontal decorrelation. However, it is also seen that vertical decorrelation can generate a significant increase of vertical image spread (VIS) for some conditions. Following this, vertical decorrelation is assessed for octave-band pink noise stimuli at various azimuth angles to the listener. The results demonstrate that vertical decorrelation is dependent on both frequency and presentation angle – a general relationship between the interchannel cross-correlation (ICC) and VIS is observed for the 500 Hz octave-band and above, and strongest for the 8 kHz octave-band. Objective analysis of these stimuli signals determined that spectral changes at higher frequencies appear to be associated with VIS perception – at 0° azimuth, the 8 and 16 kHz octave-bands demonstrate potential spectral cues, at ±30°, similar cues are seen in the 4, 8 and 16 kHz bands, and from ±110°, cues are featured in the 2, 4, 8 and 16 kHz bands. In the case of the 8 kHz octave-band, it seems that vertical decorrelation causes a ‘filling in’ of vertical localisation notch cues, potentially resulting in ambiguous perception of vertical extent. In contrast, the objective analysis suggests that VIS perception of the 500 Hz and 1 kHz bands may have been related to early reflections in the listening room. From the experiments above, it is demonstrated that the perception of VIS from vertical interchannel decorrelation is frequency-dependent, with high frequencies playing a particularly important role. A following experiment explores the vertical decorrelation of high frequencies only, where it is seen that decorrelation of the 500 Hz octave-band and above produces a similar perception of VIS to broadband decorrelation, whilst improving tonal quality. The results also indicate that decorrelation of the 8 kHz octave-band and above alone can significantly increase VIS, provided the source signal has sufficient high frequency energy. The final experimental chapter of the present thesis aims to provide a controlled assessment of 2D-to-3D upmixing, taking into account the findings of the previous experiments. In general, 2D-to-3D upmixing by vertical interchannel decorrelation had little impact on listener envelopment (LEV), when compared against a level-matched 2D 5.1 reference. Furthermore, amplitude-based decorrelation appeared to be marginally more effective, and ‘high-pass decorrelation’ resulted in slightly better tonal quality for sources that featured greater low frequency energy

    Beiträge zu breitbandigen Freisprechsystemen und ihrer Evaluation

    Get PDF
    This work deals with the advancement of wideband hands-free systems (HFS’s) for mono- and stereophonic cases of application. Furthermore, innovative contributions to the corr. field of quality evaluation are made. The proposed HFS approaches are based on frequency-domain adaptive filtering for system identification, making use of Kalman theory and state-space modeling. Functional enhancement modules are developed in this work, which improve one or more of key quality aspects, aiming at not to harm others. In so doing, these modules can be combined in a flexible way, dependent on the needs at hand. The enhanced monophonic HFS is evaluated according to automotive ITU-T recommendations, to prove its customized efficacy. Furthermore, a novel methodology and techn. framework are introduced in this work to improve the prototyping and evaluation process of automotive HF and in-car-communication (ICC) systems. The monophonic HFS in several configurations hereby acts as device under test (DUT) and is thoroughly investigated, which will show the DUT’s satisfying performance, as well as the advantages of the proposed development process. As current methods for the evaluation of HFS’s in dynamic conditions oftentimes still lack flexibility, reproducibility, and accuracy, this work introduces “Car in a Box” (CiaB) as a novel, improved system for this demanding task. It is able to enhance the development process by performing high-resolution system identification of dynamic electro-acoustical systems. The extracted dyn. impulse response trajectories are then applicable to arbitrary input signals in a synthesis operation. A realistic dynamic automotive auralization of a car cabin interior is available for HFS evaluation. It is shown that this system improves evaluation flexibility at guaranteed reproducibility. In addition, the accuracy of evaluation methods can be increased by having access to exact, realistic imp. resp. trajectories acting as a so-called “ground truth” reference. If CiaB is included into an automotive evaluation setup, there is no need for an acoustical car interior prototype to be present at this stage of development. Hency, CiaB may ease the HFS development process. Dynamic acoustic replicas may be provided including an arbitrary number of acoustic car cabin interiors for multiple developers simultaneously. With CiaB, speech enh. system developers therefore have an evaluation environment at hand, which can adequately replace the real environment.Diese Arbeit beschäftigt sich mit der Weiterentwicklung breitbandiger Freisprechsysteme für mono-/stereophone Anwendungsfälle und liefert innovative Beiträge zu deren Qualitätsmessung. Die vorgestellten Verfahren basieren auf im Frequenzbereich adaptierenden Algorithmen zur Systemidentifikation gemäß Kalman-Theorie in einer Zustandsraumdarstellung. Es werden funktionale Erweiterungsmodule dahingehend entwickelt, dass mindestens eine Qualitätsanforderung verbessert wird, ohne andere eklatant zu verletzen. Diese nach Anforderung flexibel kombinierbaren algorithmischen Erweiterungen werden gemäß Empfehlungen der ITU-T (Rec. P.1110/P.1130) in vorwiegend automotiven Testszenarien getestet und somit deren zielgerichtete Wirksamkeit bestätigt. Es wird eine Methodensammlung und ein technisches System zur verbesserten Prototypentwicklung/Evaluation von automotiven Freisprech- und Innenraumkommunikationssystemen vorgestellt und beispielhaft mit dem monophonen Freisprechsystem in diversen Ausbaustufen zur Anwendung gebracht. Daraus entstehende Vorteile im Entwicklungs- und Testprozess von Sprachverbesserungssystem werden dargelegt und messtechnisch verifiziert. Bestehende Messverfahren zum Verhalten von Freisprechsystemen in zeitvarianten Umgebungen zeigten bisher oft nur ein unzureichendes Maß an Flexibilität, Reproduzierbarkeit und Genauigkeit. Daher wird hier das „Car in a Box“-Verfahren (CiaB) entwickelt und vorgestellt, mit dem zeitvariante elektro-akustische Systeme technisch identifiziert werden können. So gewonnene dynamische Impulsantworten können im Labor in einer Syntheseoperation auf beliebige Eingangsignale angewandt werden, um realistische Testsignale unter dyn. Bedingungen zu erzeugen. Bei diesem Vorgehen wird ein hohes Maß an Flexibilität bei garantierter Reproduzierbarkeit erlangt. Es wird gezeigt, dass die Genauigkeit von darauf basierenden Evaluationsverfahren zudem gesteigert werden kann, da mit dem Vorliegen von exakten, realen Impulsantworten zu jedem Zeitpunkt der Messung eine sogenannte „ground truth“ als Referenz zur Verfügung steht. Bei der Einbindung von CiaB in einen Messaufbau für automotive Freisprechsysteme ist es bedeutsam, dass zu diesem Zeitpunkt das eigentliche Fahrzeug nicht mehr benötigt wird. Es wird gezeigt, dass eine dyn. Fahrzeugakustikumgebung, wie sie im Entwicklungsprozess von automotiven Sprachverbesserungsalgorithmen benötigt wird, in beliebiger Anzahl vollständig und mind. gleichwertig durch CiaB ersetzt werden kann

    Time, Space, Memory: A portfolio of acousmatic compositions

    Get PDF
    This portfolio comprises of a collection of acousmatic works which investigate the role of source bonding in music – the tendency of listeners to relate sounds to their real-world sources and the signifying implication of such a link – with a particular focus on how spatial design can contribute towards source-bonding in the music’s perception as a holistic spatio-sonic entity. A number of compositional strategies, multichannel formats and spatial audio technologies are investigated, with their merits assessed based on their suitability for shaping the qualities of musical space explored. The discussion in this commentary will show how these holistic spaces can have similar qualities of perceived ‘reality’ and ‘abstraction’ to the individual sounds, and how this is investigated in the musical works. I shall also show how the contrasting environmental qualities of these spaces became a source of of inspiration for structuring the development of my music, and how they might evoke subsequent meaning in their experience based on the listener’s understanding of the spatial source bonds

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019

    Get PDF
    International audienc

    Improving television sound for people with hearing impairments

    Get PDF
    This thesis investigates how developments in audio for digital television can be utilised to improve the experience of hearing impaired people when watching television. The work has had significant impact on international digital TV broadcast standards; it led to the formation of the UK Clean Audio Forum whose recommendations based on the research have been included in ETSI international standards for digital television, adopted into ITU standards for IPTV and also into EBU and NorDig digital television receiver specifications. In this thesis listening tests are implemented to assess the impact of various processes with a phantom centre channel and with a centre loudspeaker. The impact of non-speech channel attenuation and dynamic range control on speech clarity, sound quality and enjoyment of audio-visual media are investigated for both hearing impaired and non-hearing impaired people. For the first time the impact of acoustical crosstalk in two channel stereo reproduction on intelligibility of speech is quantified using both subjective intelligibility assessments and acoustic measurement techniques with intelligibility benefits of 5.9% found by utilising a centre loudspeaker instead of a phantom centre. A novel implementation of principal component analysis as a pre- broadcast production tool for labelling AV media compatible with a clean audio mix is identified, and two research implementations of accessible audio are documented including an object based implementation of clean audio for live broadcast that has been developed and publicly demonstrated

    Audio for Virtual, Augmented and Mixed Realities: Proceedings of ICSA 2019 ; 5th International Conference on Spatial Audio ; September 26th to 28th, 2019, Ilmenau, Germany

    Get PDF
    The ICSA 2019 focuses on a multidisciplinary bringing together of developers, scientists, users, and content creators of and for spatial audio systems and services. A special focus is on audio for so-called virtual, augmented, and mixed realities. The fields of ICSA 2019 are: - Development and scientific investigation of technical systems and services for spatial audio recording, processing and reproduction / - Creation of content for reproduction via spatial audio systems and services / - Use and application of spatial audio systems and content presentation services / - Media impact of content and spatial audio systems and services from the point of view of media science. The ICSA 2019 is organized by VDT and TU Ilmenau with support of Fraunhofer Institute for Digital Media Technology IDMT
    • …
    corecore