15 research outputs found

    On the applicability of models for outdoor sound (A)

    Get PDF

    PERFORMANCE IMPROVEMENT OF MULTICHANNEL AUDIO BY GRAPHICS PROCESSING UNITS

    Full text link
    Multichannel acoustic signal processing has undergone major development in recent years due to the increased complexity of current audio processing applications. People want to collaborate through communication with the feeling of being together and sharing the same environment, what is considered as Immersive Audio Schemes. In this phenomenon, several acoustic e ects are involved: 3D spatial sound, room compensation, crosstalk cancelation, sound source localization, among others. However, high computing capacity is required to achieve any of these e ects in a real large-scale system, what represents a considerable limitation for real-time applications. The increase of the computational capacity has been historically linked to the number of transistors in a chip. However, nowadays the improvements in the computational capacity are mainly given by increasing the number of processing units, i.e expanding parallelism in computing. This is the case of the Graphics Processing Units (GPUs), that own now thousands of computing cores. GPUs were traditionally related to graphic or image applications, but new releases in the GPU programming environments, CUDA or OpenCL, allowed that most applications were computationally accelerated in elds beyond graphics. This thesis aims to demonstrate that GPUs are totally valid tools to carry out audio applications that require high computational resources. To this end, di erent applications in the eld of audio processing are studied and performed using GPUs. This manuscript also analyzes and solves possible limitations in each GPU-based implementation both from the acoustic point of view as from the computational point of view. In this document, we have addressed the following problems: Most of audio applications are based on massive ltering. Thus, the rst implementation to undertake is a fundamental operation in the audio processing: the convolution. It has been rst developed as a computational kernel and afterwards used for an application that combines multiples convolutions concurrently: generalized crosstalk cancellation and equalization. The proposed implementation can successfully manage two di erent and common situations: size of bu ers that are much larger than the size of the lters and size of bu ers that are much smaller than the size of the lters. Two spatial audio applications that use the GPU as a co-processor have been developed from the massive multichannel ltering. First application deals with binaural audio. Its main feature is that this application is able to synthesize sound sources in spatial positions that are not included in the database of HRTF and to generate smoothly movements of sound sources. Both features were designed after di erent tests (objective and subjective). The performance regarding number of sound source that could be rendered in real time was assessed on GPUs with di erent GPU architectures. A similar performance is measured in a Wave Field Synthesis system (second spatial audio application) that is composed of 96 loudspeakers. The proposed GPU-based implementation is able to reduce the room e ects during the sound source rendering. A well-known approach for sound source localization in noisy and reverberant environments is also addressed on a multi-GPU system. This is the case of the Steered Response Power with Phase Transform (SRPPHAT) algorithm. Since localization accuracy can be improved by using high-resolution spatial grids and a high number of microphones, accurate acoustic localization systems require high computational power. The solutions implemented in this thesis are evaluated both from localization and from computational performance points of view, taking into account different acoustic environments, and always from a real-time implementation perspective. Finally, This manuscript addresses also massive multichannel ltering when the lters present an In nite Impulse Response (IIR). Two cases are analyzed in this manuscript: 1) IIR lters composed of multiple secondorder sections, and 2) IIR lters that presents an allpass response. Both cases are used to develop and accelerate two di erent applications: 1) to execute multiple Equalizations in a WFS system, and 2) to reduce the dynamic range in an audio signal.Belloch RodrĂ­guez, JA. (2014). PERFORMANCE IMPROVEMENT OF MULTICHANNEL AUDIO BY GRAPHICS PROCESSING UNITS [Tesis doctoral]. Universitat PolitĂšcnica de ValĂšncia. https://doi.org/10.4995/Thesis/10251/40651TESISPremios Extraordinarios de tesis doctorale

    Effizientes binaurales Rendering von virtuellen akustischen RealitÀten : technische und wahrnehmungsbezogene Konzepte

    Get PDF
    Binaural rendering aims to immerse the listener in a virtual acoustic scene, making it an essential method for spatial audio reproduction in virtual or augmented reality (VR/AR) applications. The growing interest and research in VR/AR solutions yielded many different methods for the binaural rendering of virtual acoustic realities, yet all of them share the fundamental idea that the auditory experience of any sound field can be reproduced by reconstructing its sound pressure at the listener's eardrums. This thesis addresses various state-of-the-art methods for 3 or 6 degrees of freedom (DoF) binaural rendering, technical approaches applied in the context of headphone-based virtual acoustic realities, and recent technical and psychoacoustic research questions in the field of binaural technology. The publications collected in this dissertation focus on technical or perceptual concepts and methods for efficient binaural rendering, which has become increasingly important in research and development due to the rising popularity of mobile consumer VR/AR devices and applications. The thesis is organized into five research topics: Head-Related Transfer Function Processing and Interpolation, Parametric Spatial Audio, Auditory Distance Perception of Nearby Sound Sources, Binaural Rendering of Spherical Microphone Array Data, and Voice Directivity. The results of the studies included in this dissertation extend the current state of research in the respective research topic, answer specific psychoacoustic research questions and thereby yield a better understanding of basic spatial hearing processes, and provide concepts, methods, and design parameters for the future implementation of technically and perceptually efficient binaural rendering.Binaurales Rendering zielt darauf ab, dass der Hörer in eine virtuelle akustische Szene eintaucht, und ist somit eine wesentliche Methode fĂŒr die rĂ€umliche Audiowiedergabe in Anwendungen der virtuellen RealitĂ€t (VR) oder der erweiterten RealitĂ€t (AR – aus dem Englischen Augmented Reality). Das wachsende Interesse und die zunehmende Forschung an VR/AR-Lösungen fĂŒhrte zu vielen verschiedenen Methoden fĂŒr das binaurale Rendering virtueller akustischer RealitĂ€ten, die jedoch alle die grundlegende Idee teilen, dass das Hörerlebnis eines beliebigen Schallfeldes durch die Rekonstruktion seines Schalldrucks am Trommelfell des Hörers reproduziert werden kann. Diese Arbeit befasst sich mit verschiedenen modernsten Methoden zur binauralen Wiedergabe mit 3 oder 6 Freiheitsgraden (DoF – aus dem Englischen Degree of Freedom), mit technischen AnsĂ€tzen, die im Kontext kopfhörerbasierter virtueller akustischer RealitĂ€ten angewandt werden, und mit aktuellen technischen und psychoakustischen Forschungsfragen auf dem Gebiet der Binauraltechnik. Die in dieser Dissertation gesammelten Publikationen befassen sich mit technischen oder wahrnehmungsbezogenen Konzepten und Methoden fĂŒr effizientes binaurales Rendering, was in der Forschung und Entwicklung aufgrund der zunehmenden Beliebtheit von mobilen Verbraucher-VR/AR-GerĂ€ten und -Anwendungen zunehmend an Relevanz gewonnen hat. Die Arbeit ist in fĂŒnf Forschungsthemen gegliedert: Verarbeitung und Interpolation von AußenohrĂŒbertragungsfunktionen, parametrisches rĂ€umliches Audio, auditive Entfernungswahrnehmung ohrnaher Schallquellen, binaurales Rendering von sphĂ€rischen Mikrofonarraydaten und Richtcharakteristik der Stimme. Die Ergebnisse der in dieser Dissertation enthaltenen Studien erweitern den aktuellen Forschungsstand im jeweiligen Forschungsfeld, beantworten spezifische psychoakustische Forschungsfragen und fĂŒhren damit zu einem besseren VerstĂ€ndnis grundlegender rĂ€umlicher Hörprozesse, und liefern Konzepte, Methoden und Gestaltungsparameter fĂŒr die zukĂŒnftige Umsetzung eines technisch und wahrnehmungsbezogen effizienten binauralen Renderings.BMBF, 03FH014IX5, NatĂŒrliche raumbezogene Darbietung selbsterzeugter Schallereignisse in virtuellen auditiven Umgebungen (NarDasS

    Fine‐structure processing, frequency selectivity and speech perception in hearing‐impaired listeners

    Get PDF
    corecore