1,171 research outputs found

    Efficient Algorithms for Immersive Audio Rendering Enhancement

    Get PDF
Immersive audio rendering is the process of creating an engaging and realistic sound experience in 3D space. In immersive audio systems, the head-related transfer functions (HRTFs) are used for binaural synthesis over headphones since they express how humans localize a sound source. HRTF interpolation algorithms can be introduced for reducing the number of measurement points and creating a reliable sound movement. Binaural reproduction can also be performed by loudspeakers. However, the involvement of two or more loudspeakers causes the problem of crosstalk. In this case, crosstalk cancellation (CTC) algorithms are needed to delete unwanted interference signals. In this thesis, starting from a comparative analysis of HRTF measurement techniques, a binaural rendering system based on HRTF interpolation is proposed and evaluated for real-time applications. The proposed method shows good performance in comparison with a reference technique. The interpolation algorithm is also applied for immersive audio rendering over loudspeakers, by adding a fixed crosstalk cancellation algorithm, which assumes that the listener is in a fixed position. In addition, an adaptive crosstalk cancellation system, which includes the tracking of the listener's head, is analyzed and a real-time implementation is presented. The adaptive CTC implements a subband structure, and experimental results prove that a higher number of bands improves the performance in terms of total error and convergence rate. The reproduction system and the characteristics of the listening room may affect the performance due to their non-ideal frequency response. 
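HRTF interpolation between sparse measurement points can be sketched, in its simplest time-domain form, as angle-weighted averaging of two measured head-related impulse responses (a hedged toy: the data and function name are illustrative, and practical systems interpolate over full spatial grids, often in the frequency domain):

```python
import numpy as np

def interpolate_hrir(hrir_a, hrir_b, angle_a, angle_b, target_angle):
    """Linearly interpolate between two measured HRIRs by azimuth.

    A minimal sketch: the weight is the normalized angular distance
    to the target position between the two measured positions.
    """
    w = (target_angle - angle_a) / (angle_b - angle_a)
    return (1.0 - w) * hrir_a + w * hrir_b

# toy HRIRs "measured" at 30 and 40 degrees azimuth
hrir_30 = np.array([0.0, 1.0, 0.5, 0.1])
hrir_40 = np.array([0.0, 0.8, 0.6, 0.2])
hrir_35 = interpolate_hrir(hrir_30, hrir_40, 30.0, 40.0, 35.0)
print(hrir_35)  # midway between the two measured responses
```

Rendering a moving source then amounts to re-evaluating the interpolation as the target angle changes over time.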
Audio equalization is used to adjust the balance of different audio frequencies in order to achieve desired sound characteristics. Equalization can be manual, as in the case of graphic equalization, where the gain of each frequency band can be modified by the user, or automatic, where the equalization curve is calculated automatically after the room impulse response measurement. Room response equalization can also be applied to multichannel systems, which employ two or more loudspeakers, and the equalization zone can be enlarged by measuring the impulse responses at different points of the listening zone. In this thesis, efficient graphic equalizers (GEQs) and an adaptive room response equalization system are presented. In particular, three low-complexity linear- and quasi-linear-phase graphic equalizers are proposed and examined in depth. Experiments confirm the effectiveness of the proposed GEQs in terms of accuracy, computational complexity, and latency. Subsequently, a subband adaptive structure is introduced for the development of a multichannel, multiple-position room response equalizer. Experimental results verify the effectiveness of the subband approach in comparison with the single-band case. Finally, a linear-phase crossover network is presented for multichannel systems, showing excellent results in terms of magnitude flatness, cutoff rates, polar diagram, and phase response. Active noise control (ANC) systems can be designed to reduce the effects of noise pollution and can be used simultaneously with an immersive audio system. ANC works by creating a sound wave with a phase opposite to that of the unwanted noise. The additional sound wave creates destructive interference, which reduces the overall sound level. Finally, this thesis presents an ANC system used for noise reduction. 
The proposed approach implements online secondary path estimation and is based on cross-update adaptive filters applied to the primary path estimation, which aim at improving the performance of the whole system. The proposed structure achieves a better convergence rate than the reference algorithm.
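The destructive-interference principle behind ANC can be illustrated with a minimal single-channel LMS sketch (a hedged toy: it ignores the secondary path, which the thesis estimates online, and uses a pure-tone noise with an idealized primary path; all constants are illustrative):

```python
import numpy as np

fs, n_samples, L = 8000, 4000, 16
n = np.arange(n_samples)
x = np.sin(2 * np.pi * 200 * n / fs)   # reference pickup of the tonal noise
d = 0.9 * np.roll(x, 3)                # noise at the error microphone (toy primary path)

w = np.zeros(L)                        # adaptive control filter
mu = 0.01
buf = np.zeros(L)
err = np.zeros(n_samples)
for i in range(n_samples):
    buf = np.roll(buf, 1)
    buf[0] = x[i]                      # newest reference sample
    y = w @ buf                        # anti-noise (phase-inverted estimate of d)
    e = d[i] - y                       # residual after destructive interference
    w += mu * e * buf                  # LMS update driven by the residual
    err[i] = e
```

The residual power collapses once the filter learns the primary path; a real ANC system would additionally filter the reference through a secondary-path estimate (FxLMS) before the update.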

    Adaptive Filtered-x Algorithms for Room Equalization Based on Block-Based Combination Schemes

    Full text link
(c) 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.

[EN] Room equalization has become essential for sound reproduction systems to provide the listener with the desired acoustical sensation. Recently, adaptive filters have been proposed as an effective tool at the core of these systems. In this context, this paper introduces different novel schemes based on the combination-of-adaptive-filters idea: a versatile and flexible approach that permits building adaptive schemes that combine the capabilities of several independent adaptive filters. In this way, we have investigated the advantages of a scheme called combination of block-based adaptive filters, which allows a blockwise combination by splitting the adaptive filters into nonoverlapping blocks. This idea was previously applied to the plant identification problem, but it has to be properly modified to obtain suitable behavior in the equalization application. Moreover, we propose a scheme that aims to further improve the equalization performance using a priori knowledge of the energy distribution of the optimal inverse filter, where the block filters are chosen to fit the coefficients' energy distribution. Furthermore, the biased block-based filter is also introduced as a particular case of the combination scheme, especially suited for low signal-to-noise ratios (SNRs) or sparse scenarios. 
Although the combined schemes can be employed with any kind of adaptive filter, we employ the filtered-x improved proportionate normalized least mean square algorithm as the basis of the proposed algorithms, which allows us to introduce a novel combination scheme based on partitioned block schemes where different blocks of the adaptive filter use different parameter settings. Several experiments are included to evaluate the proposed algorithms in terms of convergence speed and steady-state behavior for different degrees of sparseness and SNRs.

The work of L. A. Azpicueta-Ruiz was supported in part by the Comunidad de Madrid through CASI-CAM-CM under Grant S2013/ICE-2845, in part by the Spanish Ministry of Economy and Competitiveness through DAMA under Grant TIN2015-70308-REDT and Grant TEC2014-52289-R, and in part by the European Union. The work of L. Fuster, M. Ferrer, and M. de Diego was supported in part by the EU together with the Spanish Government under Grant TEC2015-67387-C4-1-R (MINECO/FEDER), and in part by the Generalitat Valenciana under Grant PROMETEOII/2014/003. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Simon Doclo.

Fuster Criado, L.; Diego Antón, MD.; Azpicueta-Ruiz, LA.; Ferrer Contreras, M. (2016). Adaptive Filtered-x Algorithms for Room Equalization Based on Block-Based Combination Schemes. IEEE/ACM Transactions on Audio, Speech and Language Processing. 24(10):1732-1745. https://doi.org/10.1109/TASLP.2016.2583065
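The combination-of-adaptive-filters idea can be sketched on a toy plant-identification problem: two NLMS filters with different step sizes run in parallel, and a sigmoid-mapped mixing parameter is adapted to weight their outputs (a standard convex-combination sketch, not the paper's exact block-based filtered-x variant; all constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 8, 5000
h = rng.standard_normal(L)                 # unknown plant to identify
x = rng.standard_normal(N)
mu1, mu2, mu_a = 0.5, 0.05, 10.0           # fast NLMS, slow NLMS, mixing step

w1 = np.zeros(L); w2 = np.zeros(L); a = 0.0
e_comb = np.zeros(N)
for n in range(L, N):
    u = x[n - L:n][::-1]                   # regressor (newest sample first)
    d = h @ u + 1e-3 * rng.standard_normal()
    y1, y2 = w1 @ u, w2 @ u
    lam = 1.0 / (1.0 + np.exp(-a))         # mixing weight in (0, 1)
    y = lam * y1 + (1 - lam) * y2          # combined output
    e1, e2, e = d - y1, d - y2, d - y
    norm = u @ u + 1e-8
    w1 += mu1 * e1 * u / norm              # independent NLMS updates
    w2 += mu2 * e2 * u / norm
    a += mu_a * e * (y1 - y2) * lam * (1 - lam)   # gradient step on the mixture
    a = np.clip(a, -4.0, 4.0)              # keep the sigmoid out of saturation
    e_comb[n] = e
```

The blockwise variant studied in the paper applies this same mixing rule per block of coefficients rather than to the whole filter.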

    System Identification with Applications in Speech Enhancement

    No full text
With the increasing popularity of hands-free telephony on portable mobile devices and the rapid development of voice over Internet protocol, the identification of acoustic systems has become desirable for compensating distortions introduced to speech signals during transmission, and hence enhancing speech quality. The objective of this research is to develop system identification algorithms for speech enhancement applications, including network echo cancellation and speech dereverberation. A supervised adaptive algorithm for sparse system identification is developed for network echo cancellation. Based on the framework of the selective-tap updating scheme for the normalized least mean squares (NLMS) algorithm, the MMax and sparse partial-update tap-selection strategies are exploited in the frequency domain to achieve fast convergence with low computational complexity. By demonstrating how the sparseness of the network impulse response varies in the transformed domain, the multidelay filtering structure is incorporated to reduce the algorithmic delay. Blind identification of single-input multiple-output (SIMO) acoustic systems for speech dereverberation in the presence of common zeros is then investigated. First, the problem of common zeros is defined and extended to include the presence of near-common zeros. Two clustering algorithms are developed to quantify the number of these zeros so as to facilitate the study of their effect on blind system identification and speech dereverberation. To mitigate this effect, two algorithms are developed: the two-stage algorithm based on channel decomposition identifies common and non-common zeros sequentially, and the forced spectral diversity approach combines spectral shaping filters and channel undermodelling to derive a modified system that leads to improved dereverberation performance. 
Additionally, a solution to the scale-factor ambiguity problem in subband-based blind system identification is developed, which motivates further research on subband-based dereverberation techniques. Comprehensive simulations and discussions demonstrate the effectiveness of the aforementioned algorithms. A discussion on possible directions of prospective research on system identification techniques concludes this thesis.
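The MMax tap-selection strategy mentioned above can be sketched in the time domain (the thesis applies it in the frequency domain with a multidelay structure; this toy NLMS version only shows the selection rule, and all sizes are illustrative): at each iteration only the M taps whose input samples have the largest magnitude are updated.

```python
import numpy as np

rng = np.random.default_rng(2)
L, M, N = 16, 4, 8000                      # filter length, taps updated per step, samples
h = rng.standard_normal(L)                 # unknown echo path
x = rng.standard_normal(N)

w = np.zeros(L)
mu = 0.5
err = np.zeros(N)
for n in range(L, N):
    u = x[n - L:n][::-1]                   # regressor (newest sample first)
    d = h @ u                              # echo (noise-free toy)
    e = d - w @ u
    sel = np.argsort(np.abs(u))[-M:]       # MMax: pick the M largest-magnitude inputs
    w[sel] += mu * e * u[sel] / (u @ u + 1e-8)   # NLMS update on selected taps only
    err[n] = e
```

Updating M of L taps cuts the per-sample update cost roughly by L/M at the price of slower convergence, which is the trade-off the frequency-domain variant exploits.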

    Distributed and Collaborative Processing of Audio Signals: Algorithms, Tools and Applications

    Full text link
Tesis por compendio. [EN] This thesis fits into the field of Information and Communications Technology (ICT), especially the area of digital signal processing. Nowadays, due to the rise of the Internet of Things (IoT), there is a growing interest in wireless sensor networks (WSN), that is, networks composed of different types of devices specifically distributed over some area to perform different signal processing tasks. These devices, also referred to as nodes, are usually equipped with electroacoustic transducers as well as powerful and efficient processors with communication capability. In the particular case of acoustic sensor networks (ASN), nodes are dedicated to solving different acoustic signal processing tasks. These audio signal processing applications have undergone major development in recent years, due in part to the advances made in computer hardware and software. The development of powerful centralized processing systems has allowed the number of audio channels to be increased, the control area to be extended, or more complex algorithms to be implemented. In most cases, a distributed ASN topology can be desirable due to several factors, such as the limited number of channels used by the sound acquisition and reproduction devices, the convenience of a scalable system, or the high computational demands of a centralized approach. All these aspects may lead to the use of novel distributed signal processing techniques designed to be applied over ASNs. 
To this end, one of the main contributions of this dissertation is the development of adaptive filtering algorithms for multichannel sound systems over distributed networks. Note that, for sound field control (SFC) applications, such as active noise control (ANC) or active noise equalization (ANE), acoustic nodes must be equipped not only with sensors but also with actuators in order to control and modify the sound field. However, most of the adaptive distributed network approaches used to solve sound field control problems do not take into account that the nodes may interfere with or modify the behavior of the rest. Therefore, another important contribution of this thesis focuses on analyzing how the acoustic system affects the behavior of the nodes within an ASN. In cases where the acoustic environment adversely affects the system stability, several distributed strategies have been proposed for solving the acoustic interference problem with the aim of stabilizing ANC systems. These strategies are based on both collaborative and non-collaborative approaches. Implementation aspects such as hardware constraints, sensor locations, convergence rate, and computational and communication burden have also been considered in the design of the distributed algorithms. Moreover, with the aim of creating independent-zone equalization profiles in the presence of multi-tonal noises, distributed narrowband and broadband ANE algorithms over an ASN with collaborative learning and composed of acoustic nodes have been presented. Experimental results are presented to validate the use of the distributed algorithms proposed in the work for practical applications. For this purpose, an acoustic simulation software tool has been specifically designed to analyze the performance of the developed algorithms. Finally, the performance of the proposed distributed algorithms for multichannel SFC applications has been evaluated by means of a real practical implementation. 
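The collaborative learning of acoustic nodes can be illustrated with a generic adapt-then-combine diffusion scheme, a standard distributed strategy (this is not the thesis's ANC formulation: the acoustic coupling between nodes is omitted, and the network size, step size, and combine weights are illustrative). Each node adapts on its own local data, then averages its estimate with its neighbours':

```python
import numpy as np

rng = np.random.default_rng(3)
L, N, K = 4, 4000, 3                       # taps, iterations, acoustic nodes
h = rng.standard_normal(L)                 # common response all nodes estimate
X = rng.standard_normal((K, N))            # each node's own reference signal
W = np.zeros((K, L))                       # one local estimate per node
A = np.full((K, K), 1.0 / K)               # combine weights (fully connected, uniform)
mu = 0.05

for n in range(L, N):
    # adapt: every node runs a local NLMS step on its own data
    for k in range(K):
        u = X[k, n - L:n][::-1]
        e = h @ u - W[k] @ u
        W[k] += mu * e * u / (u @ u + 1e-8)
    # combine: every node averages the estimates of its neighbours
    W = A @ W
```

With only local exchanges, all nodes agree on the common estimate, which is what makes such schemes attractive over a centralized processor.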
To this end, a real-time prototype that controls both ANC and ANE applications by using collaborative acoustic nodes has been developed. The prototype consists of two personal audio control (PAC) systems composed of a car seat and an acoustic node, which is equipped with two loudspeakers, two microphones, and a processor with communication capability. In this way, it is possible to create two independent noise control zones, improving the acoustic comfort of the user without the use of headphones.

Antoñanzas Manuel, C. (2019). Distributed and Collaborative Processing of Audio Signals: Algorithms, Tools and Applications [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/130209

    In Car Audio

    Get PDF
This chapter presents implementations of advanced in-car audio applications. The system comprises three main applications addressing the in-car listening and communication experience. Starting from a high-level description of the algorithms, several implementations at different levels of hardware abstraction are presented, along with empirical results on both the design process undertaken and the performance achieved.

    PERFORMANCE IMPROVEMENT OF MULTICHANNEL AUDIO BY GRAPHICS PROCESSING UNITS

    Full text link
Multichannel acoustic signal processing has undergone major development in recent years due to the increased complexity of current audio processing applications. People want to collaborate through communication with the feeling of being together and sharing the same environment, which is the goal of immersive audio schemes. Several acoustic effects are involved: 3D spatial sound, room compensation, crosstalk cancellation, and sound source localization, among others. However, high computing capacity is required to achieve any of these effects in a real large-scale system, which represents a considerable limitation for real-time applications. The increase in computational capacity has historically been linked to the number of transistors on a chip. Nowadays, however, improvements in computational capacity come mainly from increasing the number of processing units, i.e., expanding parallelism in computing. This is the case of graphics processing units (GPUs), which now contain thousands of computing cores. GPUs were traditionally associated with graphics or image applications, but new releases of the GPU programming environments CUDA and OpenCL have allowed most applications to be computationally accelerated in fields beyond graphics. This thesis aims to demonstrate that GPUs are fully valid tools for audio applications that require high computational resources. To this end, different applications in the field of audio processing are studied and implemented using GPUs. This manuscript also analyzes and solves possible limitations of each GPU-based implementation, both from the acoustic and the computational point of view. In this document, we have addressed the following problems. Most audio applications are based on massive filtering; thus, the first implementation to undertake is a fundamental operation in audio processing: the convolution. 
It was first developed as a computational kernel and afterwards used for an application that combines multiple convolutions concurrently: generalized crosstalk cancellation and equalization. The proposed implementation successfully manages two different and common situations: buffer sizes much larger than the filter size, and buffer sizes much smaller than the filter size. Two spatial audio applications that use the GPU as a co-processor have been developed from massive multichannel filtering. The first application deals with binaural audio. Its main feature is that it can synthesize sound sources at spatial positions not included in the HRTF database and generate smooth movements of sound sources. Both features were designed after different tests (objective and subjective). The performance, in terms of the number of sound sources that can be rendered in real time, was assessed on GPUs with different architectures. A similar performance is measured in a Wave Field Synthesis system (the second spatial audio application) composed of 96 loudspeakers. The proposed GPU-based implementation is able to reduce the room effects during sound source rendering. A well-known approach for sound source localization in noisy and reverberant environments, the Steered Response Power with Phase Transform (SRP-PHAT) algorithm, is also addressed on a multi-GPU system. Since localization accuracy can be improved by using high-resolution spatial grids and a large number of microphones, accurate acoustic localization systems require high computational power. The solutions implemented in this thesis are evaluated from both the localization and the computational performance points of view, taking into account different acoustic environments, and always from a real-time implementation perspective. 
Finally, this manuscript also addresses massive multichannel filtering when the filters present an infinite impulse response (IIR). Two cases are analyzed: 1) IIR filters composed of multiple second-order sections, and 2) IIR filters that present an allpass response. Both cases are used to develop and accelerate two different applications: 1) executing multiple equalizations in a WFS system, and 2) reducing the dynamic range of an audio signal.

Belloch Rodríguez, JA. (2014). PERFORMANCE IMPROVEMENT OF MULTICHANNEL AUDIO BY GRAPHICS PROCESSING UNITS [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/40651
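Filtering with IIR filters made of multiple second-order sections, as in the first case above, reduces to running a cascade of biquads; a plain NumPy sketch of that cascade (direct form II transposed, using SciPy's `sos` row convention `[b0, b1, b2, a0, a1, a2]` with `a0 = 1`; names are illustrative, and a GPU version would parallelize this across channels):

```python
import numpy as np

def sosfilt_cascade(sos, x):
    """Run a signal through a cascade of biquad sections.

    Each section is evaluated in direct form II transposed with two
    state variables; sections are applied one after another.
    """
    y = np.asarray(x, dtype=float).copy()
    for b0, b1, b2, a0, a1, a2 in sos:     # a0 is assumed to be 1
        z1 = z2 = 0.0
        out = np.empty_like(y)
        for n, xn in enumerate(y):
            yn = b0 * xn + z1
            z1 = b1 * xn - a1 * yn + z2
            z2 = b2 * xn - a2 * yn
            out[n] = yn
        y = out
    return y

# impulse response of y[n] = x[n] + 0.5 * y[n-1]: geometric decay 1, 0.5, 0.25, ...
imp = np.zeros(6); imp[0] = 1.0
print(sosfilt_cascade([[1.0, 0.0, 0.0, 1.0, -0.5, 0.0]], imp))
```

Because the sections only share state internally, many such cascades (one per channel in a WFS system) can run independently, which is what makes them a good fit for GPU execution.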

    Real-time massive convolution for audio applications on GPU

    Full text link
[EN] Massive convolution is the basic operation in multichannel acoustic signal processing. This field has experienced major development in recent years. One reason for this has been the increase in the number of sound sources used in playback applications available to users. Another reason is the growing need to incorporate new effects and to improve the hearing experience. Massive convolution requires high computing capacity. GPUs offer the possibility of parallelizing these operations. This allows us to obtain the processing result in much shorter time and to free up CPU resources. One important aspect lies in the possibility of overlapping the transfer of data from CPU to GPU and vice versa with the computation, in order to carry out real-time applications. Thus, a synthesis of 3D sound scenes could be achieved with only a peer-to-peer music streaming environment using a simple GPU in your computer, while the CPU is used for other tasks. Nowadays, these effects are obtained in theaters or funfairs at a very high cost, requiring a large quantity of resources. Thus, our work focuses on two main points: describing an efficient massive convolution implementation and incorporating this task into real-time multichannel-sound applications. © 2011 Springer Science+Business Media, LLC.

This work was partially supported by the Spanish Ministerio de Ciencia e Innovación (Projects TIN2008-06570-C04-02 and TEC2009-13741), Universidad Politécnica de Valencia through PAID-05-09 and Generalitat Valenciana through project PROMETEO/2009/2013.

Belloch Rodríguez, JA.; Gonzalez, A.; Martínez Zaldívar, FJ.; Vidal Maciá, AM. (2011). Real-time massive convolution for audio applications on GPU. Journal of Supercomputing. 58(3):449-457. https://doi.org/10.1007/s11227-011-0610-8
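The block-wise real-time processing described above is typically realized with overlap-save FFT convolution; a CPU-side NumPy sketch of the idea (the function and its parameters are illustrative, `Nfft` must be even and larger than the filter length, and a GPU version would map the per-block FFT, product and inverse FFT onto the device while overlapping host-device transfers):

```python
import numpy as np

def overlap_save(x, h, Nfft=256):
    """Block convolution of x with filter h via overlap-save.

    Each FFT frame of length Nfft yields Nfft - len(h) + 1 valid
    output samples; the first len(h) - 1 samples of every inverse
    FFT are circularly wrapped and discarded.
    """
    L = len(h)
    hop = Nfft - L + 1                         # valid output samples per block
    H = np.fft.rfft(h, Nfft)                   # filter spectrum, computed once
    xp = np.concatenate([np.zeros(L - 1), x, np.zeros(Nfft)])
    out = []
    for start in range(0, len(x) + L - 1, hop):
        seg = xp[start:start + Nfft]
        y = np.fft.irfft(np.fft.rfft(seg) * H)
        out.append(y[L - 1:])                  # keep only the valid samples
    return np.concatenate(out)[:len(x) + L - 1]
```

The output matches direct linear convolution, but the cost per sample drops from O(L) to O(log Nfft), which is what makes hundreds of simultaneous channels feasible.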