81 research outputs found

    Efficient Algorithms for Immersive Audio Rendering Enhancement

    Immersive audio rendering is the process of creating an engaging and realistic sound experience in 3D space. In immersive audio systems, head-related transfer functions (HRTFs) are used for binaural synthesis over headphones, since they express how humans localize a sound source. HRTF interpolation algorithms can be introduced to reduce the number of measurement points and to create reliable sound movement. Binaural reproduction can also be performed over loudspeakers. However, using two or more loudspeakers causes crosstalk. In this case, crosstalk cancellation (CTC) algorithms are needed to remove the unwanted interference signals. In this thesis, starting from a comparative analysis of HRTF measurement techniques, a binaural rendering system based on HRTF interpolation is proposed and evaluated for real-time applications. The proposed method shows good performance in comparison with a reference technique. The interpolation algorithm is also applied to immersive audio rendering over loudspeakers by adding a fixed crosstalk cancellation algorithm, which assumes that the listener is in a fixed position. In addition, an adaptive crosstalk cancellation system, which includes tracking of the listener's head, is analyzed and a real-time implementation is presented. The adaptive CTC implements a subband structure, and experimental results show that a higher number of bands improves performance in terms of total error and convergence rate. The reproduction system and the characteristics of the listening room may affect performance because of their non-ideal frequency response. 
Audio equalization is used to adjust the balance of different audio frequencies in order to achieve the desired sound characteristics. The equalization can be manual, as in graphic equalization, where the gain of each frequency band can be modified by the user, or automatic, where the equalization curve is calculated automatically after the room impulse response has been measured. Room response equalization can also be applied to multichannel systems, which employ two or more loudspeakers, and the equalization zone can be enlarged by measuring the impulse responses at different points of the listening zone. In this thesis, efficient graphic equalizers (GEQs) and an adaptive room response equalization system are presented. In particular, three low-complexity linear- and quasi-linear-phase graphic equalizers are proposed and examined in depth. Experiments confirm the effectiveness of the proposed GEQs in terms of accuracy, computational complexity, and latency. Subsequently, a subband adaptive structure is introduced for the development of a multichannel, multiple-position room response equalizer. Experimental results verify the effectiveness of the subband approach in comparison with the single-band case. Finally, a linear-phase crossover network is presented for multichannel systems, showing very good results in terms of magnitude flatness, cutoff rates, polar diagram, and phase response. Active noise control (ANC) systems can be designed to reduce the effects of noise pollution and can be used simultaneously with an immersive audio system. ANC works by creating a sound wave with opposite phase to that of the unwanted noise. The additional sound wave creates destructive interference, which reduces the overall sound level. Finally, this thesis presents an ANC system used for noise reduction. 
The proposed approach implements an online secondary path estimation and is based on cross-update adaptive filters applied to the primary path estimation, which aim to improve the performance of the whole system. The proposed structure allows for a better convergence rate in comparison with a reference algorithm.
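
The destructive-interference principle behind ANC, as described in this abstract, can be illustrated with a toy calculation. The sketch below is not the thesis's system (which relies on adaptive filters and online secondary path estimation); it only assumes a unit-amplitude sinusoidal noise and an inverted anti-noise copy with a hypothetical phase error, and the function name `residual_rms` is illustrative.

```python
import math

def residual_rms(phase_error_rad, n=1000):
    """RMS of (noise + anti-noise) over one full period.

    The noise is sin(theta); the anti-noise is an inverted copy that is
    misaligned by phase_error_rad (an illustrative assumption).
    """
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * k / n
        noise = math.sin(theta)
        anti_noise = -math.sin(theta + phase_error_rad)
        total += (noise + anti_noise) ** 2
    return math.sqrt(total / n)

noise_rms = 1 / math.sqrt(2)  # RMS of the unit-amplitude noise alone
for phase_error_deg in (0, 5, 20, 60):
    r = residual_rms(math.radians(phase_error_deg))
    print(phase_error_deg, "deg ->", round(r / noise_rms, 4), "x original level")
```

With perfect phase opposition the residual vanishes, while the residual amplitude grows as 2|sin(φ/2)| with the phase error φ. At 60 degrees the "cancelled" signal is as loud as the original noise, which suggests why the convergence and accuracy of the adaptive estimation matter in practice.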

    Discrete multitone modulation with principal component filter banks

    Discrete multitone (DMT) modulation is an attractive method for communication over a nonflat channel with possibly colored noise. The uniform discrete Fourier transform (DFT) filter bank and the cosine-modulated filter bank have been used in this system in the past because of their low complexity. We show in this paper that principal component filter banks (PCFBs), which are known to be optimal for data compression and denoising applications, are also optimal for a number of criteria in DMT modulation communication. For example, the PCFB of the effective channel noise power spectrum (the noise psd weighted by the inverse of the channel gain) is optimal for DMT modulation in the sense of maximizing the bit rate for fixed power and error probabilities. We also establish an optimality property of the PCFB when scalar prefilters and postfilters are used around the channel. The difference between the PCFB and a traditional filter bank, such as the brickwall filter bank or the DFT filter bank, is significant for effective power spectra that depart considerably from monotonicity. The twisted-pair channel, with its bridged taps, NEXT and FEXT noise, and AM interference, therefore appears to be a good candidate for the application of a PCFB. This is demonstrated with the help of numerical results for the case of the ADSL channel.
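
The bit-rate criterion mentioned in this abstract can be made concrete with the standard SNR-gap approximation, in which a subchannel with power P and effective noise N_eff (noise psd divided by the channel gain) carries about log2(1 + P / (Γ · N_eff)) bits per symbol. The following is a minimal sketch under that assumption, not the paper's PCFB design: the function name `dmt_bits`, the equal power allocation, and the numeric values are all hypothetical.

```python
import math

def dmt_bits(effective_noise, power_per_subchannel, snr_gap=1.0):
    """Bits per symbol on each subchannel under the gap approximation.

    effective_noise: per-subchannel noise power divided by channel gain.
    snr_gap: linear SNR gap (1.0 means capacity; larger values reflect a
             target error probability with a practical constellation).
    """
    return [
        math.log2(1.0 + power_per_subchannel / (snr_gap * n))
        for n in effective_noise
    ]

# Hypothetical effective noise levels for three subchannels.
effective_noise = [1.0, 3.0, 0.2]
bits = dmt_bits(effective_noise, power_per_subchannel=3.0)
total_bits = sum(bits)
print(bits, total_bits)
```

Subchannels with low effective noise carry most of the bits, which is why the way a filter bank partitions a strongly non-monotone effective noise spectrum can change the achievable rate.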

    Development and applications of adaptive IIR and subband filters

    Adaptive infinite impulse response (IIR) filtering is a challenging research area. Identifiers and equalizers are among the most essential digital signal processing devices for digital communication systems. In this study, we consider an IIR channel for both system identification and channel equalization. We focus on four different approaches: Least Mean Square (LMS), Recursive Least Squares (RLS), Genetic Algorithm (GA), and Subband Adaptive Filter (SAF). The performance of conventional LMS- and RLS-based IIR system identification and channel equalization is evaluated with the help of computer simulations. The convergence speed and the ability to locate the globally optimal solution are also reported for a population-based algorithm, the Genetic Algorithm.
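
As a rough illustration of LMS-based system identification, the sketch below adapts a short FIR filter to match an unknown system from its input and output. It is a simplification and not the study's method: the abstract concerns IIR structures, whereas this example uses a FIR model, and the tap values, step size, and function name `lms_identify` are hypothetical.

```python
import random

def lms_identify(x, d, num_taps, mu):
    """Estimate FIR taps w so that the filter output tracks d (LMS update)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps           # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        e = dn - y                                   # estimation error
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
    return w

random.seed(0)
true_h = [0.5, -0.3, 0.1]            # hypothetical unknown system
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [
    sum(true_h[k] * x[n - k] for k in range(len(true_h)) if n - k >= 0)
    for n in range(len(x))
]
w = lms_identify(x, d, num_taps=3, mu=0.05)
print([round(wi, 4) for wi in w])
```

In this noise-free setting the estimated taps approach the true ones. A smaller step size `mu` slows convergence, while too large a value makes the update unstable.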

    Study on Air Interface Variants and their Harmonization for Beyond 5G Systems

    The standardization of the Fifth Generation of mobile networks, or 5G, is still ongoing, although the first releases of the standard were completed two years ago and 5G networks are up and running in several countries around the globe. However, in 2014, when the ITU began the IMT-2020 standardization process, one of the main questions was which waveform would be used in the physical layer of this new generation of technologies. The 3GPP committed to submitting a candidate technology to the IMT-2020 process, and several candidate waveforms were presented during this deliberation process. After a thorough evaluation of several aspects, in 2016 the 3GPP decided to continue with CP-OFDM (used in 4G), adding flexible numerology as a novelty. 
Once the waveform was decided, the standardization process continued to fine-tune the frame structure and all of its intrinsic aspects. This thesis accompanied and participated in this entire process. To begin with, this dissertation evaluates the main 5G candidate waveforms. A theoretical analysis of each waveform is carried out, highlighting its strengths and weaknesses at both the implementation and performance levels. Subsequently, a real implementation of three of the most promising waveforms (CP-OFDM, UFMC, and OQAM-FBMC) on a Software Defined Radio platform is presented, which allows evaluating their performance in terms of bit error rate, as well as the complexity of their implementation. This thesis also proposes the use of a harmonized solution as a waveform for 5G and argues that it remains a viable option for systems beyond 5G. Since none of the candidate waveforms was capable of meeting all the 5G requirements on its own, instead of choosing a single waveform, this thesis proposes to build a transceiver capable of generating all the main candidate waveforms (CP-OFDM, P-OFDM, UFMC, QAM-FBMC, OQAM-FBMC). This is achieved by identifying the common blocks between the waveforms and then integrating them with the rest of the blocks that are essential for each waveform. The motivation for this solution is to have a physical layer that is capable of complying with all aspects of beyond 5G technologies, always selecting the best waveform according to the scenario. This proposal is evaluated in terms of complexity, and the results are compared with the complexity of each waveform. The decision to continue with CP-OFDM with flexible numerology as a waveform for 5G can also be considered a harmonized solution, since changing the cyclic prefix and the number of subcarriers also changes the performance of the system. 
In this thesis, all the numerologies proposed by the 3GPP are evaluated on each of the channel models described for 5G (and considered valid for beyond 5G systems), taking into account factors such as the mobility of the user equipment and the operating frequency. For this, a 3GPP physical layer simulator is used, with the necessary adaptations made in order to evaluate the performance of the numerologies in terms of block error rate. Finally, a sketch of what could become the Sixth Generation of mobile networks, or 6G, is presented, with the aim of understanding the new applications that could be used in the future, as well as their needs. After the completion of the study carried out in this thesis, it can be said that, as stated from the beginning, the solution for both 5G and beyond 5G systems is waveform harmonization. The results obtained corroborate that a harmonized solution achieves computational savings of 25-40% for the transmitter and 15-25% for the receiver. In addition, it is possible to identify which CP-OFDM numerology is the most appropriate for each scenario, which would allow optimizing the design and deployment of 5G networks. This would open the door to doing the same with 6G, i.e., a harmonized solution with different waveforms, instead of just one as in 5G.
Flores De Valgas Torres, FJ. (2020). Study on Air Interface Variants and their Harmonization for Beyond 5G Systems [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/164442
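
The flexible numerology discussed in this abstract can be summarized numerically. In 5G NR the subcarrier spacing is 15 · 2^μ kHz, so each step in μ halves the symbol, cyclic prefix, and slot durations. The sketch below tabulates these values for μ = 0 to 4 as an assumption about the range covered; the function name `nr_numerology` is illustrative, and the cyclic prefix shown is the normal CP, ignoring the slightly longer CP used at each 0.5 ms boundary.

```python
def nr_numerology(mu):
    """Basic timing parameters of 5G NR numerology index mu (normal CP)."""
    scs_khz = 15 * 2 ** mu               # subcarrier spacing in kHz
    symbol_us = 1000.0 / scs_khz         # useful OFDM symbol duration in µs
    cp_us = symbol_us * 144 / 2048       # normal cyclic prefix duration in µs
    slot_ms = 1.0 / 2 ** mu              # slot duration (14 symbols) in ms
    return {
        "mu": mu,
        "scs_khz": scs_khz,
        "symbol_us": symbol_us,
        "cp_us": cp_us,
        "slot_ms": slot_ms,
    }

for mu in range(5):
    p = nr_numerology(mu)
    print(p["mu"], p["scs_khz"], round(p["symbol_us"], 2),
          round(p["cp_us"], 2), p["slot_ms"])
```

A shorter cyclic prefix tolerates less delay spread, while a wider subcarrier spacing is more robust to Doppler shift and phase noise. This trade-off is one reason the most suitable numerology depends on the channel model, user mobility, and operating frequency.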

    FPGA Implementation of Hearing Impaired Assistive Device for Hard to Hear Individuals

    In this work, noise cancellation and suppression techniques have been developed and implemented on a field-programmable gate array (FPGA). Hearing aids are primarily meant to improve hearing and speech comprehension. Digital hearing aids outperform their analog counterparts because they provide flexible gain while also facilitating feedback reduction and noise elimination. Recent advances in digital signal processors (DSPs) and microelectronics have led to the development of superior digital hearing aids. Many researchers have investigated algorithms suitable for hearing aid applications, which demand low noise, feedback cancellation, echo cancellation, and so on; however, the toughest challenge is the implementation. Power and area are additional constraints. The device must consume as little power as possible to support extended battery life and should be as small as possible for increased portability. In this work, we use a cross-channel suppression technique to remove unwanted audio signals. The unwanted signals are suppressed using a two-tone suppression scheme. In this project, the speech signal is captured by a microphone and then converted to digital form by an ADC. The digitized signal is processed on the FPGA, where the speech signal is enhanced and amplified to the desired level. The processed speech signal is then converted back to analog form by a DAC and sent to the speaker.