    Low-delay nonuniform pseudo-QMF banks with application to speech enhancement

    This paper presents a method for designing low-delay nonuniform pseudo quadrature mirror filter (QMF) banks. The method is motivated by the work of Li, Nguyen, and Tantaratana, in which the nonuniform filter bank is realized by combining an appropriate number of adjacent sub-bands of a uniform pseudo-QMF bank. In prior work, the prototype filter of the uniform pseudo-QMF bank was constrained to have linear phase, and the overall delay of the filter bank was often unacceptably large when the number of sub-bands was large. This paper proposes a pseudo-QMF bank design technique that significantly reduces the delay by relaxing the linear-phase constraint. As an example, an oversampled critical-band nonuniform filter bank is designed and applied to a two-state modeling speech enhancement system. Comparison with competing methods that employ tree-structured, linear-phase multiresolution analysis indicates that the proposed approach strikes a good balance between system performance and low delay.
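
The sub-band merging idea mentioned above can be illustrated with a minimal sketch. It assumes a standard cosine-modulated (pseudo-QMF) analysis bank and forms each nonuniform filter by summing the impulse responses of adjacent uniform filters. The Hamming-windowed linear-phase prototype, the filter length of 128, the grouping [1, 1, 2, 4], and all function names are illustrative assumptions, not the design from the paper, which relaxes the linear-phase constraint to reduce delay.

```python
import math

def prototype(num_taps, M):
    # Assumed prototype: Hamming-windowed sinc lowpass with cutoff pi/(2M).
    # This is a linear-phase placeholder, not a low-delay design.
    cutoff = math.pi / (2 * M)
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = cutoff / math.pi if t == 0 else math.sin(cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def uniform_analysis_filters(p, M):
    # Cosine modulation: h_k(n) = 2 p(n) cos((2k+1) pi/(2M) (n - D) + (-1)^k pi/4),
    # with D = (N - 1) / 2.
    mid = (len(p) - 1) / 2.0
    filters = []
    for k in range(M):
        phase = (-1) ** k * math.pi / 4
        wk = (2 * k + 1) * math.pi / (2 * M)
        filters.append([2 * p[n] * math.cos(wk * (n - mid) + phase)
                        for n in range(len(p))])
    return filters

def merge_adjacent(filters, group_sizes):
    # Each nonuniform filter is the sum of a run of adjacent uniform filters.
    # A band built from m uniform bands would be decimated by M / m.
    assert sum(group_sizes) == len(filters)
    merged, start = [], 0
    for size in group_sizes:
        group = filters[start:start + size]
        merged.append([sum(column) for column in zip(*group)])
        start += size
    return merged

def magnitude(taps, omega):
    # |H(e^{j omega})| evaluated directly from the impulse response.
    re = sum(h * math.cos(omega * n) for n, h in enumerate(taps))
    im = sum(h * math.sin(omega * n) for n, h in enumerate(taps))
    return math.hypot(re, im)

M = 8
uniform = uniform_analysis_filters(prototype(128, M), M)
nonuniform = merge_adjacent(uniform, [1, 1, 2, 4])  # hypothetical grouping
```

With this grouping the third nonuniform filter covers roughly pi/4 to pi/2 and the fourth covers pi/2 to pi, so the bandwidths widen toward high frequencies in the way a critical-band layout would.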

    An Iterative Design with Variable Step Prototype Filter for Cosine Modulated Filter Bank

    This paper proposes a systematic, self-controlled approach to designing the prototype filter of a multichannel cosine-modulated near-perfect-reconstruction (NPR) filter bank. The primary goal is a prototype filter with minimal amplitude distortion and aliasing error. The algorithm brings the 3 dB cutoff frequency very close to π/2M by selecting a suitable step size that is a function of the transition width. If the step size is too fine, the objective function oscillates; if it is too coarse, the 3 dB cutoff frequency will not be close to π/2M, which degrades the overall performance of the prototype filter. By choosing the step size as a function of the transition width and varying it from a coarse to a fine level, low amplitude distortion and aliasing error can be achieved. The filter is designed from two input parameters, the number of subbands M and the attenuation A; all other system parameters are derived from them to avoid heuristic inputs. Simulation results indicate better performance than existing algorithms in the literature.
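
A coarse-to-fine cutoff search of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm from the paper: it uses a Kaiser-windowed sinc prototype, assumes a transition width of π/2M, starts with a step of one quarter of the transition width, and halves the step whenever the error changes sign. The function names and these constants are hypothetical.

```python
import math

def bessel_i0(x):
    # Power series for the modified Bessel function I0, used by the Kaiser window.
    term, total, k = 1.0, 1.0, 1
    while term > 1e-12 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total

def kaiser_lowpass(num_taps, cutoff, beta):
    # Kaiser-windowed sinc lowpass; cutoff is in radians per sample.
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = cutoff / math.pi if t == 0 else math.sin(cutoff * t) / (math.pi * t)
        r = t / mid
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)
        taps.append(ideal * window)
    return taps

def magnitude(taps, omega):
    re = sum(h * math.cos(omega * n) for n, h in enumerate(taps))
    im = sum(h * math.sin(omega * n) for n, h in enumerate(taps))
    return math.hypot(re, im)

def design_prototype(M, attenuation_db, tol=1e-6, max_iter=200):
    A = attenuation_db
    # Standard Kaiser formulas for the window shape parameter and filter order.
    if A > 50:
        beta = 0.1102 * (A - 8.7)
    elif A >= 21:
        beta = 0.5842 * (A - 21) ** 0.4 + 0.07886 * (A - 21)
    else:
        beta = 0.0
    transition = math.pi / (2 * M)            # assumed transition width
    num_taps = math.ceil((A - 8) / (2.285 * transition)) + 1
    target_freq = math.pi / (2 * M)
    target_gain = 1 / math.sqrt(2)            # 3 dB point
    cutoff = target_freq
    step = transition / 4                     # coarse initial step (assumption)
    previous_sign = 0
    for iteration in range(max_iter):
        taps = kaiser_lowpass(num_taps, cutoff, beta)
        error = magnitude(taps, target_freq) - target_gain
        if abs(error) < tol:
            break
        sign = 1 if error < 0 else -1         # gain too low: raise the cutoff
        if previous_sign and sign != previous_sign:
            step /= 2                         # overshoot: refine the step
        cutoff += sign * step
        previous_sign = sign
    return taps, cutoff, iteration + 1

taps, cutoff, iterations = design_prototype(M=8, attenuation_db=60.0)
```

Only M and the attenuation are supplied by the caller; the window parameter, filter length, and initial step all follow from them, which mirrors the two-input design described above.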

    On optimal design and applications of linear transforms

    Linear transforms are encountered in many fields of applied science and engineering. In the past, conventional block transforms provided acceptable answers to different practical problems. But now, under increasing competitive pressures, with the growing reservoir of theory and a corresponding development of computing facilities, a real demand has been created for methods that systematically improve performance. As a result, the past two decades have seen the explosive growth of a class of linear transform theory known as multiresolution signal decomposition. The goal of this work is to design and apply these advanced signal processing techniques to several different problems. The optimal design of subband filter banks is considered first. Several design examples are presented for M-band filter banks. Conventional design approaches are found to present problems when the number of constraints increases. A novel optimization method is proposed using a step-by-step design of a hierarchical subband tree. This method is shown to improve performance in applications such as subband image coding. Subband tree structuring is then discussed and generalized algorithms are presented. Next, attention is focused on the interference excision problem in direct sequence spread spectrum (DSSS) communications. The analytical and experimental performance of the DSSS receiver employing excision is presented. Different excision techniques are evaluated and ranked along with the proposed adaptive subband transform-based excisers. The robustness of the considered methods is investigated for both time-localized and frequency-localized interferers. A domain-switchable excision algorithm is also presented. Finally, some of the ideas associated with the interference excision problem are utilized in the spectral shaping of a particular biological signal, namely heart rate variability. The improvements for the spectral shaping process are shown for time-frequency analysis. In general, this dissertation demonstrates the proliferation of new tools for digital signal processing.

    Wavelet Filter Banks in Perceptual Audio Coding

    This thesis studies the application of the wavelet filter bank (WFB) in perceptual audio coding by providing brief overviews of perceptual coding, psychoacoustics, wavelet theory, and existing wavelet coding algorithms. Furthermore, it describes the poor frequency localization property of the WFB and explores one filter design method, in particular, for improving channel separation between the wavelet bands. A wavelet audio coder has also been developed by the author to test the new filters. Preliminary tests indicate that the new filters provide some improvement over other wavelet filters when coding audio signals that are stationary-like and contain only a few harmonic components, and similar results for other types of audio signals that contain many spectral and temporal components. It has been found that the WFB provides a flexible decomposition scheme through the choice of the tree structure and basis filter, but at the cost of poor localization properties. This flexibility can be a benefit in the context of audio coding, but the poor localization properties represent a drawback. Determining ways to fully utilize this flexibility, while minimizing the effects of poor time-frequency localization, is an area that is still very much open for research.

    Real-time implementation of a nonlinear signal-dependent acoustic beamformer (Epälineaarisen signaaliriippuvan akustisen keilanmuodostajan reaaliaikaimplementaatio)

    A real-time acoustical beamforming system incorporating the cross pattern coherence (CroPaC) post filtering method is implemented in this thesis. The real-time implementation consists of a signal-independent beamformer that is used for spatial discrimination of a sound field. The beamformer signal is post filtered by modulating it with a parameter derived from the cross-spectrum of two directional microphone signals. The post filter is implemented to enhance beamforming performance (an increase in signal-to-noise ratio), because beamformers are not efficient in environments with high levels of reverberation. The post filtering method has previously been implemented in MATLAB for non-real-time use, and this system is the first real-time implementation of an acoustical beamforming system that uses it. The implementation is written in C for the graphical signal processing program Max, developed by Cycling '74. It uses time-frequency domain processing and the spherical Fourier transform to decompose a sound field into spherical harmonic signals. The implementation can be used with microphone arrays of up to 32 microphone capsules arranged uniformly or nearly uniformly on a rigid sphere. It can be used in applications that require the algorithm to run in real time, such as teleconferencing and acoustical cameras.
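
The post filter described above can be illustrated with a minimal sketch. It assumes one common normalization: per time-frequency bin, the gain is twice the real part of the cross-spectrum of the two directional signals divided by the sum of their energies, clipped at a floor of zero. The abstract does not give the formula, so this normalization, the clipping, and the function names are assumptions, and the temporal averaging a practical system would apply to the spectra is omitted.

```python
def cross_spectrum_gain(s1, s2, floor=0.0):
    # s1, s2: complex time-frequency bins of two directional microphone signals.
    # Returns one real gain per bin, limited to the range [floor, 1].
    gains = []
    for a, b in zip(s1, s2):
        energy = abs(a) ** 2 + abs(b) ** 2
        if energy == 0.0:
            gains.append(floor)
            continue
        g = 2.0 * (a.conjugate() * b).real / energy
        gains.append(max(floor, g))
    return gains

def apply_post_filter(beam, gains):
    # Modulate the beamformer output bin by bin with the derived gains.
    return [g * x for g, x in zip(gains, beam)]

# Three illustrative bins: in phase, in anti-phase, and in quadrature.
s1 = [1 + 0j, 1 + 0j, 1 + 0j]
s2 = [1 + 0j, -1 + 0j, 0 + 1j]
gains = cross_spectrum_gain(s1, s2)
filtered = apply_post_filter([2 + 2j, 2 + 2j, 2 + 2j], gains)
```

Bins where the two signals agree in phase and level pass through unchanged, while bins where they are in anti-phase or in quadrature are suppressed, which is the qualitative behaviour expected from a coherence-based post filter.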

    Real-time Microphone Array Processing for Sound-field Analysis and Perceptually Motivated Reproduction

    This thesis details real-time implementations of sound-field analysis and perceptually motivated reproduction methods for visualisation and auralisation purposes. For visualisation, various methods for displaying the relative distribution of sound energy from one point in space are investigated and contrasted, including a novel reformulation of the cross-pattern coherence (CroPaC) algorithm that integrates a new side-lobe suppression technique. For auralisation, listening tests were conducted to compare ambisonics reproduction with a novel headphone formulation of the directional audio coding (DirAC) method. The results indicate that the side-lobe suppressed CroPaC method offers greater spatial selectivity in reverberant conditions than other popular approaches, and that the new DirAC formulation yields higher perceived spatial accuracy than the ambisonics method.

    Filter Optimization for Personal Sound Zones Systems

    Personal Sound Zones (PSZ) systems deliver different sounds to a number of listeners sharing an acoustic space through the use of loudspeakers together with signal processing techniques. These systems have attracted a lot of attention in recent years because of the wide range of applications that would benefit from the generation of individual listening zones, e.g., domestic or automotive audio applications. A key aspect of PSZ systems, at least for low and mid frequencies, is the optimization of the filters used to process the sound signals. Different algorithms have been proposed in the literature for computing those filters, each with its own advantages and disadvantages. In this work, the state-of-the-art algorithms for PSZ systems are reviewed, and their performance in a reverberant environment is evaluated. Aspects such as the acoustic isolation between zones, the reproduction error, the energy of the filters, and the delay of the system are considered in the evaluations. Furthermore, computationally efficient strategies to obtain the filters are studied, and their computational complexity is also compared. The performance and computational evaluations reveal the main limitations of the state-of-the-art algorithms. In particular, the existing solutions cannot offer low computational complexity and, at the same time, good performance for short system delays. Thus, a novel algorithm based on subband filtering that mitigates these limitations is proposed for PSZ systems. In addition, the proposed algorithm offers more versatility than the existing algorithms, since different system configurations, such as different filter lengths or sets of loudspeakers, can be used in each subband. The proposed algorithm is experimentally evaluated and tested in a reverberant environment, and its efficacy in mitigating the limitations of the existing solutions is demonstrated. Finally, the effect of the target responses on the optimization is discussed, and a novel approach based on windowing the target responses is proposed. The proposed approach is experimentally evaluated in two rooms with different reverberation times. The evaluation results reveal that an appropriate windowing of the target responses can reduce the interference level between zones.

    Molés Cases, V. (2022). Filter Optimization for Personal Sound Zones Systems [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/18611