18 research outputs found

    Graafinen ekvalisointi taajuusvarpattujen digitaalisten suotimien avulla (Graphic equalization using frequency-warped digital filters)

    The aim of this thesis is to design a graphic equalizer with frequency-warped digital filters. The proposed design consists of a warped FIR filter for the low-frequency bands and a standard FIR filter for the high-frequency bands. It is used to implement both an octave and a one-third-octave equalizer in Matlab. Low-frequency equalization with FIR filters requires high filter orders: the frequency resolution needed for the lowest band of a graphic equalizer demands orders that are impractical in real-life applications. Frequency warping lowers the required filter orders, so that a practical graphic equalizer can be designed. The design also avoids the gain build-up problem common to most IIR designs, in which the gain of one band affects the neighboring bands, because several equalizer bands are filtered with a single filter. The proposed equalizer is found to be accurate and comparable to previous equalizer designs, and the required filter orders are small enough for use in real-life applications. Its computational cost is higher than that of the other compared designs, but the difference can be reduced if the accuracy requirements are relaxed.
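
The reason warping lowers the filter order at low frequencies can be illustrated with the frequency mapping of a first-order allpass section, which is the usual building block of warped filters. The sketch below is not taken from the thesis: it assumes each unit delay is replaced by the allpass (z^-1 - lam)/(1 - lam z^-1), and the function names and the value lam = 0.75 are hypothetical choices for illustration.

```python
import math

def warped_frequency(omega, lam):
    """Map a normalized frequency omega (radians, 0..pi) to the warped axis.

    Assumes unit delays are replaced by the first-order allpass
    (z^-1 - lam) / (1 - lam z^-1); the result is the negated allpass phase.
    """
    return omega + 2.0 * math.atan2(lam * math.sin(omega),
                                    1.0 - lam * math.cos(omega))

def low_frequency_stretch(lam):
    """Slope of the mapping at omega = 0, i.e. how much the low end is expanded."""
    return (1.0 + lam) / (1.0 - lam)

# Hypothetical warping parameter, chosen only for illustration.
lam = 0.75
for omega in (0.01, 0.1, 1.0, math.pi):
    print(omega, warped_frequency(omega, lam))
```

With a positive lam the low end of the frequency axis is stretched (by a factor of 7 near zero for lam = 0.75), so a filter of modest order spends more of its resolution on the lowest bands, while the endpoints 0 and pi stay fixed.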

    Adaptiiviset läpikuuluvuuskuulokkeet (Adaptive hear-through headset)

    Hear-through equalization can be used to make a headset acoustically transparent, i.e. to produce a sound perception similar to that without the headset. The headset must have microphones outside the earpieces to capture the ambient sounds, which are then reproduced with the headset transducers after equalization. The reproduced signal is called the hear-through signal. Equalization is needed because the headset changes the acoustics of the outer ear and therefore also the perceived sound. In addition to the external microphones, the prototype headset used in this study has internal microphones that sit inside the ear canal. Together these microphones can be used to estimate the attenuation of the headset in real time and to detect a poor fit. Since a poor fit causes leaks and decreased attenuation, the combined effect of the leaked sound and the hear-through signal differs from that of a proper fit. The isolation estimate is therefore used to control the hear-through equalization in order to preserve acoustic transparency. Furthermore, the proposed adaptive hear-through algorithm includes manual controls for the equalizers and for the volume of the hear-through signal. The proposed algorithm is found to make the prototype headset acoustically transparent. The equalization controls improve the performance of the headset when the fit is poor or when the volume of the hear-through signal is adjusted, because they reduce the comb-filtering effect caused by the summation of the leaked sound and the hear-through signal inside the ear canal. The behavior of the proposed algorithm can be demonstrated with an implemented Matlab simulator.
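
The comb-filtering effect mentioned in this abstract can be illustrated with a two-path model. The sketch below is an assumption-laden illustration and not the thesis algorithm: it treats the leaked sound as an undelayed path with a frequency-independent gain and the hear-through signal as a second path with its own gain and a pure delay. The function names, gains, and the 1 ms delay are hypothetical.

```python
import cmath
import math

def summed_response_db(freq_hz, leak_gain, hear_through_gain, delay_s):
    """Magnitude (dB) of leaked sound plus a delayed hear-through signal.

    Two-path model (assumption): H(f) = leak_gain + hear_through_gain * e^{-j 2 pi f delay}.
    """
    h = leak_gain + hear_through_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    # Guard against log10(0) when the two paths cancel exactly.
    return 20.0 * math.log10(max(abs(h), 1e-12))

def first_notch_hz(delay_s):
    """Lowest frequency at which the two paths are in antiphase."""
    return 1.0 / (2.0 * delay_s)

# Hypothetical values: a leaky fit (leak gain 0.5) and a 1 ms hear-through delay.
delay = 0.001
for f in (0.0, first_notch_hz(delay), 2.0 * first_notch_hz(delay)):
    print(f, summed_response_db(f, 0.5, 0.25, delay))
```

In this model the notch depth depends on how close the two gains are, which is one way to see why a change in leakage or in hear-through volume alters the combined response.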

    Generalized linear-in-parameter models : theory and audio signal processing applications

    This thesis presents a mathematically oriented perspective on some basic concepts of digital signal processing. A general framework for the development of alternative signal and system representations is attained by defining a generalized linear-in-parameter model (GLM) configuration. The GLM provides a direct view into the origins of many familiar methods in signal processing, implying a variety of generalizations, and it serves as a natural introduction to rational orthonormal model structures. In particular, the conventional division between finite impulse response (FIR) and infinite impulse response (IIR) filtering methods is reconsidered. The latter part of the thesis consists of audio-oriented case studies, including loudspeaker equalization, musical instrument body modeling, and room response modeling. The proposed collection of IIR filter design techniques is applied to challenging modeling tasks. The most important practical contribution of this thesis is the introduction of a procedure for the optimization of rational orthonormal filter structures, called the BU-method. More generally, the BU-method and its variants, including the (complex) warped extension, the (C)WBU-method, can be considered entirely new IIR filter design strategies.

    A room acoustics measurement system using non-invasive microphone arrays

    This thesis summarises research into adaptive room correction for small rooms and pre-recorded material, for example music or films. It investigates a measurement system that predicts the sound at a remote location within a room without a microphone at that location. This would allow the sound within a room to be adaptively manipulated so that all listeners receive optimum sound, increasing their enjoyment. The solution presented uses small microphone arrays mounted on the room's walls. A unique geometry and processing system was designed, incorporating three processing stages: temporal, spatial and spectral. The temporal processing identifies individual reflection arrival times from the recorded data. The spatial processing estimates the angles of arrival of the reflections so that the three-dimensional coordinates of each reflection's origin can be calculated. The spectral processing then estimates the frequency response of each reflection. These estimates allow a mathematical model of the room to be calculated from acoustic measurements made in the actual room. The model can then be used to predict the sound at different locations within the room. A simulated model of a room was produced to allow fast development of algorithms. Measurements in real rooms were then conducted and analysed to verify the theoretical models and to aid further development of the system. Results from these measurements and simulations are presented for each processing stage.

    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform (STFT) domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user's speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address DTD for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on double-talk. Using a standard evaluation technique, the proposed algorithm is shown to have detection performance comparable to that of an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, generating minimal false double-talk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.
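
The decomposition "onto the union of two nonnegative bases" can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis implementation: both bases are held fixed, only the activations are fitted, the cost is Euclidean with standard multiplicative updates, and the speaker estimate is obtained with a ratio mask. The abstract does not specify any of these choices, and all function names and the tiny matrices are hypothetical.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve_activations(v, w, iterations=200, eps=1e-12):
    """Fit H >= 0 in V ~ W H with W fixed (multiplicative updates, Euclidean cost)."""
    wt = transpose(w)
    wtv = matmul(wt, v)
    wtw = matmul(wt, w)
    h = [[1.0] * len(v[0]) for _ in range(len(w[0]))]
    for _ in range(iterations):
        denom = matmul(wtw, h)
        h = [[h[k][t] * wtv[k][t] / (denom[k][t] + eps)
              for t in range(len(h[0]))] for k in range(len(h))]
    return h

def extract_speaker(v, w_echo, w_speech, iterations=200, eps=1e-12):
    """Estimate the near-end magnitude spectrogram from the mixture V."""
    # Union of the two bases: echo columns first, then speech columns.
    w = [e_row + s_row for e_row, s_row in zip(w_echo, w_speech)]
    h = solve_activations(v, w, iterations, eps)
    n_echo = len(w_echo[0])
    speech_part = matmul(w_speech, h[n_echo:])
    total = matmul(w, h)
    # Ratio mask (assumption): share of each bin explained by the speech basis.
    return [[v[f][t] * speech_part[f][t] / (total[f][t] + eps)
             for t in range(len(v[0]))] for f in range(len(v))]

# Toy data: 3 frequency bins, 2 frames, one basis vector per source.
w_echo = [[1.0], [0.2], [0.0]]
w_speech = [[0.0], [0.3], [1.0]]
mixture = [[2.0, 0.5], [0.7, 1.0], [1.0, 3.0]]  # echo * [2, 0.5] + speech * [1, 3]
speech_estimate = extract_speaker(mixture, w_echo, w_speech)
```

Because the echo basis is built from the known far-end signal and the speech basis is trained beforehand, no adaptive filter has to converge in this formulation, which is consistent with the abstract's remark that performance is available immediately upon initiation.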

    Physics-based models for the acoustic representation of space in virtual environments

    This work addresses several questions within the broader topic of representing virtual scenes and environments in human-machine interaction contexts, where the acoustic modality forms an integral or predominant part of the overall information conveyed by the machine to the user through a multimodal or acoustic-only personal interface. More precisely, it examines how to present the audio message so that it gives the user information about the represented context that is as precise and usable as possible. The aim is to integrate into a virtual scenario at least part of the acoustic information that the same user, in a real context, normally uses to gain experience of the surrounding world as a whole. This matters especially when the focus of attention, which typically engages the visual channel almost completely, is directed at a specific task. This work deals with the simulation of virtual acoustic spaces using physics-based models. The acoustic space is what we perceive about space using our auditory system. The physical nature of the models means that they present spatial attributes (for example, shape and size) as a salient feature of their structure, so that space is directly represented and manipulated by means of them.

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019


    Audio for Virtual, Augmented and Mixed Realities: Proceedings of ICSA 2019 ; 5th International Conference on Spatial Audio ; September 26th to 28th, 2019, Ilmenau, Germany

    ICSA 2019 brings together developers, scientists, users, and content creators of and for spatial audio systems and services across disciplines. A special focus is on audio for so-called virtual, augmented, and mixed realities. The fields of ICSA 2019 are: development and scientific investigation of technical systems and services for spatial audio recording, processing and reproduction; creation of content for reproduction via spatial audio systems and services; use and application of spatial audio systems and content presentation services; and the media impact of content and spatial audio systems and services from the point of view of media science. ICSA 2019 is organized by VDT and TU Ilmenau with the support of the Fraunhofer Institute for Digital Media Technology IDMT.