29 research outputs found

    Efficient time delay estimation and compensation applied to the cancellation of acoustic echo

    The system identification problem is commonly addressed with adaptive filtering approaches. In many applications the unknown system response begins with a sequence of zero-valued coefficients that precedes the active part of the response. These coefficients introduce a flat delay in the incoming signals, and this delay can be very large. When most adaptive approaches attempt to model such a system, the flat delay impairs their operation and performance. The approach introduced in this thesis models the flat delay and the active part of the unknown system separately. An efficient system for time delay estimation (TDE) is introduced to estimate the flat delay of an unknown system. The estimated delay is then compensated within the adaptive system, allowing the latter to cover only the active part of the unknown system. The proposed system is applied to the Acoustic Echo Cancellation (AEC) problem.
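    The idea of handling the flat delay and the active part separately can be illustrated with a minimal sketch. The abstract does not say which TDE method or adaptive algorithm the thesis uses, so the code below makes its own assumptions: the delay is estimated from the peak of a plain cross-correlation, and the active part is identified with a normalized LMS (NLMS) filter. The function names, the toy impulse response, and all parameter values are hypothetical.

```python
import numpy as np


def estimate_flat_delay(x, d, max_lag):
    """Return the lag (in samples) at which |cross-correlation(d, x)| peaks."""
    n = len(x) - max_lag
    corr = np.array([np.dot(d[k:k + n], x[:n]) for k in range(max_lag + 1)])
    return int(np.argmax(np.abs(corr)))


def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Normalized LMS: adapt w so that w . [x[n], x[n-1], ...] tracks d[n]."""
    w = np.zeros(num_taps)
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        u = x[n - num_taps + 1:n + 1][::-1]        # most recent sample first
        e[n] = d[n] - np.dot(w, u)                 # a priori error
        w += mu * e[n] * u / (np.dot(u, u) + eps)  # normalized update
    return w, e


# Toy unknown system (assumed): 200 zero coefficients, then a short active part.
rng = np.random.default_rng(0)
flat_delay = 200
active = np.array([0.8, -0.4, 0.2, 0.1])
h = np.concatenate([np.zeros(flat_delay), active])

x = rng.standard_normal(8000)        # far-end (reference) signal
d = np.convolve(x, h)[:len(x)]       # echo observed at the microphone

# Step 1: estimate the flat delay.
delay_hat = estimate_flat_delay(x, d, max_lag=400)

# Step 2: compensate the delay, so a short filter covers only the active part.
x_delayed = np.concatenate([np.zeros(delay_hat), x[:len(x) - delay_hat]])
w, e = nlms(x_delayed, d, num_taps=len(active))
```

    In this toy setup the adaptive filter needs only 4 taps. Without delay compensation it would need 204 taps to model the same response.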

    Neural architecture for echo suppression during sound source localization based on spiking neural cell models

    This thesis investigates the biological background of the psycho-acoustical precedence effect, which enables humans to suppress echoes during the localization of sound sources. It provides a technically feasible and biologically plausible model for sound source localization under echoic conditions, ready to be used by technical systems during man-machine interaction. The model is based on the author's own electrophysiological experiments in the Mongolian gerbil. The results, obtained in gerbils for the first time, reveal a special behavior of specific cells in the dorsal nucleus of the lateral lemniscus (DNLL), a distinct region of the auditory brainstem. The persistent inhibition observed in these cells appears to be the basis of echo suppression at higher auditory centers. The developed model was able to reproduce this behavior and suggests that a strong and temporally precise hyperpolarization is the underlying physiological mechanism. The developed neural architecture models the inner ear as well as five major nuclei of the auditory brainstem in their connectivity and intrinsic dynamics. It represents a new type of neural modeling, described as Spike Interaction Models (SIM).
SIM use the precise spatio-temporal interaction of single spike events for coding and processing neural information. Their basic elements are integrate-and-fire neurons and Hebbian synapses, which have been extended by specially designed dynamic transfer functions. The model is capable of detecting time differences as small as 10 microseconds and employs the principles of temporal and spatial coincidence detection and precise local inhibition for auditory processing. It consists exclusively of elements of a specifically designed Neural Base Library (NBL), which has been developed for multi-purpose modeling of Spike Interaction Models. This library extends the commercially available dynamic simulation environment MATLAB/SIMULINK with different models of neurons and synapses that simulate the intrinsic dynamic properties of neural cells. The library enables engineers as well as biologists to design their own biologically plausible models of neural information processing without the need for detailed programming skills. Its graphical interface allows structural as well as parametric changes and can display the time course of microscopic cell potentials as well as macroscopic firing patterns during and after a simulation. Two basic elements of the Neural Base Library have been prepared for implementation as specialized mixed analog-digital circuitry. First silicon implementations were realized by the team of the DFG Graduiertenkolleg GRK 164 and demonstrated the possibility of fully parallel online processing of sounds. With the automated layout generator developed in the Graduiertenkolleg, it will be possible to design specific processors that apply the principles of distributed biological information processing to technical systems. These processors differ from classical von Neumann processors in that they use spatio-temporal spike patterns instead of sequential binary values to represent information.
They extend the digital coding principle by the dimensions of space (two-dimensional neighborhood) and time (frequency, phase and amplitude) as well as by the dynamics of analog potentials. This thesis consists of seven chapters, dedicated to the different areas of computational neuroscience. Chapter 1 provides the motivation of this study, which arises from the attempt to investigate the biological principles of sound processing and make them available to technical systems interacting with humans under real-world conditions. Furthermore, five reasons to use spike interaction models are given and their novel characteristics are discussed. Chapter 2 introduces the biological principles of sound source localization and the precedence effect. Current hypotheses on the origin of the precedence effect are discussed with reference to selected experimental results from several research groups. Chapter 3 describes the developed Neural Base Library and introduces each of the designed neural simulation elements. It also explains the underlying mathematical functions of the dynamic components and describes their general use for the dynamic simulation of spiking neural networks. Chapter 4 introduces the specific model of the auditory brainstem, starting from the filter cascade in the inner ear, continuing via more than 200 cells and 400 synapses in five auditory nuclei, up to the directional sensor at the level of the auditory midbrain. It presents the structures and parameter sets used and contains basic hints for the setup and configuration of the simulation environment. Chapter 5 consists of three sections. The first describes the setup and results of the author's own electrophysiological experiments. The second describes the results of 104 model simulations, performed to test the model's ability to reproduce psycho-acoustical effects such as the precedence effect.
Finally, the last section of this chapter contains the results of 54 real-world experiments using natural sound signals, recorded under normal as well as highly reverberant conditions. Chapter 6 compares the achieved results to other biologically motivated and technical methods for echo suppression and sound source localization and presents the current status of the silicon implementation. Chapter 7 provides a short summary and an outlook toward future research subjects and planned activities. This thesis aims to contribute to the field of computational neuroscience by bridging the gap between biological investigation, computational modeling and silicon engineering in a specific field of application. It suggests a new spatio-temporal paradigm of dynamic information processing in order to make the principles of biological information processing accessible to technical applications.
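    The coincidence-detection principle mentioned in this abstract can be sketched with a generic textbook leaky integrate-and-fire neuron. This is not an element of the thesis's Neural Base Library, which is built in MATLAB/SIMULINK and adds dynamic transfer functions not reproduced here. The function name and every parameter value below (time step, membrane time constant, synaptic weight, threshold) are assumptions chosen only so that a single input spike stays below threshold while two nearly simultaneous spikes cross it.

```python
import numpy as np


def lif_coincidence(spikes_a, spikes_b, dt=1e-5, tau=1e-4,
                    weight=0.6, threshold=1.0):
    """Leaky integrate-and-fire neuron driven by two binary spike trains.

    Returns the indices of the time steps at which the neuron fires.
    """
    decay = np.exp(-dt / tau)          # membrane leak per time step
    v = 0.0                            # membrane potential
    fired = []
    for i, (a, b) in enumerate(zip(spikes_a, spikes_b)):
        v = v * decay + weight * (a + b)   # leak, then integrate inputs
        if v >= threshold:
            fired.append(i)
            v = 0.0                        # reset after an output spike
    return fired


n_steps = 200

# Inputs arriving 2 steps (20 microseconds) apart: potentials overlap.
near_a = np.zeros(n_steps); near_a[50] = 1
near_b = np.zeros(n_steps); near_b[52] = 1

# Inputs arriving 20 steps (200 microseconds) apart: the first has decayed.
far_a = np.zeros(n_steps); far_a[50] = 1
far_b = np.zeros(n_steps); far_b[70] = 1

near_out = lif_coincidence(near_a, near_b)
far_out = lif_coincidence(far_a, far_b)
```

    With these assumed values the neuron responds only when the two inputs arrive within a few time steps of each other, which is the basic property a coincidence detector needs for resolving small arrival-time differences.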

    Quality of experience in telemeetings and videoconferencing: a comprehensive survey

    Telemeetings such as audiovisual conferences or virtual meetings play an increasingly important role in our professional and private lives. For that reason, system developers and service providers will strive for an optimal experience for the user, while at the same time optimizing technical and financial resources. This leads to the discipline of Quality of Experience (QoE), an active field originating from the telecommunication and multimedia engineering domains, that strives for understanding, measuring, and designing the quality experience with multimedia technology. This paper provides the reader with an entry point to the large and still growing field of QoE of telemeetings, by taking a holistic perspective, considering both technical and non-technical aspects, and by focusing on current and near-future services. Addressing both researchers and practitioners, the paper first provides a comprehensive survey of factors and processes that contribute to the QoE of telemeetings, followed by an overview of relevant state-of-the-art methods for QoE assessment. To embed this knowledge into recent technology developments, the paper continues with an overview of current trends, focusing on the field of eXtended Reality (XR) applications for communication purposes. Given the complexity of telemeeting QoE and the current trends, new challenges for a QoE assessment of telemeetings are identified. To overcome these challenges, the paper presents a novel Profile Template for characterizing telemeetings from the holistic perspective endorsed in this paper

    Experiential Perspectives on Sound and Music for Virtual Reality Technologies

    This thesis examines the intersection of sound, music, and virtuality within current and next-generation virtual reality technologies, with a specific focus on exploring the experiential perspectives of users and participants within virtual experiences. The first half of the thesis constructs a new theoretical model for examining intersections of sound and virtual experience. In Chapter 1, a new framework for virtual experience is constructed consisting of three key elements: virtual hardware (e.g., displays, speakers); virtual software (e.g., rules and systems of interaction); and virtual externalities (i.e., physical spaces used for engaging in virtual experiences). Through using and applying this new model, methodical examinations of complex virtual experiences are possible. Chapter 2 examines the second axis of the thesis through constructing an understanding of how sound is designed, implemented, and received within virtual reality. The concept of soundscapes is explored in the context of experiential perspectives, serving as a useful approach for describing received auditory phenomena. Auditory environments are proposed as a new model for exploring how auditory phenomena can be broadcast to audiences. Chapter 3 explores how inauthenticity within sound can impact users in virtual experience and uses authenticity to critically examine challenges surrounding sound in virtual reality. Constructions of authenticity in music performance are used to illustrate how authenticity is constructed within virtual experience. Chapter 4 integrates music into the understanding of auditory phenomena constructed throughout the thesis: music is rarely part of the created world in a virtual experience. Rather, it is typically something which only the audience – as external observers of the created world – can hear. 
Therefore, music within immersive virtual reality may be challenging as the audience is placed within the created world. The second half of this thesis uses this theoretical model to consider contemporary and future approaches to virtual experiences. Chapter 5 constructs a series of case studies to demonstrate the use of the framework as a trans-medial and intra/inter-contextual tool of analysis. Through use of the framework, varying approaches to implementation of sound and music in virtual reality technologies are considered, which reveals trans-medial commonalities of immersion and engagement with virtual experiences through sound. Chapter 6 examines near-future technologies, including brain-computer interfaces and other full-immersion technologies, to identify key issues in the design and implementation of future virtual experiences and suggest how interdisciplinary collaboration may help to develop solutions to these issues. Chapter 7 considers how the proposed model for virtuality might allow for methodical examination of similar issues within other fields, such as acoustics and architecture, and examines the ethical considerations that may become relevant as virtual technology develops within the 21st Century. This research explores and rationalises theoretical models of virtuality and sound. This permits designers and developers to improve the implementation of sound and music in virtual experiences for the purpose of improving user outcomes.

    Understanding hearing aid sound quality for music-listening

    To improve speech intelligibility for individuals with hearing loss, hearing aids amplify speech using gains derived from evidence-based prescriptive methods, in addition to other advanced signal processing mechanisms. While the evidence supports the use of hearing aid signal processing for speech intelligibility, these signal processing adjustments can also be detrimental to hearing aid sound quality, with poor hearing aid sound quality cited as a barrier to device adoption. Poor sound quality is also of concern for music-listening, in which intelligibility is likely not a consideration. A series of electroacoustic and behavioural studies were conducted to study sound quality issues in hearing aids, with a focus on music. An objective sound quality metric was validated for real hearing aid fittings, enabling researchers to predict sound quality impacts of signal processing adjustments. Qualitative interviews with hearing aid user musicians revealed that users’ primary concern was understanding the conductor’s speech during rehearsals, with hearing aid music sound quality issues a secondary concern. However, reported sound quality issues were consistent with music-listening sound quality complaints in the literature. Therefore, follow-up experiments focused on sound quality issues. An examination of different manufacturers’ hearing aids revealed significant music sound quality preferences for some devices over others. Electroacoustic measurements on these devices revealed that bass content varied more between devices than levels in other spectral ranges or nonlinearity, and increased bass levels were most associated with improved sound quality ratings. In a sound quality optimization study, listeners increased the bass and reduced the treble relative to typically-prescribed gains, for both speech and music. However, adjustments were smaller in magnitude for speech compared to music because they were also associated with a decline in speech intelligibility. 
These findings support increasing bass and reducing treble to improve hearing aid music sound quality, but only to the degree that speech intelligibility is not compromised. Future research is needed on the prediction of hearing aid music quality, the provision of low-frequency gain in open-fit hearing aids, genre-specific adjustments, hearing aid compression and music, and direct-to-consumer technology.

    Sonic Interactions in Virtual Environments

    This open access book tackles the design of 3D spatial interactions from an audio-centered and audio-first perspective, providing the fundamental notions related to the creation and evaluation of immersive sonic experiences. The key elements that enhance the sensation of place in a virtual environment (VE) are: immersive audio (the computational aspects of the acoustical-space properties of Virtual Reality (VR) technologies); sonic interaction (the human-computer interplay through auditory feedback in VE); and VR systems (which naturally support multimodal integration, impacting different application domains). Sonic Interactions in Virtual Environments will feature state-of-the-art research on real-time auralization, sonic interaction design in VR, quality of the experience in multimodal scenarios, and applications. Contributors and editors include interdisciplinary experts from the fields of computer science, engineering, acoustics, psychology, design, humanities, and beyond. Their mission is to shape an emerging new field of study at the intersection of sonic interaction design and immersive media, embracing an archipelago of existing research spread across different audio communities, and to increase awareness among VR communities, researchers, and practitioners of the importance of sonic elements when designing immersive environments.

    Flow control of real-time unicast multimedia applications in best-effort networks

    One of the fastest growing segments of Internet applications is real-time multimedia applications, like Voice over Internet Protocol (VoIP). Real-time multimedia applications use the User Datagram Protocol (UDP) as the transport protocol because of the inherent conservative nature of the congestion avoidance schemes of Transmission Control Protocol (TCP). The effects of uncontrolled flows on the Internet have not yet been felt because UDP traffic frequently constitutes only approximately 20% of the total Internet traffic. It is pertinent that real-time multimedia applications become better citizens of the Internet, while at the same time delivering acceptable Quality of Service (QoS). Traditionally, packet losses and the increase in the end-to-end delay experienced by some of the packets characterize congestion in the network. These two signals have been used to develop most known flow control schemes. The current research considers the flow accumulation in the network as the signal for use in flow control. The most significant contribution of the current research is to propose novel end-to-end flow control schemes for unicast real-time multimedia flows transmitting over best-effort networks. These control schemes are based on predictive control of the accumulation signal. The end-to-end control schemes available in the literature are based on reactive control, which takes into account neither the feedback delay existing between the sender and the receiver nor the forward delay in the flow dynamics. The performance of the proposed control schemes has been evaluated using the ns-2 simulation environment. The research concludes that active control of hard real-time flows delivers QoS equal to or somewhat better than that of High Bit Rate (HBR, no control), but with a lower average bit rate. Consequently, it helps reduce bandwidth use of controlled real-time flows by between 31.43% and 43.96%. Proposed reactive control schemes deliver good QoS.
However, they do not scale up as well as the predictive control schemes. Proposed predictive control schemes are effective in delivering good QoS while using less bandwidth than even the reactive control schemes. They scale up well as more real-time multimedia flows start employing them.
