
    A Spiking Neural Network Model of the Medial Superior Olive Using Spike Timing Dependent Plasticity for Sound Localization

    Sound localization can be defined as the ability to identify the position of an input sound source and is considered a powerful aspect of mammalian perception. For low-frequency sounds, i.e., in the range 270 Hz–1.5 kHz, the mammalian auditory pathway achieves this by extracting the Interaural Time Difference (ITD) between the sound signals received by the left and right ears. This processing is performed in a region of the brain known as the Medial Superior Olive (MSO). This paper presents a Spiking Neural Network (SNN) based model of the MSO. The network model is trained with the Spike Timing Dependent Plasticity (STDP) learning rule on experimentally observed Head Related Transfer Function data from an adult domestic cat. The results demonstrate that the proposed SNN model can perform sound localization with an accuracy of 91.82% when an error tolerance of ±10° is used. For angular resolutions down to 2.5°, software-based simulations of the model are shown to incur significant computation times. The paper therefore also presents a preliminary implementation on a Field Programmable Gate Array (FPGA) based hardware platform to accelerate system performance.
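    To make the two mechanisms named in this abstract concrete, the sketch below shows (a) ITD estimation by cross-correlating the left- and right-ear signals and (b) a standard pair-based STDP weight update. Both are generic textbook forms given for orientation only; the constants and function names are illustrative, and the paper's actual network topology is not reproduced here.

```python
import numpy as np

def estimate_itd(left, right, fs):
    """Estimate the ITD (seconds) as the lag that maximises the
    cross-correlation between the left- and right-ear signals."""
    xcorr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(xcorr)) - (len(right) - 1)
    return lag / fs

def stdp_dw(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Pair-based STDP weight change for one pre/post spike pair
    (times in ms). Constants are illustrative, not the paper's."""
    dt = t_post - t_pre
    if dt >= 0:                                # pre leads post: potentiate
        return a_plus * np.exp(-dt / tau_ms)
    return -a_minus * np.exp(dt / tau_ms)      # post leads pre: depress
```

    In an MSO-style SNN, the explicit cross-correlation is presumably replaced by coincidence-detecting neurons whose inputs STDP tunes; the closed-form estimator above is only a reference point.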

    Calibration of sound source localisation for robots using multiple adaptive filter models of the cerebellum

    The aim of this research was to investigate the calibration of Sound Source Localisation (SSL) for robots using the adaptive filter model of the cerebellum, and how this could be automatically adapted for multiple acoustic environments. The role of the cerebellum has mainly been identified in the context of motor control, and only in recent years has it been recognised to have a wider role in the senses and cognition. The adaptive filter model of the cerebellum has been successfully applied to a number of robotics applications, but so far none involving the auditory sense. Multiple-model frameworks such as MOdular Selection And Identification for Control (MOSAIC) have also been developed in the context of motor control, and these inspired the adaptation of audio calibration across multiple acoustic environments; again, the application of this approach to the auditory sense is completely new. The thesis showed that it was possible to calibrate the output of an SSL algorithm using the adaptive filter model of the cerebellum, improving performance compared to the uncalibrated SSL. Using an adaptation of the MOSAIC framework, and specifically using responsibility estimation, a system was developed that could select an appropriate set of cerebellar calibration models and combine their outputs in proportion to how well each was able to calibrate, improving the SSL estimate in multiple acoustic contexts, including novel ones. The thesis also developed a responsibility predictor, also part of the MOSAIC framework, which improved the robustness of the system to abrupt changes in context that could otherwise have resulted in a large performance error. Responsibility prediction also improved robustness to missing ground truth, which can occur in challenging environments where sensory feedback of ground truth becomes impaired; this scenario has not been addressed in the MOSAIC literature, adding to the novelty of the thesis. The utility of the so-called cerebellar chip was further demonstrated through the development of a responsibility predictor based on the adaptive filter model of the cerebellum, rather than on the more conventional function-fitting neural network used in the literature. Lastly, it was demonstrated that the multiple cerebellar calibration architecture is capable of limited self-organisation from a de novo state with a predetermined number of models, and that the responsibility predictor could learn against its model after self-organisation and, to a limited extent, during self-organisation. The thesis addresses an important question of how a robot could improve its ability to listen in multiple, challenging acoustic environments, and recommends future work to develop this ability.
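    Responsibility estimation has a standard form in the MOSAIC literature: each model's responsibility is a softmax of its prediction likelihood under a Gaussian error model, and the models' outputs are blended in proportion. The sketch below shows that generic form; the thesis's exact formulation, the function names, and the sigma value are assumptions here, not taken from the thesis.

```python
import numpy as np

def responsibilities(pred_errors, sigma=1.0):
    """MOSAIC-style responsibilities: softmax of each model's
    prediction likelihood under a Gaussian error model."""
    ll = -np.square(np.asarray(pred_errors, dtype=float)) / (2.0 * sigma ** 2)
    w = np.exp(ll - ll.max())      # subtract max for numerical stability
    return w / w.sum()

def blended_output(model_outputs, pred_errors, sigma=1.0):
    """Combine the models' outputs weighted by their responsibilities."""
    lam = responsibilities(pred_errors, sigma)
    return lam @ np.asarray(model_outputs, dtype=float)
```

    A responsibility predictor, as developed in the thesis, would supply a prior estimate of these weights from contextual cues before prediction errors are available, which is what lends robustness to abrupt context changes and missing ground truth.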

    Étude des mécanismes de localisation auditive et de leur plasticité dans le cortex auditif humain [Study of the mechanisms of auditory localization and their plasticity in the human auditory cortex]

    Spatial hearing is an important but complex capacity of the auditory system. The human auditory system infers the location of a sound source from a variety of acoustic cues, known as auditory localization cues. Because these cues depend to some degree on morphological and environmental factors that cannot be predicted by the genetic makeup, their processing has to be fine-tuned during development. Even in adulthood, some plasticity in the processing of localization cues remains. This plasticity has been studied behaviorally, but very little is known about its neural correlates and mechanisms. The present research aimed to investigate this plasticity, as well as the encoding mechanisms of the auditory localization cues, using both behavioral and neuroimaging techniques. In the first two studies, we shifted the perception of horizontal auditory space using digital earplugs. We showed that young adults rapidly adapt to a large perceived shift and that adaptation is accompanied by changes in hemispheric lateralization of auditory cortex activity, as observed with high-resolution functional MRI. We also confirmed the hypothesis of a hemifield code for the representation of horizontal sound source location in the human auditory cortex. In a third study, we modified the major cue for vertical space perception using silicone earmolds and showed that adaptation to this modification was not followed by any aftereffect upon earmold removal, even at the very first sound presentation. This result is consistent with the hypothesis of a “many-to-one mapping” mechanism in which several spectral profiles can become associated with a given spatial direction. In a fourth study, using functional MRI and taking advantage of the adaptation to silicone earmolds, we revealed the encoding of sound source elevation in the human auditory cortex.

    Wave Field Synthesis in a listening room

    This thesis investigates the influence of the listening room on sound fields synthesised by Wave Field Synthesis. Methods are developed that allow for the investigation of the spatial and timbral perception of Wave Field Synthesis in a reverberant environment, using listening experiments based on simulation by binaural synthesis and room acoustical simulation. The results can serve as guidelines for the design of listening rooms for Wave Field Synthesis.
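    For orientation, the core of a Wave Field Synthesis driving function for a point source is a per-loudspeaker delay and amplitude weight derived from the source-to-loudspeaker distance. The sketch below shows only that delay/amplitude core under stated simplifications: the spectral pre-filter and the edge-tapering window of a full 2.5D driving function are omitted, and nothing here is taken from this thesis.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, in air

def wfs_delays_gains(src, spk_positions):
    """Per-loudspeaker delays (s) and distance-based gains for a point
    source behind the array: the delay/amplitude core of a simplified
    WFS driving function (no pre-filter, no tapering)."""
    r = np.linalg.norm(np.asarray(spk_positions, dtype=float)
                       - np.asarray(src, dtype=float), axis=1)
    return r / SPEED_OF_SOUND, 1.0 / np.sqrt(r)
```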

    Aspects of room acoustics, vision and motion in the human auditory perception of space

    The human sense of hearing contributes to the awareness of where sound-generating objects are located in space and of the environment in which the hearing individual is located. This auditory perception of space interacts in complex ways with our other senses, can be both disrupted and enhanced by sound reflections, and includes safety mechanisms which have evolved to protect our lives but can also mislead us. This dissertation explores selected topics from this wide subject area, mostly by testing the abilities and subjective judgments of human listeners in virtual environments. Reverberation is the gradually decaying persistence of sounds in an enclosed space which results from repeated sound reflections at surfaces. The first experiment (Chapter 2) compared how strongly people perceived reverberation in different visual situations: when they could see the room and the source which generated the sound; when they could see some room and some sound source, but the image did not match what they heard; and when they could not see anything at all. There were no indications that the visual image had any influence on this aspect of room-acoustical perception. The potential benefits of motion for judging the distance of sound sources were the focus of the second study (Chapter 3), which consists of two parts. In the first part, loudspeakers were placed at different depths in front of sitting listeners who, on command, had to either remain still or move their upper bodies sideways. This experiment demonstrated that humans can exploit motion parallax (the effect that closer objects appear to move faster past a moving observer than farther objects) with their ears and not just with their eyes. The second part combined a virtualisation of such sound sources with a motion platform to show that the listeners’ interpretation of this auditory motion parallax was better when they performed the lateral movement themselves than when they were moved by the apparatus or were not actually in motion at all. Two more experiments were concerned with the perception of sounds that become louder over time. These have been called “looming”, as the source of such a sound might be on a collision course. One of the studies (Chapter 4) showed that western diamondback rattlesnakes (Crotalus atrox) increase the vibration speed of their rattle in response to the approach of a threatening object. It also demonstrated that human listeners perceive (virtual) snakes which engage in this behaviour as especially close, causing them to keep a greater margin of safety than they would otherwise. The other study (section 5.6) was concerned with the well-known looming bias of the sound localisation system, a phenomenon which leads to a sometimes exaggerated, sometimes more accurate perception of approaching compared to receding sounds. It attempted to find out whether this bias is affected by whether listeners hear such sounds in a virtual enclosed space or in an environment with no sound reflections. While the results were inconclusive, this experiment is noteworthy as a proof of concept: it was the first study to make use of a new real-time room-acoustical simulation system, liveRAZR, which was developed as part of this dissertation (Chapter 5).
    Finally, while humans have been more often studied for their unique abilities to communicate with each other, and bats for their extraordinary capacity to locate objects by sound, this dissertation turns this setting of priorities on its head with the last paper (Chapter 6): based on recordings of six pale spear-nosed bats (Phyllostomus discolor), it is a survey of the identifiably distinct vocalisations observed in their social interactions, along with a description of the different situations in which they typically occur.

    Using auditory augmented reality to understand visual scenes

    Locating objects in space is typically thought of as a visual task. However, not everyone has access to visual information, such as people who are blind. The purpose of this thesis was to investigate whether it is possible to convert visual events into spatial auditory cues. A neuromorphic retina was used to collect visual events, and custom software was written to augment the scene with auditory localization cues. The neuromorphic retina is engineered to encode data similarly to the dorsal visual pathway, which is associated with fast, non-redundant information encoding and is thought to drive attentional shifting, especially in the presence of visual transients. The intent was to create a device capable of using these visual onsets and transients to generate spatial auditory cues. To achieve this, the device uses the core principles driving auditory localization, with a focus on the interaural time and level difference cues, which are thought to be responsible for encoding azimuthal location in space. Results demonstrate the usefulness of such a device, although personalization will probably improve the effectiveness of the cues generated. In summary, I have created a device that converts purely visual events into useful auditory cues for localization, thereby granting perception of stimuli that may otherwise have been inaccessible to the user.
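    The abstract does not give the cue-generation formulas, but a common way to render an azimuth as interaural time and level differences is the spherical-head (Woodworth) ITD approximation together with a simple sinusoidal ILD weighting. The sketch below makes those assumptions explicit; the head radius, the 6 dB maximum ILD, and the function names are illustrative rather than taken from the thesis.

```python
import numpy as np

HEAD_RADIUS = 0.0875     # m, a commonly assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s

def woodworth_itd(azimuth_rad):
    """Spherical-head (Woodworth) ITD approximation."""
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (np.sin(azimuth_rad) + azimuth_rad)

def spatialise(mono, fs, azimuth_rad, max_ild_db=6.0):
    """Render a mono event as a stereo pair carrying ITD and ILD cues.
    azimuth_rad > 0 places the source to the listener's right."""
    delay = int(round(abs(woodworth_itd(azimuth_rad)) * fs))  # ITD in samples
    ild_db = max_ild_db * abs(np.sin(azimuth_rad))            # ILD in dB
    near = np.concatenate([mono, np.zeros(delay)])            # leading ear
    far = np.concatenate([np.zeros(delay), mono]) * 10.0 ** (-ild_db / 20.0)
    return (far, near) if azimuth_rad > 0 else (near, far)    # (left, right)
```

    As the abstract notes, such generic cues would likely benefit from personalization, e.g. fitting the head radius or substituting individually measured HRTFs.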

    Auditory fitness for duty: localising small arms gunfire

    Locating the source of small arms fire is deemed a mission-critical auditory task by infantry personnel (Bevis et al. 2014; Semeraro et al. 2015), yet little is known about the acoustic localisation cues within a gunshot or about human ability to localise gunshots. Binaural recordings of ‘live’ gunshots from an SA80 rifle were obtained using a KEMAR dummy head placed 100 m from the firer, within 30 cm of the bullet trajectory, at 13 azimuth angles from 90° left to 90° right. The ‘crack’, created by the supersonic bullet passing the target, produced smaller interaural time and level differences than the ‘thump’, created by the muzzle blast, for the rifle at the same angle. Forty normal-hearing listeners (20 civilian, 20 military personnel) and 12 hearing-impaired listeners (all military personnel) completed a virtual azimuthal localisation task using three stimuli created from the recordings (whole gunshot, ‘crack’ only, and ‘thump’ only) plus a 50 ms broadband noise burst convolved with KEMAR impulse responses. All listeners localised all stimulus types above chance level. Average localisation error increased in the order noise burst < thump < gunshot < crack for all cohorts. Military personnel (regardless of their hearing level) performed significantly worse than civilians for all stimuli; they had a higher tendency to select the extreme left and right sources, resulting in an increased lateral bias. The difference between military and civilian participants may be due to their understanding of the task or to military training/experience. Mild-to-moderate bilateral symmetrical sensorineural hearing loss did not have a significant impact on localisation accuracy. This suggests that, provided the gunshot is clearly audible and audiometric thresholds are equal between the ears, binaural cues will remain accessible and localisation accuracy will be preserved. Further work is recommended to investigate the relationship between other hearing loss configurations and small arms gunshot localisation accuracy before gunshot localisation is considered as a measure of auditory fitness for infantry personnel.

    Sensitivity to interaural timing differences within the envelopes of acoustic waveforms

    Interaural timing differences (ITDs) are a cue for sound-source localisation and can be conveyed in the temporal fine structure (TFS) of low-frequency tones or in the envelope of high-frequency, amplitude-modulated sounds such as sinusoidally amplitude-modulated (SAM) and transposed tones. Sensitivity to these cues has been measured in human psychophysical experiments, which have revealed that the transposed tone elicits just-noticeable differences (JNDs) in ITD that are equivalent to those of low-frequency pure tones when the modulation frequency is below 512 Hz. At modulation frequencies above 512 Hz, performance rapidly declines for the transposed tone, while sensitivity to ITDs in pure tones is robust until around 1200 Hz. Furthermore, transposed tones elicit smaller JNDs than SAM tones. In the present study, ITD JNDs are assessed psychophysically for pure tones and transposed tones using off-midline reference locations. The results demonstrate that frequency, whether the ITD is conveyed in the TFS or the envelope, and location all have a significant effect on human ITD JNDs, and suggest that a difference exists in how ITDs are coded neuronally when conveyed by high- versus low-frequency sounds. ITD-sensitive neurons located within several brainstem nuclei display a high degree of phase-locking to both the TFS of low-frequency pure tones and the envelopes of SAM and transposed tones. Echoing the psychophysical findings, phase-locking to the waveform envelope at low modulation frequencies is equivalent to that of low-frequency pure tones, while declining at high rates of modulation, to a lesser degree for transposed tones than for SAM tones. To assess factors critical to the localisation of high-frequency sounds, a series of electrophysiology experiments was conducted. Recordings were made from single neurons within the inferior colliculus of the guinea pig in response to ITDs conveyed by 18 unique envelope shapes, to evaluate how the envelope segments (Pause, Attack, Sustain, and Decay) each affect ITD JNDs. Amplitude modulations with envelope shapes comprising relatively long Pause but short Attack durations were found to elicit the greatest ITD discrimination of high-frequency sounds.
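    A transposed tone, the stimulus central to this abstract, is conventionally constructed by half-wave rectifying a low-frequency modulator, low-pass filtering it, and imposing the result on a high-frequency carrier, so that the envelope delivers the timing information that the TFS of a low-frequency tone would. A sketch under that standard construction follows; the filter order and cutoff here are illustrative choices, not this study's parameters.

```python
import numpy as np
from scipy.signal import butter, lfilter

def transposed_tone(fm, fc, fs=48000, dur=0.5, cutoff=2000.0):
    """Transposed tone: half-wave-rectify a modulator at fm (Hz),
    low-pass filter it, and impose it on a carrier at fc (Hz)."""
    t = np.arange(int(fs * dur)) / fs
    env = np.maximum(np.sin(2 * np.pi * fm * t), 0.0)  # half-wave rectification
    b, a = butter(4, cutoff / (fs / 2))                # 4th-order low-pass
    env = lfilter(b, a, env)
    return env * np.sin(2 * np.pi * fc * t)
```

    Delaying one ear's copy of, say, transposed_tone(128, 4000) relative to the other then conveys a purely envelope-borne ITD on a 4 kHz carrier, the kind of stimulus manipulated in the experiments above.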

    Role of the Cochlea and Efferent System in Children with Auditory Processing Disorder

    Auditory processing disorder (APD) is characterized by difficulty listening in noisy environments despite normal hearing thresholds. APD was previously thought to be restricted to deficits in the central auditory system. The current work sought to investigate brainstem and peripheral mechanisms that may contribute to difficulties in speech understanding in noise in children with suspected APD (sAPD). Three mechanisms in particular were investigated: cochlear tuning, efferent function, and spatial hearing. Cochlear tuning was measured using stimulus frequency otoacoustic emission (SFOAE) group delay. Results indicate that children with suspected APD have atypically sharp cochlear tuning and reduced medial olivocochlear (MOC) functioning. Sharper-than-typical cochlear tuning may lead to increased forward masking. On the contrary, binaural efferent function probed with a forward-masked click-evoked OAE (CEOAE) paradigm indicated that MOC function was not different between typically developing (TD) children and children with suspected APD. A third study with multiple OAE types sought to address this contradiction. Despite numerically smaller MOC inhibition in the sAPD group, MOC function was not significantly different between the two groups. Finally, spatial release from masking, localization-in-noise, and interaural time difference thresholds were compared in TD children and children with sAPD. Results indicate no significant difference in spatial hearing abilities between the two groups. The non-significant group-level findings in these studies may be related to the large heterogeneity in problems associated with APD. Fragmentation of APD into deficit-specific disorders may facilitate research into the specific anatomical underpinnings of listening problems in APD. Prior to conducting the studies in children, three studies were conducted to optimize stimulus characteristics. Results of these studies indicate that the MOC may not be especially sensitive to 100 Hz amplitude modulation, as previously reported. Click presentation rates >25 Hz activate the ipsilateral MOC reflex in typical MOC assays, contaminating contralateral MOC inhibition of CEOAEs. Finally, localization-in-noise abilities of TD children are on par with those of adults for a white-noise masker, but not for speech babble. This finding suggests that despite maturation of the physiological mechanisms required to localize in noise, non-auditory factors may restrict children's ability to process complex signals.
