Perception of Reverberation in Domestic and Automotive Environments
Application of sound source separation methods to advanced spatial audio systems
This thesis is related to the field of Sound Source Separation (SSS). It addresses the development
and evaluation of these techniques for their application in the resynthesis of high-realism sound scenes by
means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in two-channel
stereo format, special up-converters are required to use advanced spatial audio reproduction formats,
such as WFS. This is due to the fact that WFS needs the original source signals to be available, in order to
accurately synthesize the acoustic field inside an extended listening area. Thus, an object-based mixing is
required.
Source separation problems in digital signal processing are those in which several signals have been mixed
together and the objective is to find out what the original signals were. Therefore, SSS algorithms can be applied
to existing two-channel mixtures to extract the different objects that compose the stereo scene. Unfortunately,
most stereo mixtures are underdetermined, i.e., there are more sound sources than audio channels. This
condition makes the SSS problem especially difficult, and stronger assumptions have to be made, often related to
the sparsity of the sources under some signal transformation.
This thesis is focused on the application of SSS techniques to the spatial sound reproduction field. As a result,
its contributions can be categorized within these two areas. First, two underdetermined SSS methods are
proposed to deal efficiently with the separation of stereo sound mixtures. These techniques are based on a
multi-level thresholding segmentation approach, which enables fast, unsupervised separation of
sound sources in the time-frequency domain. Although both techniques rely on the same clustering type, the
features considered by each of them are related to different localization cues that enable the separation
of either instantaneous or real mixtures. Additionally, two post-processing techniques aimed at
improving the isolation of the separated sources are proposed. The performance achieved by
several SSS methods in the resynthesis of WFS sound scenes is afterwards evaluated by means of
listening tests, paying special attention to the change observed in the perceived spatial attributes.
Although the estimated sources are distorted versions of the original ones, the masking effects
involved in their spatial remixing make artifacts less perceptible, which improves the overall
assessed quality. Finally, some novel developments related to the application of time-frequency
processing to source localization and enhanced sound reproduction are presented.

Cobos Serrano, M. (2009). Application of sound source separation methods to advanced spatial audio systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8969
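The sparsity-based time-frequency masking on which such stereo up-mixing rests can be sketched as follows. This is a toy illustration with fixed, evenly spaced thresholds on a panning cue, not the thesis's adaptive multi-level thresholding method; all parameter choices here are illustrative.

```python
import numpy as np
from scipy.signal import stft, istft

def pan_mask_separate(left, right, fs, n_sources=2, eps=1e-12):
    """Toy sparsity-based stereo separation: classify each time-frequency
    bin by a panning cue, then mask the mixture spectrogram. Fixed thresholds
    stand in for the thesis's adaptive multi-level thresholding."""
    _, _, L = stft(left, fs, nperseg=1024)
    _, _, R = stft(right, fs, nperseg=1024)
    # Panning cue in [0, 1]: 0 = fully left-panned bin, 1 = fully right.
    cue = np.abs(R) / (np.abs(L) + np.abs(R) + eps)
    # Assign every bin to one of n_sources evenly spaced cue classes.
    idx = np.digitize(cue, np.linspace(0.0, 1.0, n_sources + 1)[1:-1])
    sources = []
    for k in range(n_sources):
        # Binary mask applied to the mono downmix spectrogram, then resynthesis.
        _, est = istft(0.5 * (L + R) * (idx == k), fs)
        sources.append(est)
    return sources
```

Because disjoint (sparse) time-frequency supports are assumed, each bin is attributed to exactly one source; this is exactly the underdetermined-mixture assumption discussed above.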
Spatial Multizone Soundfield Reproduction Design
It is desirable for people sharing a physical space to access different multimedia information streams simultaneously. For a good user experience, the interference between the different streams should be held to a minimum. This is straightforward for the video component, but currently difficult for the audio component. Spatial multizone soundfield reproduction, which aims to provide an individual sound environment to each of a set of listeners without the use of physical isolation or headphones, has drawn significant attention from researchers in recent years. The realization of multizone soundfield reproduction is a conceptually challenging problem, as most current soundfield reproduction techniques concentrate on a single zone.
This thesis considers the theory and design of a multizone soundfield reproduction system using arrays of loudspeakers in given complex environments. We first introduce a novel method for spatial multizone soundfield reproduction based on describing the desired multizone soundfield as an orthogonal expansion of formulated basis functions over the desired reproduction region. This provides the theoretical basis of both 2-D (height-invariant) and 3-D soundfield reproduction for this work. We then extend the reproduction of the multizone soundfield over the desired region to reverberant environments, based on the identification of the acoustic transfer functions (ATFs) from the loudspeakers over the desired reproduction region using sparse methods. The simulation results confirm that the method leads to a significantly reduced number of required microphones for accurate multizone sound reproduction compared with the state of the art, while it also facilitates reproduction over a wide frequency range.
In addition, we focus on improvements of the proposed multizone reproduction system with regard to practical implementation. The so-called 2.5D multizone soundfield reproduction is considered to accurately reproduce the desired multizone soundfield over a selected 2-D plane at a height approximately level with the listener's ears, using a single array of loudspeakers in 3-D reverberant settings. Then, we propose an adaptive reverberation cancelation method for the multizone soundfield reproduction within the desired region and simplify the prior soundfield measurement process. Simulation results suggest that the proposed method provides a faster convergence rate than the comparative approaches under the same hardware provision. Finally, we conduct the real-world implementation based on the proposed theoretical work. The experimental results show that we can achieve a very noticeable acoustic energy contrast between the signals recorded in the bright zone and the quiet zone, especially for the system implementation with reverberation equalization.
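The bright-zone/quiet-zone energy contrast mentioned above can be illustrated with the classic acoustic contrast control formulation. This is a free-field monopole sketch, not the sparse reverberant-ATF method of the thesis; the geometry and parameter names are made up for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def green(src, rcv, k):
    """Free-field monopole transfer matrix between source and receiver point sets."""
    r = np.linalg.norm(rcv[:, None, :] - src[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def acoustic_contrast_weights(spk, bright, quiet, k, reg=1e-6):
    """Loudspeaker weights maximizing the bright/quiet-zone energy ratio:
    the dominant eigenvector of the generalized eigenproblem A w = lambda B w."""
    Gb, Gq = green(spk, bright, k), green(spk, quiet, k)
    A = Gb.conj().T @ Gb                           # bright-zone energy kernel
    B = Gq.conj().T @ Gq + reg * np.eye(len(spk))  # quiet-zone energy + regularization
    _, vecs = eigh(A, B)                           # eigenvalues in ascending order
    return vecs[:, -1], Gb, Gq

def contrast_db(w, Gb, Gq):
    """Mean-energy contrast between the two zones, in dB."""
    eb = np.mean(np.abs(Gb @ w) ** 2)
    eq = np.mean(np.abs(Gq @ w) ** 2)
    return 10 * np.log10(eb / eq)
```

With more loudspeakers than quiet-zone control points, the quiet-zone energy can be driven very low, which is why the regularization term is needed to bound the driving effort.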
Implementation of an Autonomous Impulse Response Measurement System
Data collection is crucial for researchers, as it can provide important insights for describing phenomena. In acoustics, acoustic phenomena are characterized by Room Impulse Responses (RIRs) occurring when sound propagates in a room. Room impulse responses are needed in vast quantities for various reasons, including the prediction of acoustical parameters and the rendering of virtual acoustical spaces. Recently, mobile robots navigating within indoor spaces have become increasingly used to acquire information about their environment. However, little research has attempted to utilize robots for the collection of room acoustic data.
This thesis presents an adaptable automated system to measure room impulse responses in multi-room environments, using mobile and stationary measurement platforms. The system, known as the Autonomous Impulse Response Measurement System (AIRMS), is divided into two stages: data collection and post-processing. To automate data collection, a mobile robotic platform was developed to perform acoustic measurements within a room. The robot was equipped with spatial microphones, multiple loudspeakers and an indoor localization system, which reported the real-time location of the robot. Additionally, stationary platforms were installed in specific locations inside and outside the room. The mobile and stationary platforms communicated wirelessly with one another to perform the acoustical tests systematically. Since a major requirement of the system is adaptability, researchers can define the elements of the system according to their needs, including the mounted equipment and the number of platforms. Post-processing included the extraction of sine sweeps and the calculation of impulse responses. Extraction of the sine sweeps refers to the process of framing every acoustical test signal from the raw recordings. These signals are then processed to calculate the room impulse responses. The automatically collected information was complemented with manually produced data, which included a rendered 3D model of the room and a panoramic picture.
The performance of the system was tested under two conditions: a single-room and a multi-room setting. Room impulse responses were calculated for each of the test conditions, representing typical characteristics of the signals and showing the effects of the proximity of sources and receivers, as well as the presence of boundaries. This prototype produces RIR measurements in a fast and reliable manner.
Although some shortcomings were noted in the compact loudspeakers used to produce the sine sweeps and in the accuracy of the indoor localization system, the proposed autonomous measurement system yielded reasonable results. Future work could expand the number of impulse response measurements in order to further refine the artificial intelligence algorithms.
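The sweep-to-impulse-response step described above is commonly performed with Farina's exponential sine sweep deconvolution; a minimal sketch follows. The abstract does not specify the AIRMS processing chain, so the signal parameters and function names here are illustrative.

```python
import numpy as np
from scipy.signal import fftconvolve

def ess(f1, f2, dur, fs):
    """Exponential sine sweep plus its amplitude-compensated inverse filter
    (time-reversed sweep with a -6 dB/octave envelope), after Farina."""
    t = np.arange(int(dur * fs)) / fs
    R = np.log(f2 / f1)  # log frequency ratio of the sweep
    sweep = np.sin(2 * np.pi * f1 * dur / R * (np.exp(t * R / dur) - 1))
    inv = sweep[::-1] * np.exp(-t * R / dur)  # whitens the sweep's pink spectrum
    return sweep, inv

def impulse_response(recording, inv):
    """Deconvolve a room recording of the sweep: convolving with the inverse
    filter compresses the sweep into an impulse, yielding the (scaled) RIR."""
    h = fftconvolve(recording, inv)
    return h / np.max(np.abs(h))
```

The linear impulse response appears around sample `len(sweep) - 1` of the deconvolved signal; harmonic distortion products, one advantage of the exponential sweep, land earlier and can be windowed out.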
Effects of errorless learning on the acquisition of velopharyngeal movement control
Session 1pSC - Speech Communication: Cross-Linguistic Studies of Speech Sound Learning of the Languages of Hong Kong (Poster Session)

The implicit motor learning literature suggests a benefit for learning if errors are minimized during practice. This study investigated whether the same principle holds for learning velopharyngeal movement control. Normal speaking participants learned to produce hypernasal speech in either an errorless learning condition (in which the possibility for errors was limited) or an errorful learning condition (in which the possibility for errors was not limited). The nasality level of the participants' speech was measured by nasometer and reflected by nasalance scores (in %). Errorless learners practiced producing hypernasal speech with a threshold nasalance score of 10% at the beginning, which gradually increased to a threshold of 50% at the end. The same set of threshold targets was presented to errorful learners, but in reversed order. Errors were defined by the proportion of speech with a nasalance score below the threshold. The results showed that, relative to errorful learners, errorless learners displayed fewer errors (50.7% vs. 17.7%) and a higher mean nasalance score (31.3% vs. 46.7%) during the acquisition phase. Furthermore, errorless learners outperformed errorful learners in both retention and novel transfer tests. Acknowledgment: Supported by The University of Hong Kong Strategic Research Theme for Sciences of Learning. © 2012 Acoustical Society of America
Implementation of the Radiation Characteristics of Musical Instruments in Wave Field Synthesis Applications
In this thesis a method to implement the radiation characteristics of musical instruments in wave field synthesis systems is developed. It is applied and tested in two loudspeaker systems. Because the loudspeaker systems have a comparably low number of loudspeakers, the wave field is synthesized at discrete listening positions by solving a linear equation system. Thus, for every constellation of listening and source position, all loudspeakers can be used for the synthesis. The calculations are done in the spectral domain, neglecting sound propagation velocity at first. This approach causes artefacts in the loudspeaker signals and synthesis errors in the listening area, which are compensated by means of psychoacoustic methods. With these methods, the aliasing frequency is determined by the extent of the listening area, whereas in other wave field synthesis systems it is determined by the distance of adjacent loudspeakers. Musical instruments are simplified as complex point sources in order to capture, store and propagate their radiation characteristics. This method is the basis of the newly developed "Radiation Method", which improves the matrix conditioning of the equation system and the precision of the wave field synthesis by implementing the radiation characteristics of the driven loudspeakers. In this work, the "Minimum Energy Method", originally developed for acoustic holography, is applied to wave field synthesis for the first time. It guarantees a robust solution and creates softer loudspeaker driving signals than the Radiation Method, but yields a worse approximation of the wave field beyond the discrete listening positions. Psychoacoustic considerations allow for a successful wave field synthesis: integration times of the auditory system determine the spatial dimensions in which the wave field synthesis approach works despite different arrival times and directions of wave fronts.
By separating the spectrum into frequency bands of critical bandwidth, masking effects are utilized to reduce the amount of calculation with hardly audible consequences. By applying the "Precedence Fade", the precedence effect is used to manipulate the perceived source position and improve the reproduction of the initial transients of notes. Based on Auditory Scene Analysis principles, "Fading Based Panning" creates precise phantom source positions between the actual loudspeaker positions. Physical measurements, simulations and listening tests provide evidence for the introduced methods and reveal their precision. Furthermore, results of the listening tests show that the perceived spaciousness of an instrumental sound does not necessarily go along with distinctness of localization. The introduced methods are compatible with conventional multi-channel audio systems as well as with other wave field synthesis applications.
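The "solve a linear equation system at discrete listening positions" idea can be sketched per frequency bin. The minimum-norm least-squares solution below is in the spirit of the Minimum Energy Method's "softest" driving signals, using an idealized monopole loudspeaker model; the geometry and names are illustrative, not the thesis's actual implementation.

```python
import numpy as np

def driving_signals(spk, pts, p_target, k):
    """Per-frequency loudspeaker driving signals reproducing a target pressure
    at discrete listening positions by solving G s = p. The pseudo-inverse
    returns the least-squares solution of minimum norm, i.e. the 'softest'
    driving signals, in the spirit of the Minimum Energy Method."""
    r = np.linalg.norm(pts[:, None, :] - spk[None, :, :], axis=-1)
    G = np.exp(-1j * k * r) / (4 * np.pi * r)  # monopole transfer matrix
    return np.linalg.pinv(G) @ p_target, G

def point_source(src, pts, k):
    """Target pressure of a virtual monopole at the listening positions."""
    r = np.linalg.norm(pts - src, axis=-1)
    return np.exp(-1j * k * r) / (4 * np.pi * r)
```

With more loudspeakers than listening positions the system is underdetermined, so the field is matched exactly at the control points; how well it is approximated in between is precisely the trade-off the abstract discusses.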
Audio for Virtual, Augmented and Mixed Realities: Proceedings of ICSA 2019 ; 5th International Conference on Spatial Audio ; September 26th to 28th, 2019, Ilmenau, Germany
The ICSA 2019 focuses on bringing together, across disciplines, developers, scientists, users, and content creators of and for spatial audio systems and services. A special focus is on audio for so-called virtual, augmented, and mixed realities.
The fields of ICSA 2019 are:
- Development and scientific investigation of technical systems and services for spatial audio recording, processing and reproduction
- Creation of content for reproduction via spatial audio systems and services
- Use and application of spatial audio systems and content presentation services
- Media impact of content and spatial audio systems and services from the point of view of media science
The ICSA 2019 is organized by VDT and TU Ilmenau with support of the Fraunhofer Institute for Digital Media Technology IDMT.