8 research outputs found

    IS 485 - 001: Machine Listening

    Parametrization, auralization, and authoring of room acoustics for virtual reality applications

    The primary goal of this work was to develop a means of representing the acoustic properties of an environment with a set of parameters related to spatial sound. These parameters are used to create virtual environments in which sounds are expected to be perceived by the user as if they were heard in a corresponding real space. The virtual world may consist of both visual and audio components. Ideally, in such an application the sound and the visual parts of the virtual scene are coherent with each other, which should improve the user's immersion in the virtual environment. The second aim was to verify the feasibility of the resulting sound environment parameter set in practice. A virtual acoustic modeling system was implemented in which any spatial sound scene defined with the developed parameters can be rendered audible in real time. In other words, the user can listen to the sound auralized according to the defined sound scene parameters. Thirdly, the authoring of such parametric sound scene representations was addressed. In this authoring framework, sound scenes and an associated visual scene can be created, then encoded and transmitted in real time to a remotely located renderer. The visual counterpart was created as part of the multimedia scene and acts simultaneously as a user interface for renderer-side interaction.

    A study on sound source apparent shape and wideness

    Proceedings of the 9th International Conference on Auditory Display (ICAD), Boston, MA, July 7-9, 2003. This work is intended as an initial investigation into the perception of the wideness and shape of sound sources. A method that employs multiple uncorrelated point sources is used to form "sound shapes". Several experiments were carried out in which, after some initial training, subjects were asked to identify the shapes that were being played. Results indicate that differences in vertical and horizontal source wideness are easily perceived, and that scenes using broad sound sources to represent normally large sound objects are selected 70% of the time over point source versions. Shape identification, however, was found to be more ambiguous, except for certain types of signals where results were above chance level. The work indicates that the shape and wideness of sound sources could be used effectively as extra cues in virtual auditory displays and could generally improve the realism of virtual 3D sound scenes. This work was performed as a Core Experiment within the MPEG Audio Subgroup, with the intention of possibly integrating source wideness into MPEG-4 AudioBIFS.
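
    The idea of forming a "sound shape" from multiple uncorrelated point sources, as described in the abstract above, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of independent white noise as source material, and the five-source horizontal line are all assumptions chosen for illustration.

```python
import numpy as np

def horizontal_line(n_sources, width_deg, elevation_deg=0.0):
    """Hypothetical helper: (azimuth, elevation) pairs spread evenly over a horizontal arc."""
    azimuth = np.linspace(-width_deg / 2, width_deg / 2, n_sources)
    elevation = np.full(n_sources, elevation_deg)
    return np.column_stack([azimuth, elevation])

def make_sound_shape(positions, n_samples, seed=0):
    """Hypothetical helper: one independent white-noise signal per point source.

    Independent noise sequences are uncorrelated by construction. Scaling by
    1/sqrt(N) keeps the total power roughly independent of the source count.
    """
    rng = np.random.default_rng(seed)
    signals = rng.standard_normal((len(positions), n_samples))
    return signals / np.sqrt(len(positions))

# A 60-degree-wide "line" made of five point sources, one second at 48 kHz.
positions = horizontal_line(5, 60.0)
signals = make_sound_shape(positions, 48000)

# Check that the sources really are (nearly) mutually uncorrelated.
corr = np.corrcoef(signals)
max_off_diag = np.max(np.abs(corr - np.eye(len(positions))))
print("largest cross-correlation between sources:", max_off_diag)
```

    Each row of `signals` would then be passed to a spatial renderer at the matching entry of `positions`. Changing the layout of `positions` (for example to a vertical line) changes the shape being presented.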

    Future spatial audio : Subjective evaluation of 3D surround systems

    Current surround systems are being developed to include height channels that provide the listener with a 3D listening experience. The impact that height channels will have on the listening experience, and on aspects of multichannel reproduction such as localisation and envelopment, is not well understood, nor is it known whether 3D surround systems introduce any new subjective attributes. In this research, subjective factors such as localisation and envelopment were therefore investigated, followed by a descriptive analysis. In terms of localisation, it was found that for sources panned in the median plane, localisation accuracy was not improved with higher order ambisonics. For sources in the frontal plane, however, higher order ambisonics improves localisation accuracy for elevated sound sources. It was also found that, in a simulation of a number of 2D and 3D surround systems using a decorrelated noise signal to simulate a diffuse soundfield, there was no improvement in envelopment with the addition of height. On the other hand, height was found to improve the perception of envelopment with 3D recorded sound scenes, although for an applause sample with properties similar to those of the decorrelated noise sample there was no significant difference between 2D and 3D systems. Five attribute scales emerged from the descriptive analysis. Significant differences between 2D and 3D systems were found on the attribute scale "size" for both ambisonics and VBAP rendered systems. In addition, 3D higher order ambisonics significantly enhances the perception of presence. A final principal component analysis found 2 factors that characterised the ambisonic rendered systems and 3 factors that characterised the VBAP rendered sound scenes. This suggests that the derived scales need to be used with a wider range of sound scenes in order to validate them fully.

    Cinematic Sound Scene Description and Rendering Control

    Spatial Audio Production with Analysis-Based Synthesis Methods (Analiz-temelli Sentez Yöntemleriyle Uzamsal Ses Üretimi)

    TÜBİTAK EEEAG Project, 01.03.2018. The aim of this project is to develop a terminal-agnostic, end-to-end object-based audio reproduction system. To that end, the work carried out in the project consisted of 1) the design and development of an open spherical microphone array, 2) the development of microphone array signal processing methods that allow sound objects to be recorded, 3) the development of a scene description metadata format, 4) the design and development of a sound scene editor, and 5) the development of flexible synthesis methods that can be used to reconstruct the intended sound field interactively. One of the first outcomes of the project is a 13-microphone open spherical microphone array that allows the measurement of acoustic intensity. This array was calibrated, tested, and used to record acoustic impulse responses for the algorithms developed in the subsequent stages of the project. In the field of microphone array signal processing, the emphasis was on the development of novel algorithms for direction-of-arrival (DOA) estimation and acoustic source separation, separately for open and rigid spherical microphone arrays. The methods developed for open spherical microphone arrays use acoustic intensity as a basis for DOA estimation and quaternion signal processing for source separation. The methods developed for rigid spherical microphone arrays operate in the spherical harmonic domain and use spatial entropy for DOA estimation and complex orthogonal matching pursuit (OMP) for acoustic source separation.
    A new metadata format for sound scene description, which extends SpatDIF, was developed alongside a visual tool for editing the metadata. Finally, a room acoustics simulator / artificial reverberator capable of running in real time was designed for reconstructing sound scenes. To make the simulator more realistic, models of coupled volumes and edge diffraction were integrated with SDN-type reverberators, with good results. Two journal articles and two conference papers were published during the project. In addition, a journal article was submitted in March 2018 and a conference paper was submitted in April 2018. Two lectures based on the project findings were given, one of which was an invited lecture abroad.
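
    The intensity-based DOA estimation mentioned in the abstract above rests on a simple relation: for a single plane wave, the time-averaged product of pressure and particle velocity (the active intensity) points along the direction of propagation, so the source lies in the opposite direction. The following is a minimal sketch of that relation only, not the project's algorithm. It assumes that pressure and the three velocity components are already available, whereas the project derives them from a 13-microphone array. The function names, the noise level, and the simulated plane wave are hypothetical.

```python
import numpy as np

RHO_C = 1.2 * 343.0  # approximate characteristic impedance of air (assumed value)

def doa_from_intensity(p, v):
    """Estimate a unit vector pointing toward the source.

    p: pressure samples, shape (n,)
    v: particle velocity samples, shape (n, 3)
    """
    intensity = np.mean(p[:, None] * v, axis=0)  # time-averaged active intensity
    return -intensity / np.linalg.norm(intensity)  # source is opposite to propagation

def simulate_plane_wave(azimuth_deg, elevation_deg, n=4096, noise=0.05, seed=0):
    """Hypothetical test signal: a noisy plane wave arriving from the given direction."""
    rng = np.random.default_rng(seed)
    az, el = np.radians([azimuth_deg, elevation_deg])
    u = np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])
    p_clean = rng.standard_normal(n)
    # For a plane wave arriving from u, the particle velocity is -p * u / (rho * c).
    v = -np.outer(p_clean, u) / RHO_C
    v += (noise / RHO_C) * rng.standard_normal((n, 3))
    p = p_clean + noise * rng.standard_normal(n)
    return p, v, u

p, v, u_true = simulate_plane_wave(40.0, 15.0)
u_est = doa_from_intensity(p, v)
error_deg = np.degrees(np.arccos(np.clip(u_est @ u_true, -1.0, 1.0)))
print("angular error in degrees:", error_deg)
```

    With several simultaneous sources or strong reverberation, a single broadband average like this would no longer point at one source. Practical methods typically work per time-frequency bin before combining estimates.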