9 research outputs found

    GPU-Based One-Dimensional Convolution for Real-Time Spatial Sound Generation

    Get PDF
    Incorporating spatialized (3D) sound cues in dynamic, interactive videogames and immersive virtual environment applications is beneficial for a number of reasons, ultimately increasing presence and immersion. Despite these benefits, spatial sound cues are often overlooked in videogames and virtual environments, where emphasis is typically placed on visual cues. Fundamental to the generation of spatial sound is the one-dimensional convolution operation, which is computationally expensive and does not lend itself to such real-time, dynamic applications. Driven by the gaming industry and the great emphasis placed on the visual sense, consumer computer graphics hardware, and the graphics processing unit (GPU) in particular, has advanced greatly in recent years, even outperforming the computational capacity of CPUs. This has allowed for real-time, interactive, realistic graphics-based applications on typical consumer-level PCs. Given the widespread availability of computer graphics hardware and the similarities between the fields of spatial audio and image synthesis, here we describe the development of a GPU-based one-dimensional convolution algorithm whose efficiency is superior to the conventional CPU-based convolution method. The primary purpose of the developed GPU-based convolution method is the computationally efficient generation of real-time spatial audio for dynamic and interactive videogames and virtual environments.
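    The abstract does not reproduce the paper's GPU implementation, so as a rough illustration of the operation being accelerated, the following is a minimal NumPy sketch (all names hypothetical) contrasting direct time-domain convolution, which costs O(N·M), with FFT-based convolution, which costs O(N log N) and is the formulation that maps naturally onto parallel hardware such as GPUs.

```python
# Minimal sketch (not the paper's code): the 1-D convolution at the heart of
# spatial sound rendering -- filtering a dry mono signal with a head-related
# impulse response (HRIR) -- in its direct and FFT-based forms.
import numpy as np

def convolve_direct(signal, hrir):
    """Time-domain convolution, O(N*M): the costly baseline."""
    return np.convolve(signal, hrir)

def convolve_fft(signal, hrir):
    """Frequency-domain convolution, O(N log N): the form that parallelizes well."""
    n = len(signal) + len(hrir) - 1        # full linear-convolution length
    return np.fft.irfft(np.fft.rfft(signal, n) * np.fft.rfft(hrir, n), n)

# Hypothetical test data: 1 s of noise at 44.1 kHz and a 512-tap HRIR.
rng = np.random.default_rng(0)
dry, hrir = rng.standard_normal(44100), rng.standard_normal(512)
assert np.allclose(convolve_direct(dry, hrir), convolve_fft(dry, hrir))
```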

    Binaural Spatialization for 3D immersive audio communication in a virtual world

    Get PDF
    Realistic 3D audio can greatly enhance the sense of presence in a virtual environment. We introduce a framework for capturing, transmitting, and rendering 3D audio alongside other bandwidth-hungry streams in a 3D tele-immersion-based virtual environment. This framework presents an efficient implementation of 3D binaural spatialization based on the positions of the current objects in the scene, including animated avatars and on-the-fly reconstructed humans. We present a general overview of the framework, how audio is integrated into the system, and how it can exploit object positions and room geometry to render realistic reverberation using head-related transfer functions. The network streaming modules used to achieve lip synchronization, high-quality audio frame reception, and accurate localization for binaural rendering are also presented. We highlight how large computational and networking challenges can be addressed efficiently. This represents a first step towards adequate networking support for binaural 3D audio, useful for telepresence. The subsystem has been successfully integrated with a larger 3D immersive system with state-of-the-art capturing and rendering modules for visual data.
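    As a sketch of the binaural rendering step described above (the paper's actual dataset and API are not given in the abstract, so the HRIR table below is a hypothetical stand-in), spatializing a mono source amounts to convolving it with the left- and right-ear head-related impulse responses for its direction:

```python
# Minimal sketch, assuming a lookup table of measured HRIR pairs is available.
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Spatialize a mono signal by convolving it with the HRIR pair for the
    source's direction relative to the listener; returns an (n, 2) stereo array."""
    return np.stack([np.convolve(mono, hrir_left),
                     np.convolve(mono, hrir_right)], axis=1)

# Hypothetical 128-tap HRIRs for one direction; a real system would select or
# interpolate measured responses by azimuth/elevation as objects move.
rng = np.random.default_rng(1)
hrir_db = {(30, 0): (rng.standard_normal(128), rng.standard_normal(128))}
hl, hr = hrir_db[(30, 0)]                      # azimuth 30 deg, elevation 0 deg
stereo = render_binaural(rng.standard_normal(22050), hl, hr)  # play over headphones
```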

    Real-time massive convolution for audio applications on GPU

    Full text link
    [EN] Massive convolution is the basic operation in multichannel acoustic signal processing, a field that has undergone major development in recent years. One reason is the increase in the number of sound sources used in the playback applications available to users; another is the growing need to incorporate new effects and improve the listening experience. Massive convolution requires high computing capacity, and GPUs offer the possibility of parallelizing these operations, allowing the result to be obtained in much shorter time while freeing up CPU resources. One important aspect lies in the possibility of overlapping the transfer of data from CPU to GPU (and vice versa) with the computation, in order to support real-time applications. A synthesis of 3D sound scenes could thus be achieved within a simple peer-to-peer music-streaming environment, using an ordinary GPU in your computer while its CPU is used for other tasks. Nowadays such effects are produced in theatres or funfairs at very high cost, requiring a large quantity of resources. Our work therefore focuses on two main points: describing an efficient massive convolution implementation, and incorporating this task into real-time multichannel-sound applications. © 2011 Springer Science+Business Media, LLC. This work was partially supported by the Spanish Ministerio de Ciencia e Innovación (projects TIN2008-06570-C04-02 and TEC2009-13741), Universidad Politécnica de Valencia through PAID-05-09, and Generalitat Valenciana through project PROMETEO/2009/2013.
    Belloch Rodríguez, J. A.; Gonzalez, A.; Martínez Zaldívar, F. J.; Vidal Maciá, A. M. (2011). Real-time massive convolution for audio applications on GPU. Journal of Supercomputing 58(3):449-457. https://doi.org/10.1007/s11227-011-0610-8
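    The abstract does not include the implementation, but the core batched operation can be sketched on the CPU with NumPy (names hypothetical). The paper performs these transforms with CUFFT on the GPU and overlaps host-to-device transfers with computation; this sketch shows only the frequency-domain math, not the streaming machinery:

```python
# Minimal sketch of massive multichannel convolution in the frequency domain:
# C input channels, each filtered by its own FIR filter, transformed as one batch.
import numpy as np

def massive_convolve(inputs, filters):
    """inputs: (C, N) block of C channels; filters: (C, M), one FIR per channel.
    Returns the (C, N+M-1) filtered channels."""
    n = inputs.shape[1] + filters.shape[1] - 1   # linear-convolution length
    X = np.fft.rfft(inputs, n, axis=1)           # batched FFTs (one CUFFT call on GPU)
    H = np.fft.rfft(filters, n, axis=1)
    return np.fft.irfft(X * H, n, axis=1)        # pointwise multiply, batched inverse

rng = np.random.default_rng(2)
wet = massive_convolve(rng.standard_normal((64, 4096)),   # 64 channels of audio
                       rng.standard_normal((64, 1024)))   # 64 hypothetical filters
```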

    Between play and design: the emergence of hybrid-identity in single-player videogames

    Full text link
    To respect copyright, the electronic version of this thesis has been stripped of its visual and audiovisual materials. The complete version of the thesis was deposited with the Service de la gestion des documents et des archives of the Université de Montréal.
    Abstract: This dissertation examines the complex nature of identity in single-player videogames. It introduces the concept of hybrid-identity and proposes an analytical framework for deconstructing gameplay across genres in order to distinguish moments of identity emergence. While identity research commonly focuses on the player or the player-character (or both), hybrid-identity is a fluid, at times fleeting, form of identity that exists between the player and the player-character. Hybrid-identity develops during the networked process of videogame play and necessarily includes the player (experience, play context, etc.), the game environment (design, mechanics, etc.), and the mediating technology (computer, console, etc.) that facilitates gameplay. In order to delineate the aspects of gameplay that contribute to the emergence of different types of identity, a multifaceted framework was devised to isolate specific interactions: player/player-character, player-character/non-player character, player/game environment, player-character/game environment, and player/player. This framework was coupled with a secondary frame that examines the specificities of the individual player and the mediating technologies that facilitate gameplay. A systematic analysis of the gameplay and design elements of three different games, Mirror's Edge (DICE, 2008), Alone in the Dark (Eden Games, 2008), and Fable 2 (Lionhead Studios, 2008), was performed to illustrate the varying degrees of identity emergence in different game structures. The utility of the framework is demonstrated by comparing the three gameplay analyses and highlighting the elements that contribute to (and possibly hinder) identity development and, more specifically, the emergence of hybrid-identity. These three examples form the foundation for a more in-depth discussion of the definition, context, and process of hybrid-identity in videogame play.

    Virtual Guidance using Mixed Reality in Historical Places and Museums

    Get PDF
    Mixed Reality (MR) is one of the most disruptive technologies, with potential in many application domains, particularly the tourism and cultural heritage sector. Delivered through the latest, most capable headsets, MR introduces a new visual platform that can change people's visual experience. This thesis introduces a HoloLens-based mixed reality guidance system for museums and historical places. This new form of guidance considers the necessary and optimised functionalities, visual and audio guiding abilities, the essential roles of a guide, and the related social interactions, all in real time. A mixed reality guide, dubbed 'MuseumEye', was designed and developed for the Egyptian Museum in Cairo to overcome challenges currently facing the museum, e.g. a lack of guiding methods, limited information signposted on the exhibits, and low visitor engagement, resulting in less time spent in the museum than in other museums of similar capacity and significance. These problems motivated the researcher to conduct an exploratory study of the museum environment and guiding methods, interviewing 10 participants and observing 20 visitors. MuseumEye was built on a literature review of immersive systems in museums and on the findings of the exploratory study, which revealed visitor behaviours and the nature of guidance in the museum. The project increased engagement and the length of time visitors spend in museums, the Egyptian Museum in Cairo in particular, using mixed reality technology that provides visitors with additional visual and audio information and computer-generated imagery at various levels of detail and via different media. The research introduces guidelines for designing immersive-reality guide applications, covering spatial mapping, multimedia and UI design, and the design of interactions for exploratory purposes. The main contributions of this study include several theoretical contributions: 1) a new form of guidance that enhances the museum experience through the development of a mixed reality system; 2) a theoretical framework for assessing mixed reality guidance systems in terms of perceived usefulness, ease of use, enjoyment, interactivity, the roles of a guide, and the likelihood of future use; 3) the Ambient Information Visualisation concept for increasing visitor engagement by better presenting information and enhancing communication and interaction between visitors and exhibits; and a practical contribution in a mixed reality guidance system that reshapes the museum space, enhances visitors' experience, and significantly increases the length of time they spend in the museum. The evaluation comprised quantitative surveys (171 participants and 9 experts) and qualitative observation (51 participants) of MuseumEye tours. The results showed positive responses for all measured aspects and are compared with those of similar studies. The observations showed that visitors using MuseumEye spent four times as long in front of exhibited items as visitors without guides or with human guides. The quantitative results showed significant correlations between the measured constructs (perceived usefulness, ease of use, enjoyment, multimedia and UI, interactivity) and the likelihood of future use when the roles of a guide mediate the relations; moreover, 'perceived guidance' is the construct most influential on the likelihood of future use of MuseumEye. The results also revealed a high likelihood of future use, supporting the sustainability of adopting mixed reality technology in museums. This thesis shows the potential of mixed reality guides in the museum sector to reshape the museum space, offering rich possibilities for museums and heritage sites.

    Desarrollo de herramientas de procesado y visualización para audio 3D con auriculares [Development of processing and visualisation tools for 3D audio over headphones]

    Full text link
    Auralization, or "acoustic virtual reality", is a relatively new term. It integrates methods from physics and acoustic engineering with the theories of psychoacoustics and electroacoustic reproduction [1]. Auralization is, for audio, the analogue of "visualization" in 3D video, and this final degree project describes the process of making audible certain characteristics, effects, or signals of sound. Conventional stereo systems can position the sound image, or auditory event, only on the virtual arc joining the two loudspeakers. A direct extension of these systems were surround-sound systems, in which more than two loudspeakers are used to create a sound image that can move around the whole circle containing the loudspeakers. The newer 3D audio systems, in contrast, can position the sound image at any point in the three-dimensional space around the listener using only loudspeakers (or headphones). Auralization describes the generation, processing, and playback of surround audio at the height of the listener's ears. Common applications include solving an acoustics problem, improving a hall, simulating the frequency response of loudspeakers for headphone listening, and the acoustic design of a building, a car, or other products. Since the ultimate aim of 3D audio systems is to convince users or listeners that the sound is emitted from a position in the room where no source or loudspeaker physically exists, not only physical but also psychoacoustic parameters play a fundamental role in system design.
    The concept of achieving three-dimensional sound was first investigated in connection with the modelling of sound fields in rooms in 1929, when Spandöck processed signals derived from measurements in a scale model of a hall so that its acoustics could be auditioned in the laboratory. The idea was well received, but at the time there were no means of putting it into practice. Twenty years later, in 1949, magnetic tape was invented, and Spandöck finally presented his system based on ultrasonic signals, scale models of rooms, and a tape recorder operating at different speeds. This work established the basic elements of auralization: modelling of sound fields, signal processing, and sound reproduction. With the tremendous development of computers, the concepts of simulation and auralization were reinvented by Schroeder in the early 1960s, but it was not until the 1990s, when digital signal processing (DSP), processor speeds, and memory capacities became powerful enough to run simulations on personal computers, that the term auralization was officially introduced. Other fields of acoustics, particularly sound engineering and architectural acoustics, have since adopted the term for phenomena related to the spatialization of audio. Software and hardware have improved considerably since then, and today commercial software for room-acoustics simulation is considered incomplete without an auralization option via the PC sound card or a DA/AD audio interface.
    Much of the development of 3D audio systems has been based on a single listener positioned in an anechoic environment, which simplifies the analysis considerably but normally means that the system only works properly in such acoustically isolated surroundings. To avoid this restriction, the listening spaces are taken to be reverberant rooms, characterized by a room impulse response (RIR), or its frequency-domain counterpart the room transfer function (RTF), of long duration owing to the reverberation. At a sampling rate of 44.1 kHz (the de facto standard, used throughout this project), thousands of coefficients are needed for the FIR filters that faithfully model an RIR, which is why 3D audio systems demand a large computing capacity from the host. It becomes indispensable to apply Fourier theory, specifically FFT algorithms, to move the problem into the frequency domain and reduce the computational complexity. Although such long impulse responses can complicate real-time implementation, they make it possible to study the effect of an environment or room on system performance.
    3D audio systems filter monophonic audio signals through a matrix of digital filters that depends on the position of the sound source relative to the listener, that is, on the polar coordinates (θ, φ, r). In general, the solutions for these filters consist of two parts. The first is the matrix of head-related transfer functions (HRTFs), which contains the directional information the listener is meant to perceive; its coefficients are normally obtained from generalized, previously measured transfer functions, e.g. from a database. The second is the crosstalk-cancellation (XT-cancellation) network, which inverts the matrix of acoustic transfer functions between the loudspeakers and the listener's ears as realistically and efficiently as possible. Because HRTFs vary considerably from one person to another, depending on physical build and on the unique geometric structure of each human ear, computing the filters from generalized HRTFs degrades the perceived sound image.
    This project sets out to describe the state of the art of such systems in depth and to build a 3D audio system of this kind using Matlab® R2014b. RIRs are computed with a dedicated function, and the HRIRs are obtained from databases, implemented in four ways. The first is a simple mathematical model of an HRTF. The next two are HRTF databases, one produced at the MIT Media Lab [1] in the United States and the other at Peking University (PKU) in China; the PKU set has the advantage of also depending on the source-receiver distance, and both include HRTFs for the left (L) and right (R) ears. The number of samples and the sampling rate of each HRTF are fixed at 512 samples and 44.1 kHz respectively, so each function corresponds to a finite impulse response (FIR filter) with 512 coefficients, or taps. The fourth way in which HRTFs were implemented in this project was by interpolating the PKU database HRTFs in the three coordinates (θ, φ, r).
    If the auralization system convolves a sound with a BRIR corresponding, for example, to a reverberant environment with a reverberation time of approximately 2 seconds, each BRIR will have roughly 23,000 coefficients at 44.1 kHz. Efficient convolution methods, powerful processors, and schemes for interpolation and for extracting binaural features are therefore needed to reduce the volume of data as far as possible; a high-quality real-time auralization system remains a genuine challenge for the technology available today. The solution is to find new theories and approximations for acoustic environment simulation and auralization that balance accuracy against the computation time required to obtain the desired 3D effect. In the 3D audio software developed here, auralization of the original audio is achieved by slicing the signal into blocks and letting the listener define a trajectory in space for the source to trace. Each audio block (corresponding to one point on the trajectory) is convolved with a binaural room impulse response (BRIR), obtained by convolving the HRIR with the RIR. The processed blocks are then overlapped and summed using the overlap-add (OLA) algorithm, yielding two signals, one for each ear, which should be reproduced over headphones for the best experience.
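    Under the assumptions stated in the abstract, the block-based auralization loop can be sketched as follows (the BRIRs below are hypothetical noise; the actual project loads measured HRIRs and computed RIRs in Matlab). Each block of the dry signal is convolved with the two-ear BRIR for the current trajectory point, and the wet blocks are recombined by overlap-add:

```python
# Minimal overlap-add (OLA) auralization sketch; BRIR = HRIR convolved with RIR.
import numpy as np
from scipy.signal import fftconvolve

def auralize(dry, brirs, block_len):
    """dry: mono signal; brirs: one (M, 2) BRIR per block along the trajectory.
    Returns an (n, 2) binaural signal for headphone playback."""
    M = brirs[0].shape[0]
    out = np.zeros((len(dry) + M - 1, 2))
    for i, brir in enumerate(brirs):
        block = dry[i * block_len:(i + 1) * block_len]
        if block.size == 0:
            break
        wet = fftconvolve(block[:, None], brir, axes=0)     # convolve each ear
        out[i * block_len:i * block_len + len(wet)] += wet  # overlap-add the tails
    return out

rng = np.random.default_rng(3)
dry = rng.standard_normal(44100)                        # 1 s at 44.1 kHz
trajectory = [rng.standard_normal((512, 2)) for _ in range(11)]  # one BRIR per block
binaural = auralize(dry, trajectory, block_len=4096)    # two channels, one per ear
```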

    Video Game Acoustics: Perception-Based Sound Design for Interactive Virtual Spaces

    Get PDF
    Video game acoustics are the various aspects of sound physics that can be represented in a video game, together with a player's perception and interpretation of those sound physics. At its core, the research aims to identify the many functions of, and considerations around, acoustics in interactive virtual spaces, while also building a theoretical foundation for video game acoustics by gathering relevant research from a wide variety of disciplines into a single video game context. The writing also serves as an informative resource for video game sound designers and is primarily written for that audience. A review of the literature finds research relevant to video game acoustics across many disciplines, but none that brings it all together and fully explores acoustics in a video game context. Brief discussions of the topic occur sporadically throughout various fields, but few have any detailed focus and even fewer take video game sound designers as their intended audience. This scattering and dilution of relevant information validates the need for its distillation into a dedicated discussion. The thesis addresses this gap in the literature and, in doing so, uncovers aspects of video game acoustics that have not previously been given adequate attention. It accomplishes its aims by combining an interdisciplinary background with an emphasis on simplification to suit the creative field of game sound design. A theoretical foundation is built from several disciplines, including acoustics, auditory perception, acoustic simulation, sound theory, spatial presence, film sound, and of course game sound. A twofold physics/perception approach is used to analyse video game acoustics: the strengths and weaknesses of human sound perception help to identify the aspects of sound physics that are important to present to a player, as well as those that may be ignored for efficiency reasons. The thesis begins by revealing the many considerations and implications of incorporating acoustics into a video game, followed by an exploration of the perceptual functions of acoustics in virtual spaces. Several conceptual frameworks are then offered to address problems identified in the preceding sections. By the end of the thesis it is shown that the main purpose of video game acoustics is to provide the player with a natural experience of sound. People working in the video game industry may use this research to cultivate an understanding of how humans can interact with video games through sound physics, and why it is important to improve the quality of this interaction. Thesis (Ph.D.) -- University of Adelaide, Elder Conservatorium of Music, 202