183 research outputs found

    Investigation on the Phantom Image Elevation Effect

    Listening tests were carried out to evaluate how the phantom image elevation effect depends on the horizontal stereophonic base angle. Seven ecologically valid sound sources and four noise sources were tested. Subjects judged the perceived position of a phantom centre image created with seven loudspeaker base angles. In general, the perceived image rose from the front towards overhead as the loudspeaker base angle increased up to around 180°. This tendency depended on the spectral characteristics of the sound source. The perceptual results are explained from both physical and cognitive points of view.

    Evaluation of the Phantom Image Effect for Phantom Images

    This paper introduces the author’s recent research on the elevation effect perceived with horizontal phantom images. Early research in stereophony suggests that a phantom centre image produced by two loudspeakers placed symmetrically about the listener would be perceived in an elevated position, with its elevation angle increasing as the loudspeaker base angle increases. In particular, an image presented from loudspeakers placed at the listener’s sides would be perceived overhead. With 3D audio formats employing height and overhead channels in mind, this elevation effect is considered useful for creating a virtual overhead loudspeaker image, especially for sound effects, using only ear-level loudspeakers (e.g. in downmix scenarios). Another important psychoacoustic principle relevant to 3D audio formats is the so-called ‘pitch-height’ effect, which suggests that the higher the frequency of a sound, the higher its image will be perceived. However, past research on this topic only considered loudspeakers placed in the median plane. Against this background, several subjective experiments have been conducted on the elevation of horizontally oriented phantom images. This paper first presents a vertical localisation test conducted with frontal stereo loudspeakers using octave-band noise stimuli. The results not only confirm the elevation effect for broadband noise, but also show the existence of an elevation effect for middle frequency bands. The second experiment presented in this paper verifies that virtual overhead perception exists and depends on the loudspeaker base angle, but also shows that the effect depends heavily on the type of sound source.

    Backward Compatible Spatialized Teleconferencing based on Squeezed Recordings

    Commercial teleconferencing systems currently available, although offering sophisticated video of the remote participants, commonly employ only mono or stereo audio playback for the user. However, in teleconferencing applications where there are multiple participants at multiple sites, spatializing the audio reproduced at each site (using headphones or loudspeakers) to help listeners distinguish between participating speakers can significantly improve the meeting experience (Baldis, 2001; Evans et al., 2000; Ward & Elko, 1999; Kilgore et al., 2003; Wrigley et al., 2009; James & Hawksford, 2008). An example is Vocal Village (Kilgore et al., 2003), which uses online avatars to co-locate remote participants over the Internet in a virtual space with audio spatialized over headphones. This system adds speaker location cues to monaural speech to create a user-manipulable soundfield that matches the avatar’s position in the virtual space. Giving participants the freedom to manipulate the acoustic location of other participants in the rendered sound scene has been shown to improve multitasking performance (Wrigley et al., 2009). A system for multiparty teleconferencing first requires a stage for recording speech from multiple participants at each site. These signals then need to be compressed to allow for efficient transmission of the spatial speech. One approach is to use close-talking microphones (e.g. lapel microphones) to record each participant, and then encode each speech signal separately prior to transmission (James & Hawksford, 2008). 
Alternatively, for increased flexibility, a microphone array located at a central point on, say, a meeting table can be used to generate a multichannel recording of the meeting speech. A microphone array approach is adopted in this work. It allows the recordings to be processed to identify the relative spatial locations of the sources, and allows multichannel speech enhancement techniques to improve the quality of recordings in noisy environments. For efficient transmission of the recorded signals, the approach also requires a multichannel compression technique suited to spatially recorded speech signals.

    Future spatial audio : Subjective evaluation of 3D surround systems

    Current surround systems are being developed to include height channels to provide the listener with a 3D listening experience. The impact that height channels will have on the listening experience, and on aspects of multichannel reproduction such as localisation and envelopment, is not well understood, nor is it clear whether 3D surround systems introduce any new subjective attributes. Therefore, in this research, subjective factors such as localisation and envelopment were investigated, and descriptive analysis was then used. In terms of localisation, it was found that for sources panned in the median plane, localisation accuracy was not improved with higher order ambisonics. However, for sources in the frontal plane, higher order ambisonics improves localisation accuracy for elevated sound sources. It was also found that for a simulation of a number of 2D and 3D surround systems, using a decorrelated noise signal to simulate a diffuse soundfield, there was no improvement in envelopment with the addition of height. On the other hand, height was found to improve the perception of envelopment with 3D recorded sound scenes, although for an applause sample, which had similar properties to the decorrelated noise sample, there was no significant difference between 2D and 3D systems. Five attribute scales emerged from the descriptive analysis. Significant differences between 2D and 3D systems were found on the attribute scale ‘size’ for both ambisonics and VBAP rendered systems. In addition, 3D higher order ambisonics significantly enhances the perception of presence. A final principal component analysis found that two factors characterised the ambisonic rendered systems and three factors characterised the VBAP rendered sound scenes. This suggests that the derived scales need to be used with a wider range of sound scenes in order to fully validate them.

    Optimization and improvements in spatial sound reproduction systems through perceptual considerations

    The reproduction of the spatial properties of sound is an increasingly important concern in many emerging immersive applications. Whether in the reproduction of audiovisual content in home environments or cinemas, in immersive video conferencing systems, or in virtual or augmented reality systems, spatial sound is crucial for a realistic sense of immersion. Hearing, beyond the physics of sound, is a perceptual phenomenon influenced by cognitive processes. The objective of this thesis is to contribute new methods and knowledge to the optimization and simplification of spatial sound systems, from a perceptual approach to the hearing experience. The first part of this dissertation deals with some particular aspects of binaural spatial sound reproduction, such as listening with headphones and the customization of the Head Related Transfer Function (HRTF). A study has been carried out on the influence of headphones on the perception of spatial impression and quality, with particular attention to the effects of equalization and the resulting non-linear distortion. With regard to the individualization of the HRTF, a complete implementation of an HRTF measurement system is presented, and a new method for measuring HRTFs in non-anechoic conditions is introduced. In addition, two different and complementary experiments have been carried out, resulting in two tools that can be used in HRTF individualization processes: a parametric model of the HRTF magnitude and an Interaural Time Difference (ITD) scaling adjustment. In the second part, concerning loudspeaker reproduction, different techniques such as Wave-Field Synthesis (WFS) and amplitude panning have been evaluated. Perceptual experiments were used to study the capacity of these systems to produce a sensation of distance, and the spatial acuity with which sound sources can be perceived if they are spectrally split and reproduced in different positions. The contributions of this research are intended to make these technologies more accessible to the general public, given the demand for audiovisual experiences and devices with increasing immersion.
    Gutiérrez Parera, P. (2020). Optimization and improvements in spatial sound reproduction systems through perceptual considerations [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/142696

    A distributed approach to surround sound production

    The requirement for multi-channel surround sound in audio production applications is growing rapidly. Audio processing in these applications can be costly, particularly in multi-channel systems. A distributed approach is proposed for the development of a real-time spatialization system for surround sound music production, using Ambisonic surround sound methods. The latency in the system is analyzed, with a focus on the audio processing and network delays, in order to ascertain the feasibility of an enhanced, distributed real-time spatialization system.
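
The kind of latency budget described in this abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' analysis: the function names, the additive model (one buffer of delay plus processing time plus a fixed network delay per hop), and all numeric values are assumptions made for the example.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int) -> float:
    """Time needed to fill one audio buffer, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


def end_to_end_latency_ms(frames: int, sample_rate_hz: int,
                          processing_ms: float, network_one_way_ms: float,
                          hops: int = 1) -> float:
    """Hypothetical additive budget: buffering + processing + network transit.

    Assumes each network hop adds the same one-way delay; a real system
    would measure these terms rather than assume them.
    """
    return (buffer_latency_ms(frames, sample_rate_hz)
            + processing_ms
            + hops * network_one_way_ms)


if __name__ == "__main__":
    # Illustrative numbers only: 256-frame buffers at 48 kHz, 2 ms of
    # processing, and two network hops of 0.5 ms each.
    total = end_to_end_latency_ms(256, 48000, 2.0, 0.5, hops=2)
    print(f"estimated latency: {total:.2f} ms")
```

A budget like this makes it easy to see which term dominates: with small buffers the network hops matter most, while with large buffers the buffering delay does.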

    Spatial auditory display for acoustics and music collections

    PhD. This thesis explores how audio can be better incorporated into how people access information and does so by developing approaches for creating three-dimensional audio environments with low processing demands. This is done by investigating three research questions. Mobile applications have processor and memory requirements that restrict the number of concurrent static or moving sound sources that can be rendered with binaural audio. Is there a more efficient approach that is as perceptually accurate as the traditional method? This thesis concludes that virtual Ambisonics is an efficient and accurate means to render a binaural auditory display consisting of noise signals placed on the horizontal plane without head tracking. Virtual Ambisonics is then more efficient than convolution of HRTFs if more than two sound sources are concurrently rendered or if movement of the sources or head tracking is implemented. Complex acoustics models require significant amounts of memory and processing. If the memory and processor loads for a model are too large for a particular device, that model cannot be interactive in real-time. What steps can be taken to allow a complex room model to be interactive by using less memory and decreasing the computational load? This thesis presents a new reverberation model based on hybrid reverberation which uses a collection of B-format IRs. A new metric for determining the mixing time of a room is developed and interpolation between early reflections is investigated. Though hybrid reverberation typically uses a recursive filter such as a FDN for the late reverberation, an average late reverberation tail is instead synthesised for convolution reverberation. Commercial interfaces for music search and discovery use little aural information even though the information being sought is audio. How can audio be used in interfaces for music search and discovery? This thesis looks at 20 interfaces and determines that several themes emerge from past interfaces. These include using a two or three-dimensional space to explore a music collection, allowing concurrent playback of multiple sources, and tools such as auras to control how much information is presented. A new interface, the amblr, is developed because virtual two-dimensional spaces populated by music have been a common approach, but not yet a perfected one. The amblr is also interpreted as an art installation which was visited by approximately 1000 people over 5 days. The installation maps the virtual space created by the amblr to a physical space.
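
One reason an Ambisonic representation can scale well with the number of sources is that every source is mixed into the same fixed set of channels before any binaural filtering. The sketch below illustrates this for first-order horizontal B-format. It is not taken from the thesis: the function names are hypothetical, the W channel uses the traditional 1/√2 weighting, and azimuth is assumed to be measured counter-clockwise from straight ahead.

```python
import math


def encode_horizontal_bformat(samples, azimuth_deg):
    """Encode a mono signal into horizontal first-order B-format (W, X, Y).

    Assumed conventions: W weighted by 1/sqrt(2); azimuth 0 degrees is
    straight ahead and 90 degrees is to the listener's left.
    """
    theta = math.radians(azimuth_deg)
    w_gain = 1.0 / math.sqrt(2.0)
    x_gain = math.cos(theta)
    y_gain = math.sin(theta)
    return ([w_gain * s for s in samples],
            [x_gain * s for s in samples],
            [y_gain * s for s in samples])


def mix_sources(sources):
    """Sum several (samples, azimuth_deg) pairs into one B-format stream.

    All sources are assumed to have the same length. The output always has
    three channels, however many sources are mixed in.
    """
    n = len(sources[0][0])
    w, x, y = [0.0] * n, [0.0] * n, [0.0] * n
    for samples, azimuth_deg in sources:
        sw, sx, sy = encode_horizontal_bformat(samples, azimuth_deg)
        for i in range(n):
            w[i] += sw[i]
            x[i] += sx[i]
            y[i] += sy[i]
    return w, x, y


if __name__ == "__main__":
    # Two short test signals, one in front and one to the left.
    w, x, y = mix_sources([([1.0, 0.5], 0.0), ([0.2, 0.2], 90.0)])
    print(w, x, y)
```

Because the mixed stream has a fixed channel count, any subsequent decoding to virtual loudspeakers and HRTF filtering is done once per channel rather than once per source.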

    The creation of a binaural spatialization tool

    The main focus of the research presented within this thesis is, as the title suggests, binaural spatialization. Binaural technology and, especially, the binaural recording technique are not particularly recent. Nevertheless, the interest in this technology has lately become substantial due to the increase in the calculation power of personal computers, which started to allow the complete and accurate real-time simulation of three-dimensional sound-fields over headphones. The goals of this body of research have been determined in order to provide elements of novelty and of contribution to the state of the art in the field of binaural spatialization. A brief summary of these is found in the following list:
    • The development and implementation of a binaural spatialization technique with Distance Simulation, based on the individual simulation of the distance cues and Binaural Reverb, in turn based on the weighted mix between the signals convolved with the different HRIR and BRIR sets;
    • The development and implementation of a characterization process for modifying a BRIR set in order to simulate different environments with different characteristics in terms of frequency response and reverb time;
    • The creation of a real-time and offline binaural spatialization application, implementing the techniques cited in the previous points, and including a set of multichannel (and Ambisonics)-to-binaural conversion tools;
    • The performance of a perceptual evaluation stage to verify the effectiveness, realism, and quality of the techniques developed, and
    • The application and use of the developed tools within both scientific and artistic “case studies”.
    In the following chapters, sections, and subsections, the research performed between January 2006 and March 2010 will be described, outlining the different stages before, during, and after the development of the software platform, analysing the results of the perceptual evaluations and drawing conclusions that could, in the future, be considered the starting point for new and innovative research projects.
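
The "weighted mix between the signals convolved with the different HRIR and BRIR sets" mentioned in this abstract might look roughly like the sketch below for a single ear. This is a guess at the general idea, not the thesis's method: the function names, the 1/distance weighting, and the reference distance are all assumptions introduced for illustration.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def direct_weight(distance_m, reference_m=1.0):
    """Hypothetical weighting: the anechoic (HRIR) share falls off as 1/distance.

    Sources closer than the reference distance are treated as fully direct.
    """
    return reference_m / max(distance_m, reference_m)


def render_ear(signal, hrir, brir, distance_m):
    """Mix HRIR-convolved (dry) and BRIR-convolved (reverberant) signals."""
    dry = convolve(signal, hrir)
    wet = convolve(signal, brir)
    n = max(len(dry), len(wet))
    dry = dry + [0.0] * (n - len(dry))  # zero-pad to a common length
    wet = wet + [0.0] * (n - len(wet))
    g = direct_weight(distance_m)
    return [g * d + (1.0 - g) * w for d, w in zip(dry, wet)]


if __name__ == "__main__":
    # Toy impulse responses: a unit impulse for the HRIR and a single
    # delayed reflection for the BRIR.
    print(render_ear([1.0], [1.0], [0.0, 0.5], distance_m=2.0))
```

A full renderer would run this once per ear with the matching left and right impulse responses, and would typically use FFT-based convolution for impulse responses of realistic length.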