
    Sound Synthesis with Auditory Distortion Products

    This article describes methods of sound synthesis based on auditory distortion products, often called combination tones. In 1856, Helmholtz was the first to identify sum and difference tones as products of auditory distortion. Today this phenomenon is well studied in the context of otoacoustic emissions, and the “distortion” is understood as a product of what is termed the cochlear amplifier. These tones have had a rich history in the music of improvisers and drone artists. Until now, the use of distortion tones in technological music has largely been rudimentary and dependent on very high amplitudes in order for the distortion products to be heard by audiences. Discussed here are synthesis methods to render these tones more easily audible and lend them the dynamic properties of traditional acoustic sound, thus making auditory distortion a practical domain for sound synthesis. An adaptation of single-sideband synthesis is particularly effective for capturing the dynamic properties of audio inputs in real time. Also presented is an analytic solution for matching up to four harmonics of a target spectrum. Most interestingly, the spatial imagery produced by these techniques is very distinctive, and over loudspeakers the normal assumptions of spatial hearing do not apply. Audio examples are provided that illustrate the discussion
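    As a rough illustration of the combination tones mentioned above, the sketch below computes the frequencies at which distortion products would be expected for two primary tones. The function name and the choice of products are assumptions made for illustration; the cubic difference tone (2·f1 − f2) is not named in the abstract but is the product most often measured in otoacoustic-emission work. This is not the article's synthesis method.

```python
def distortion_products(f1: float, f2: float) -> dict[str, float]:
    """Return expected combination-tone frequencies (Hz) for primaries f1 < f2.

    Hypothetical helper for illustration only; it models frequencies,
    not amplitudes or audibility.
    """
    if not 0 < f1 < f2:
        raise ValueError("expected 0 < f1 < f2")
    return {
        "quadratic_difference": f2 - f1,   # difference tone
        "cubic_difference": 2 * f1 - f2,   # common DPOAE component
        "sum": f1 + f2,                    # sum tone
    }


# Example primaries chosen arbitrarily for the sketch.
products = distortion_products(1000.0, 1200.0)
```

    With primaries at 1000 Hz and 1200 Hz, the difference tone falls at 200 Hz and the cubic difference tone at 800 Hz, both below the primaries.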

    A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications

    Auditory models are commonly used as feature extractors for automatic speech-recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models can capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are computationally expensive and cannot be used in real-time applications. We present a hybrid approach where convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model for human cochlear mechanics, including level-dependent filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material and its performance and applicability were evaluated using (unseen) sound stimuli commonly employed in cochlear mechanics research. The CoNNear model accurately simulates human cochlear frequency selectivity and its dependence on sound intensity, an essential quality for robust speech intelligibility at negative speech-to-background-noise ratios. The CoNNear architecture is based on parallel and differentiable computations and has the power to achieve real-time human performance. These unique CoNNear features will enable the next generation of human-like machine-hearing applications

    Playing the Ear: Non-Linearities of the Inner Ear and their Creative Potential

    This thesis concerns the application of psychoacoustic phenomena relating to the non-linear nature of the inner ear as an electroacoustic compositional tool. The compositions included in this portfolio explore the validity of a variety of non-linear inner ear phenomena within composition by employing them as primary compositional devices. Psychoacoustics research into the non-linearities of the inner ear has proven that the inner ear has much more to offer the composer than has been previously considered. Reversing the role of the ear, from what Christopher Haworth describes as 'being a submissive receiver' to that of an active participant in the creative process, opens up an exciting level of opportunity for both the composer and listener. This research focuses on auditory distortion products and bandwidth phenomena, with reference to the author's own compositional material. While it is relatively common for composers to have explored various elements of psychoacoustics in their work, a project of this size, which explicitly explores such material, has not been carried out until now. The work of Maryanne Amacher, Alvin Lucier, Diana Deutsch, and others has highlighted the possibilities of employing psychoacoustic principles in music. This research takes a new approach by placing a direct focus on the benefits of the utilisation of these non-linear mechanisms of the inner ear for the composer, while also positing a number of new creative methodologies with respect to the non-linearities of the inner ear.

    Mouse Panx1 Is Dispensable for Hearing Acquisition and Auditory Function

    Panx1 forms plasma membrane channels in brain and several other organs, including the inner ear. Biophysical properties, activation mechanisms and modulators of Panx1 channels have been characterized in detail; however, the impact of Panx1 on auditory function is unclear due to conflicts in published results. To address this issue, hearing performance and cochlear function of the Panx1−/− mouse strain, the first with a reported global ablation of Panx1, were scrutinized. Male and female homozygous (Panx1−/−), hemizygous (Panx1+/−) and their wild type (WT) siblings (Panx1+/+) were used for this study. Successful ablation of Panx1 was confirmed by RT-PCR and Western immunoblotting in the cochlea and brain of Panx1−/− mice. Furthermore, a previously validated Panx1-selective antibody revealed strong immunoreactivity in WT but not in Panx1−/− cochleae. Hearing sensitivity, outer hair cell-based “cochlear amplifier” and cochlear nerve function, analyzed by auditory brainstem response (ABR) and distortion product otoacoustic emission (DPOAE) recordings, were normal in Panx1+/− and Panx1−/− mice. In addition, we determined that global deletion of Panx1 affects neither connexin expression nor gap-junction coupling in the developing organ of Corti. Finally, spontaneous intercellular Ca²⁺ signal (ICS) activity in organotypic cochlear cultures, which is key to postnatal development of the organ of Corti and essential for hearing acquisition, was not affected by Panx1 ablation. Therefore, our results provide strong evidence that, in mice, Panx1 is dispensable for hearing acquisition and auditory function.

    Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

    Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding

    Embedding Distance Information in Binaural Renderings of Far Field Recordings

    Traditional representations of sound fields based on spherical harmonics expansions do not include the sound source distance information. As multipole expansions can accurately encode the distance of a sound source, they can be used for accurate sound field reproduction. The binaural reproduction of multipole encodings, though, requires head-related transfer functions (HRTFs) with distance information. However, the inclusion of distance information on available data sets of HRTFs, using acoustic propagators, requires demanding regularization techniques. We alternatively propose a method to embed distance information in the spherical harmonics encodings of compact microphone array recordings. We call this method the Distance Editing Binaural Ambisonics (DEBA). DEBA is applied to the synthesis of binaural signals of arbitrary distances using only far-field HRTFs. We evaluated DEBA by synthesizing HRTFs for nearby sources from various samplings of far-field ones. Comparisons with numerically calculated HRTFs yielded mean spectral distortion values below 6 dB, and mean normalized spherical correlation values above 0.97
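    The abstract reports mean spectral distortion in dB but does not state how it is defined. The sketch below shows one common definition, the root-mean-square difference between two log-magnitude spectra across frequency bins; treating this as the measure used in the paper is an assumption, and the function name is hypothetical.

```python
import math


def spectral_distortion_db(reference, estimate):
    """RMS log-magnitude difference (dB) between two magnitude spectra.

    Assumed definition for illustration; the paper may use a different
    weighting or frequency range. Inputs are equal-length sequences of
    non-zero magnitudes, one value per frequency bin.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    squared_db = [
        (20.0 * math.log10(abs(r) / abs(e))) ** 2
        for r, e in zip(reference, estimate)
    ]
    return math.sqrt(sum(squared_db) / len(squared_db))


# Identical spectra give zero distortion under this definition.
sd_identical = spectral_distortion_db([1.0, 0.5, 0.25], [1.0, 0.5, 0.25])
```

    Under this definition, a uniform factor-of-ten magnitude error in every bin corresponds to 20 dB, which gives a sense of scale for the reported values below 6 dB.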