Distortions of Subjective Time Perception Within and Across Senses
Background: The ability to estimate the passage of time is of fundamental importance for perceptual and cognitive processes. One experience of time is the perception of duration, which is not isomorphic to physical duration and can be distorted by a number of factors. Yet, the critical features generating these perceptual shifts in subjective duration are not understood.
Methodology/Findings: We used prospective duration judgments within and across sensory modalities to examine the effect of stimulus predictability and feature change on the perception of duration. First, we found robust distortions of perceived duration in auditory, visual and auditory-visual presentations despite the predictability of the feature changes in the stimuli. For example, a looming disc embedded in a series of steady discs led to time dilation, whereas a steady disc embedded in a series of looming discs led to time compression. Second, we addressed whether visual (auditory) inputs could alter the perceived duration of auditory (visual) inputs. When participants were presented with incongruent audio-visual stimuli, the perceived duration of auditory events could be shortened or lengthened by the presence of conflicting visual information; by contrast, the perceived duration of visual events was seldom distorted by the presence of auditory information, and visual events were never perceived as shorter than their actual durations.
Conclusions/Significance: These results support the existence of multisensory interactions in the perception of duration and, importantly, suggest that vision can modify auditory temporal perception in a pure timing task. Because distortions in subjective duration cannot be accounted for by the unpredictability of an auditory, visual or auditory-visual event, we propose that it is the intrinsic features of the stimulus that critically affect subjective time distortions.
Categorisation of distortion profiles in relation to audio quality
Since digital audio is encoded as discrete samples of the audio waveform, much can be said about a recording from the statistical properties of these samples. In this paper, a dataset of CD audio samples is analysed; the probability mass function of each audio clip informs a feature set which describes attributes of the musical recording related to loudness, dynamics and distortion. This allows musical recordings to be classified according to their "distortion character", a concept which describes the nature of amplitude distortion in mastered audio. A subjective test was designed in which such recordings were rated according to the perception of their audio quality. It is shown that participants can discern between three different distortion characters; ratings of audio quality were significantly different (F(1, 2) = 5.72, p < 0.001, η² = 0.008), as were the words used to describe the attributes on which quality was assessed (χ²(8, N = 547) = 33.28, p < 0.001). This expands upon previous work showing links between the effects of dynamic range compression and audio quality in musical recordings, by highlighting perceptual differences.
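The sample-statistics idea above can be sketched in a few lines: histogram a clip's sample values into a probability mass function and read off simple descriptors of loudness, dynamics and distortion. The function name, bin count and feature set below are illustrative assumptions, not the paper's actual feature set; hard clipping shows up as probability mass piling into a few amplitude bins and as a reduced crest factor.

```python
import numpy as np

def pmf_features(samples, n_bins=4096):
    """Derive rough loudness/dynamics/distortion descriptors from a
    clip's sample probability mass function (illustrative sketch)."""
    x = np.asarray(samples, dtype=float)
    hist, _ = np.histogram(x, bins=n_bins, range=(-1.0, 1.0))
    pmf = hist / hist.sum()
    rms = np.sqrt(np.mean(x ** 2))       # loudness proxy
    crest = np.max(np.abs(x)) / rms      # dynamics proxy (crest factor)
    peak_mass = pmf.max()                # clipping piles mass into few bins
    return {"rms": rms, "crest": crest, "peak_mass": peak_mass}

# A clean sine versus a heavily clipped copy of the same material
t = np.linspace(0, 1, 44100, endpoint=False)
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
clipped = np.clip(3.0 * clean, -0.9, 0.9)
f_clean, f_clip = pmf_features(clean), pmf_features(clipped)
```

The clipped version shows higher RMS, a lower crest factor, and a far larger maximum PMF value, which is the kind of signature a "distortion character" classifier could exploit.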
Effects of sound-induced hearing loss and hearing aids on the perception of music
This is the final version of the article. It first appeared from the Audio Engineering Society via https://doi.org/10.17743/jaes.2015.0081. Exposure to high-level music produces several physiological changes in the auditory system that lead to a variety of perceptual effects. Damage to the outer hair cells within the cochlea leads to a loss of sensitivity to weak sounds, loudness recruitment (a more rapid than normal growth of loudness with increasing sound level) and reduced frequency selectivity. Damage to inner hair cells and/or synapses leads to degeneration of neurons in the auditory nerve and to a reduced flow of information to the brain. This leads to poorer auditory discrimination and may contribute to reduced sensitivity to the temporal fine structure of sounds and to poor pitch perception. Hearing aids compensate for the effects of threshold elevation and loudness recruitment via multi-channel amplitude compression, but they do not compensate for reduced frequency selectivity or loss of inner hair cells/synapses/neurons. Multi-channel compression can impair some aspects of the perception of music, such as the ability to hear out one instrument or voice from a mixture. The limited frequency range and irregular frequency response of most hearing aids is associated with poor sound quality for music. Finally, systems for reducing acoustic feedback can have undesirable side effects when listening to music. This work was supported by the Medical Research Council (UK, grant number G0701870), Action on Hearing Loss, Phonak, and Starkey.
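The multi-channel amplitude compression mentioned above can be sketched as follows: split the spectrum into bands, measure each band's level, and apply a static compressive gain above a threshold. The band edges, threshold and ratio are hypothetical round numbers for illustration; clinical fittings use many more bands, time-varying envelopes and prescribed gain rules.

```python
import numpy as np

def multiband_compress(x, fs, edges=(0, 500, 2000, 8000),
                       ratio=2.0, t_db=-30.0):
    """Toy multi-channel amplitude compression: FFT band split, static
    compressive gain per band above threshold, then resynthesis."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.fft.irfft(np.where((freqs >= lo) & (freqs < hi), X, 0),
                            n=len(x))
        level_db = 20 * np.log10(np.sqrt(np.mean(band ** 2)) + 1e-12)
        if level_db > t_db:
            # Above threshold, output grows 1/ratio dB per input dB
            gain_db = (t_db + (level_db - t_db) / ratio) - level_db
        else:
            gain_db = 0.0
        out += band * 10 ** (gain_db / 20)
    return out

fs = 16000
t = np.arange(fs) / fs
loud = 0.5 * np.sin(2 * np.pi * 1000 * t)   # well above threshold
out = multiband_compress(loud, fs)          # attenuated by compression
```

A loud band is turned down while sub-threshold material passes unchanged, which is how compression restores audibility of weak sounds without making intense sounds uncomfortably loud for a recruiting ear.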
Scalable and perceptual audio compression
This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform-matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal's psychoacoustic parameters and those of the synthesized signal is measured. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner.
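The original-versus-synthesized comparison described above can be illustrated with crude stand-ins: RMS for loudness and spectral centroid for sharpness. These are not the Zwicker-style loudness, sharpness, tonality and roughness models the thesis uses; the function and its error metrics are assumptions that only sketch the comparison idea.

```python
import numpy as np

def perceptual_similarity(orig, coded, fs):
    """Compare rough psychoacoustic proxies of an original signal and a
    coded/synthesized version: RMS as a loudness stand-in, spectral
    centroid as a sharpness stand-in (illustrative only)."""
    def features(x):
        mag = np.abs(np.fft.rfft(x))
        freqs = np.fft.rfftfreq(len(x), 1 / fs)
        centroid = np.sum(freqs * mag) / (np.sum(mag) + 1e-12)
        return np.sqrt(np.mean(x ** 2)), centroid
    (l0, c0), (l1, c1) = features(orig), features(coded)
    return {"loudness_err": abs(l1 - l0) / (l0 + 1e-12),
            "sharpness_err": abs(c1 - c0) / (c0 + 1e-12)}

fs = 16000
t = np.arange(fs) / fs
orig = 0.3 * np.sin(2 * np.pi * 500 * t) + 0.3 * np.sin(2 * np.pi * 4000 * t)
coded = 0.3 * np.sin(2 * np.pi * 500 * t)   # a "coder" that dropped the high band
errs = perceptual_similarity(orig, coded, fs)
```

Dropping high-frequency content leaves the loudness proxy roughly intact but shifts the sharpness proxy sharply, the kind of perceptual distortion a waveform-matching metric alone would under-report.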
Intelligent Tools for Multitrack Frequency and Dynamics Processing
PhD thesis. This research explores the possibility of reproducing the mixing decisions of a skilled audio engineer with minimal human interaction so as to improve the overall listening experience of musical mixtures, i.e., intelligent mixing. By producing a balanced mix automatically, musicians and mixing engineers can focus on their creativity while the productivity of music production is increased. We focus on two essential aspects of such a system: frequency and dynamics. This thesis presents an intelligent strategy for multitrack frequency and dynamics processing that exploits the interdependence of input audio features, incorporates best practices in audio engineering, and is driven by perceptual models and subjective criteria. The intelligent frequency processing research begins with a spectral characteristic analysis of commercial recordings, where we discover a consistent leaning towards a target equalization spectrum. A novel approach for automatically equalizing audio signals towards the observed target spectrum is then described and evaluated. We proceed to dynamics processing and introduce an intelligent multitrack dynamic range compression algorithm, in which various audio features are proposed and validated to better describe the transient nature and spectral content of the signals. An experiment to investigate human preferences in dynamics processing is described to inform our choices of parameter automation. To provide a perceptual basis for the intelligent system, we evaluate existing perceptual models and propose several masking metrics to quantify the masking behaviour within the multitrack mixture. Ultimately, we integrate the preceding research on auditory masking, frequency and dynamics processing into one intelligent system of mix optimization that replicates the iterative process of human mixing. Within the system, we explore the relationship between equalization and dynamics processing, and propose a general frequency and dynamics processing framework. Various implementations of the intelligent system are explored and evaluated objectively and subjectively through listening experiments. China Scholarship Council
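The automatic equalization toward a target spectrum described above can be sketched as a frequency-domain gain stage: measure each band's level and apply the bounded gain that moves it to the band's target. The band layout, flat target and gain limit below are hypothetical; the thesis derives its target spectrum from analysis of commercial recordings.

```python
import numpy as np

def match_target_spectrum(x, fs, target_db, max_gain_db=12.0):
    """Equalize a signal toward a per-band target spectrum by measuring
    each band's level and applying a bounded corrective gain (sketch)."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    edges = np.linspace(0, fs / 2, len(target_db) + 1)
    gains = np.ones(len(X))
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        idx = (freqs >= lo) & (freqs < hi)
        level = 20 * np.log10(np.sqrt(np.mean(np.abs(X[idx]) ** 2)) + 1e-12)
        g_db = np.clip(target_db[i] - level, -max_gain_db, max_gain_db)
        gains[idx] = 10 ** (g_db / 20)
    return np.fft.irfft(X * gains, n=len(x))

fs = 16000
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 1500 * t)   # all energy in the 1-2 kHz band
target = [40.0] * 8                      # flat hypothetical target (dB)
y = match_target_spectrum(x, fs, target)
```

After processing, the occupied band sits at its target level; a practical equalizer would smooth the gains across band edges to avoid audible discontinuities.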
Tracking cortical entrainment in neural activity: auditory processes in human temporal cortex.
A primary objective for cognitive neuroscience is to identify how features of the sensory environment are encoded in neural activity. Current auditory models of loudness perception can be used to make detailed predictions about the neural activity of the cortex as an individual listens to speech. We used two such models (loudness-sones and loudness-phons), varying in their psychophysiological realism, to predict the instantaneous loudness contours produced by 480 isolated words. These two sets of 480 contours were used to search for electrophysiological evidence of loudness processing in whole-brain recordings of electro- and magneto-encephalographic (EMEG) activity, recorded while subjects listened to the words. The technique identified a bilateral sequence of loudness processes, predicted by the more realistic loudness-sones model, that begin in auditory cortex at ~80 ms and subsequently reappear, tracking progressively down the superior temporal sulcus (STS) at lags from 230 to 330 ms. The technique was then extended to search for regions sensitive to the fundamental frequency (F0) of the voiced parts of the speech. It identified a bilateral F0 process in auditory cortex at a lag of ~90 ms, which was not followed by activity in STS. The results suggest that loudness information is being used to guide the analysis of the speech stream as it proceeds beyond auditory cortex down STS toward the temporal pole. This work was supported by an EPSRC grant to William D. Marslen-Wilson and Paula Buttery (EP/F030061/1), an ERC Advanced Grant (Neurolex) to William D. Marslen-Wilson, and by MRC Cognition and Brain Sciences Unit (CBU) funding to William D. Marslen-Wilson (U.1055.04.002.00001.01). Computing resources were provided by the MRC-CBU and the University of Cambridge High Performance Computing Service (http://www.hpc.cam.ac.uk/). Andrew Liu and Phil Woodland helped with the HTK speech recogniser and Russell Thompson with the Matlab code. We thank Asaf Bachrach, Cai Wingfield, Isma Zulfiqar, Alex Woolgar, Jonathan Peelle, Li Su, Caroline Whiting, Olaf Hauk, Matt Davis, Niko Kriegeskorte, Paul Wright, Lorraine Tyler, Rhodri Cusack, Brian Moore, Brian Glasberg, Rik Henson, Howard Bowman, Hideki Kawahara, and Matti Stenroos for invaluable support and suggestions. This is the final published version. The article was originally published in Frontiers in Computational Neuroscience, 10 February 2015 | doi: 10.3389/fncom.2015.0000
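The lagged search described above (matching a model-predicted loudness contour against neural recordings at a range of latencies) can be illustrated with a toy version: slide the predictor along a recorded channel and report the best-correlating lag. The data here are synthetic and the function is an assumption; the actual study uses spatiotemporal searchlight statistics over whole-brain EMEG source estimates.

```python
import numpy as np

def best_lag(predictor, response, max_lag):
    """Return the lag (in samples) at which a stimulus-derived predictor
    (e.g. an instantaneous-loudness contour) best correlates with a
    recorded channel, plus that correlation (toy lagged search)."""
    n = len(response) - max_lag              # fixed comparison window
    p = predictor[:n]
    scores = [np.corrcoef(p, response[lag:lag + n])[0, 1]
              for lag in range(max_lag + 1)]
    i = int(np.argmax(scores))
    return i, scores[i]

rng = np.random.default_rng(0)
loudness = rng.standard_normal(2000)         # stand-in loudness contour
neural = np.zeros(2100)
neural[80:2080] = loudness                   # response delayed by 80 samples
neural += 0.5 * rng.standard_normal(2100)    # measurement noise
lag, r = best_lag(loudness, neural, max_lag=200)
```

Despite heavy noise, the correlation peaks at the true 80-sample delay, mirroring how the paper localizes loudness processes at ~80 ms and at later STS lags.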
Applications of loudness models in audio engineering
This thesis investigates the application of perceptual models to areas of audio engineering, with a particular focus on music production. The goal was to establish efficient and practical tools for the measurement and control of the perceived loudness of musical sounds. Two types of loudness model were investigated: the single-band model and the multiband excitation pattern (EP) model. The heuristic single-band devices were designed to be simple but sufficiently effective for real-world application, whereas the multiband procedures were developed to give a reasonable account of a large body of psychoacoustic findings according to a functional model of the peripheral hearing system. The research addresses the extent to which current models of loudness generalise to musical instruments, and whether they can be successfully employed in music applications. The domain-specific disparity between the two types of model was first tackled by reducing the computational load of state-of-the-art EP models to allow for fast but low-error auditory signal processing. Two elaborate hearing models were analysed and optimised using musical instruments and speech as test stimuli. It was shown that, after significantly reducing the complexity of both procedures, estimates of global loudness, such as peak loudness, as well as the intermediate auditory representations can be preserved with high accuracy. Based on the optimisations, two real-time applications were developed: a binaural loudness meter and an automatic multitrack mixer. This second system was designed to work independently of the loudness measurement procedure, and therefore supports both linear and nonlinear models. This allowed a single mixing device to be assessed using different loudness metrics, and this was demonstrated by evaluating three configurations through subjective assessment.
Unexpectedly, when asked to rate both the overall quality of a mix and the degree to which instruments were equally loud, listeners preferred mixes generated using heuristic single-band models over those produced using a multiband procedure. A series of more systematic listening tests were conducted to further investigate this finding. Subjective loudness matches of musical instruments commonly found in western popular music were collected to evaluate the performance of five published models. The results were in accord with the application-based assessment, namely that current EP procedures do not generalise well when estimating the relative loudness of musical sounds which have marked differences in spectral content. Model-specific issues were identified relating to the calculation of spectral loudness summation (SLS) and the method used to determine the global-loudness percept of time-varying musical sounds; associated refinements were proposed. It was shown that a new multiband loudness model with a heuristic loudness transformation yields superior performance over existing methods. This supports the idea that a revised model of SLS is needed, and therefore that modification to this stage in existing psychoacoustic procedures is an essential step towards the goal of achieving real-world deployment.
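The automatic multitrack mixer's core operation (bringing every track to a common perceived-loudness level, regardless of which loudness model supplies the measurement) can be sketched with a model-agnostic gain stage. Plain RMS stands in here for the loudness metric; the thesis plugs in single-band heuristics or EP models instead, and the function and target level below are illustrative assumptions.

```python
import numpy as np

def balance_tracks(tracks, target_rms=0.1):
    """Scale each track so a pluggable loudness proxy (plain RMS in this
    sketch) sits at a common target level, model-agnostic by design."""
    out = []
    for x in tracks:
        rms = np.sqrt(np.mean(np.asarray(x, dtype=float) ** 2))
        out.append(x * (target_rms / (rms + 1e-12)))
    return out

t = np.linspace(0, 1, 8000, endpoint=False)
vocal = 0.8 * np.sin(2 * np.pi * 220 * t)   # loud track
bass = 0.05 * np.sin(2 * np.pi * 55 * t)    # quiet track
mixed = balance_tracks([vocal, bass])
```

Because the loudness measurement is a separable step, swapping RMS for a single-band heuristic or an EP model changes only the metric, not the mixer, which is exactly how the thesis compares configurations under one mixing device.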