1,627 research outputs found
Validating a real-time perceptual model predicting distraction caused by audio-on-audio interference
Twenty years of load theory—Where are we now, and where should we go next?
Selective attention allows us to ignore what is task-irrelevant and focus on what is task-relevant. The cognitive and neural mechanisms that underlie this process are key topics of investigation in cognitive psychology. One of the more prominent theories of attention is perceptual load theory, which suggests that the efficiency of selective attention is dependent on both perceptual and cognitive load. It is now more than 20 years since the proposal of load theory, and it is a good time to evaluate the evidence in support of this influential model. The present article supplements and extends previous reviews (Lavie, Trends in Cognitive Sciences, 9, 75–82. doi:10.1016/j.tics.2004.12.004, 2005, Current Directions in Psychological Science, 19, 143–148. doi:10.1177/0963721410370295, 2010) by examining more recent research in what appears to be a rapidly expanding area. The article comprises five parts, examining (1) evidence for the effects of perceptual load on attention, (2) cognitive load, (3) individual differences under load, (4) alternative theories and criticisms, and (5) the future of load theory. We argue that the key next step for load theory will be the application of the model to real-world tasks. The potential benefits of applied attention research are numerous, and there is tentative evidence that applied research would provide strong support for the theory itself, as well as real-world benefits related to activities in which attention is crucial, such as driving and education.
Overlapped-speech detection with applications to driver assessment for in-vehicle active safety systems
In this study we propose a system for overlapped-speech detection. Spectral harmonicity and envelope features are extracted to represent overlapped and single-speaker speech using Gaussian mixture models (GMMs). The system is shown to effectively discriminate the single-speaker and overlapped speech classes. We further increase the discrimination by proposing a phoneme-selection scheme that generates more reliable artificial overlapped data for model training. Evaluations on artificially generated co-channel data show that the novelty in feature selection and phoneme omission yields a relative improvement of 10% in detection accuracy over the baseline. As an example application, we evaluate the effectiveness of overlapped-speech detection for vehicular environments and its potential in assessing driver alertness. Results indicate a good correlation between driver performance and the amount and location of overlapped-speech segments.
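The detection scheme the abstract describes — train one generative model per class, then label each frame by which model scores it higher — can be sketched minimally as follows. This is an illustrative reduction, not the authors' system: the random features stand in for the spectral harmonicity and envelope features, each class is modelled with a single full-covariance Gaussian rather than a multi-component GMM, and the names `fit_gaussian` and `classify` are hypothetical.

```python
import numpy as np

def fit_gaussian(X):
    """Fit one full-covariance Gaussian to feature rows X (frames x dims)."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularize
    return mu, cov

def log_likelihood(X, mu, cov):
    """Per-frame Gaussian log-density."""
    d = X.shape[1]
    diff = X - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    quad = np.einsum('ij,jk,ik->i', diff, inv, diff)  # Mahalanobis term
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

def classify(X, model_single, model_overlap):
    """Label a frame 1 (overlapped) when the overlap model scores higher."""
    ll_s = log_likelihood(X, *model_single)
    ll_o = log_likelihood(X, *model_overlap)
    return (ll_o > ll_s).astype(int)

# Synthetic stand-in features: single-speaker frames cluster near 0,
# overlapped frames near 3, mimicking separable harmonicity statistics.
rng = np.random.default_rng(0)
single = rng.normal(0.0, 1.0, size=(500, 4))
overlap = rng.normal(3.0, 1.0, size=(500, 4))

m_single = fit_gaussian(single)
m_overlap = fit_gaussian(overlap)

held_out = np.vstack([rng.normal(0.0, 1.0, size=(100, 4)),
                      rng.normal(3.0, 1.0, size=(100, 4))])
truth = np.array([0] * 100 + [1] * 100)
acc = (classify(held_out, m_single, m_overlap) == truth).mean()
```

Replacing `fit_gaussian` with a multi-component mixture (e.g. EM-trained) recovers the GMM formulation; the frame-wise likelihood comparison stays the same.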
Listening under pressure: the downside of motivation
The desire for self-improvement is critical to human performance and learning outcomes. Paradoxically, however, being subjected to increased performance pressure can also result in “choking under pressure”. No studies have experimentally examined the extent to which motivation impacts native speech processing. This dissertation manipulated performance pressure in listeners and systematically examined its impact across three speech-processing experiments. Sixty adult native English listeners and 45 non-native listeners with poorer English proficiency completed three speech-processing experiments twice – once to establish a baseline, and again to measure changes in performance. In these experiments using native English speech, listeners detected (illusory) sound changes, categorized phonemes under lexical interference, and recognized words in noise. After baseline testing, half of the participants in each language group were instructed to work, with a fictitious partner, towards a performance-contingent monetary reward; the other half, as controls, simply performed the tasks a second time. This study demonstrated a negative impact of performance pressure on native listeners in all experiments. Relative to the controls, the motivation group was more susceptible to illusions, failed to ignore lexical interference despite prior exposure, and recognized fewer words in cognitively demanding listening situations. Unexpectedly, relative to native listeners, non-native listeners perceived it as less important to perform well, and those in the high-performance-pressure group requested a significantly greater amount of money for improvement. These language-group differences in task-related attitudes might be a confounding factor that moderates the effect of motivation.
By illustrating a complex interaction among motivation, listener status, and performance-induced demands, this dissertation highlights the importance of motivation in speech science.
Auditory-visual interaction in computer graphics
Generating high-fidelity images in real time at reasonable frame rates remains one of the main challenges in computer graphics. Furthermore, visuals remain only one of the multiple sensory cues that must be delivered simultaneously in a multi-sensory virtual environment. The sense most frequently used besides vision in virtual environments and entertainment is audio. While the rendering community focuses on solving the rendering equation more quickly using various algorithmic and hardware improvements, the exploitation of human limitations to assist in this process remains largely unexplored.
Many findings in the research literature demonstrate physical and psychological limitations of humans, including attentional and perceptual limitations of the Human Sensory System (HSS). Knowledge of the Human Visual System (HVS) may be exploited in computer graphics to significantly reduce rendering times without the viewer being aware of any resultant difference in image quality. Furthermore, cross-modal effects, that is, the influence of one sensory input on another, for example sound and visuals, have also recently been shown to have a substantial impact on viewer perception of a virtual environment.
In this thesis, auditory-visual cross-modal interaction research findings have been investigated and adapted for graphics-rendering purposes. The results of five psychophysical experiments, involving 233 participants, showed that, even in the realm of computer graphics, there is a strong relationship between vision and audition in both the spatial and temporal domains. The first experiment, investigating auditory-visual cross-modal interaction in the spatial domain, showed that unrelated sound effects reduce the perceived rendering-quality threshold. The following experiments investigated the effect of audio on temporal visual perception. The results indicate that audio with certain beat rates can be used to reduce the amount of rendering required to achieve perceptually high quality. Furthermore, introducing the sound effect of footsteps into walking animations increased perceived visual smoothness. These results suggest that under certain conditions the number of frames that must be rendered each second can be reduced, saving valuable computation time, without the viewer being aware of this reduction. This is another step towards a comprehensive understanding of auditory-visual cross-modal interaction and its use in high-fidelity interactive multi-sensory virtual environments.