Psychophysiology-based QoE assessment: a survey
We present a survey of psychophysiology-based assessment for quality of experience (QoE) in advanced multimedia technologies. We provide a classification of methods relevant to QoE and describe related psychological processes, experimental design considerations, and signal analysis techniques. We summarize multimodal techniques and discuss several important aspects of psychophysiology-based QoE assessment, including the synergies with psychophysical assessment and the need for standardized experimental design. This survey is not exhaustive but serves as a guideline for those interested in further exploring this emerging field of research.
Content-prioritised video coding for British Sign Language communication.
Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality. BSL is a highly structured, linguistically complete, natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities. This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264). Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. 
The research community benefits from a new approach to video coding optimisation and a better understanding of the communication needs of deaf people.
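The foveation-based prioritisation described in this abstract could be sketched as a pre-processing weighting step applied before a standard encoder. The Gaussian falloff, the single gaze point, and the specific parameter values below are illustrative assumptions, not details taken from the thesis:

```python
import numpy as np

def foveation_weights(height, width, gaze_y, gaze_x, sigma=80.0):
    """Return a 2D map of weights in [0, 1], peaking at the gaze point.

    Pixels far from the fovea receive low weights, so a pre-processor can
    low-pass them more aggressively before the encoder sees the frame,
    shifting bits toward the visually important region.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dist_sq = (ys - gaze_y) ** 2 + (xs - gaze_x) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))

def foveate_frame(frame, blurred, weights):
    """Blend the original frame with a heavily blurred copy, keeping
    full detail near the fovea and blur in the periphery."""
    w = weights[..., None] if frame.ndim == 3 else weights
    return (w * frame + (1.0 - w) * blurred).astype(frame.dtype)
```

Gaze coordinates would come from eye-movement-tracking data (as in the thesis), and the blended frame would then be passed unchanged to an H.264 encoder.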
Electrophysiologic assessment of (central) auditory processing disorder in children with non-syndromic cleft lip and/or palate
Session 5aPP - Psychological and Physiological Acoustics: Auditory Function, Mechanisms, and Models (Poster Session)
Cleft of the lip and/or palate is a common congenital craniofacial malformation worldwide, particularly non-syndromic cleft lip and/or palate (NSCL/P). Though middle ear deficits in this population have been universally noted in numerous studies, other auditory problems including inner ear deficits or cortical dysfunction are rarely reported. A higher prevalence of educational problems has been noted in children with NSCL/P compared to craniofacially normal children. These high-level cognitive difficulties cannot be entirely attributed to peripheral hearing loss. Recently it has been suggested that children with NSCL/P may be more prone to abnormalities in the auditory cortex. The aim of the present study was to investigate whether school-age children with NSCL/P have a higher prevalence of indications of (central) auditory processing disorder [(C)APD] compared to normal age-matched controls when assessed using auditory event-related potential (ERP) techniques. School children (6 to 15 years) with NSCL/P and normal controls with matched age and gender were recruited. Auditory ERP recordings included auditory brainstem response and late event-related potentials, including the P1-N1-P2 complex and P300 waveforms. Initial findings from the present study are presented and their implications for further research in this area and clinical intervention are outlined. © 2012 Acoustical Society of America
Assessing the quality of audio and video components in desktop multimedia conferencing
This thesis seeks to address the HCI (Human-Computer Interaction) research problem of how to establish the level of audio and video quality that end users require to successfully perform tasks via networked desktop videoconferencing. There are currently no established HCI methods of assessing the perceived quality of audio and video delivered in desktop videoconferencing. The transport of real-time speech and video information across new digital networks causes degradations, problems and issues that are novel and different from those common in the traditional telecommunications areas (telephone and television). Traditional assessment methods involve the use of very short test samples, are traditionally conducted outside a task-based environment, and focus on whether a degradation is noticed or not. But these methods cannot help establish what audio-visual quality is required by users to perform tasks successfully with the minimum of user cost, in interactive conferencing environments. This thesis addresses this research gap by investigating and developing a battery of assessment methods for networked videoconferencing, suitable for use in both field trials and laboratory-based studies. The development and use of these new methods helps identify the most critical variables (and levels of these variables) that affect perceived quality, and means by which network designers and HCI practitioners can address these problems are suggested. The output of the thesis therefore contributes both methodological (i.e. new rating scales and data-gathering methods) and substantive (i.e. explicit knowledge about quality requirements for certain tasks) knowledge to the HCI and networking research communities on the subjective quality requirements of real-time interaction in networked videoconferencing environments.
Exploratory research is carried out through an interleaved series of field trials and controlled studies, advancing substantive and methodological knowledge in an incremental fashion. Initial studies use the ITU-recommended assessment methods, but these are found to be unsuitable for assessing networked speech and video quality for a number of reasons. Therefore later studies investigate and establish a novel polar rating scale, which can be used both as a static rating scale and as a dynamic continuous slider. These and further developments of the methods in future lab-based and real conferencing environments will enable subjective quality requirements and guidelines for different videoconferencing tasks to be established.
Effects of errorless learning on the acquisition of velopharyngeal movement control
Session 1pSC - Speech Communication: Cross-Linguistic Studies of Speech Sound Learning of the Languages of Hong Kong (Poster Session)
The implicit motor learning literature suggests a benefit for learning if errors are minimized during practice. This study investigated whether the same principle holds for learning velopharyngeal movement control. Normal speaking participants learned to produce hypernasal speech in either an errorless learning condition (in which the possibility for errors was limited) or an errorful learning condition (in which the possibility for errors was not limited). Nasality level of the participants' speech was measured by nasometer and reflected by nasalance scores (in %). Errorless learners practiced producing hypernasal speech with a threshold nasalance score of 10% at the beginning, which gradually increased to a threshold of 50% at the end. The same set of threshold targets was presented to errorful learners but in a reversed order. Errors were defined by the proportion of speech with a nasalance score below the threshold. The results showed that, relative to errorful learners, errorless learners displayed fewer errors (errorful: 50.7% vs. errorless: 17.7%) and a higher mean nasalance score (errorful: 31.3% vs. errorless: 46.7%) during the acquisition phase. Furthermore, errorless learners outperformed errorful learners in both retention and novel transfer tests. Acknowledgment: Supported by The University of Hong Kong Strategic Research Theme for Sciences of Learning © 2012 Acoustical Society of America
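The graded threshold schedule and error definition described in this abstract could be sketched as follows; the number of increments is an assumed value, since the study only specifies the 10% start and 50% end points:

```python
def threshold_schedule(start=10.0, end=50.0, steps=5, errorless=True):
    """Nasalance thresholds (%) for the acquisition phase.

    Errorless learners start with an easy target that gradually rises;
    errorful learners receive the same set of targets in reversed order.
    The 5-step granularity is a hypothetical choice for illustration.
    """
    step = (end - start) / (steps - 1)
    schedule = [start + i * step for i in range(steps)]
    return schedule if errorless else schedule[::-1]

def error_proportion(nasalance_scores, threshold):
    """An error is a speech sample whose nasalance falls below the
    current threshold; return the proportion of such samples."""
    below = [s for s in nasalance_scores if s < threshold]
    return len(below) / len(nasalance_scores)
```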
Real-Time Electroencephalogram Sonification for Neurofeedback
Electroencephalography (EEG) is the measurement via the scalp of the electrical activity of the brain. The established therapeutic intervention of neurofeedback involves presenting people with their own EEG in real-time to enable them to modify their EEG for purposes of improving performance or health.
The aim of this research is to develop and validate real-time sonifications of EEG for use in neurofeedback and methods for assessing such sonifications. Neurofeedback generally uses a visual display. Where auditory feedback is used, it is mostly limited to pre-recorded sounds triggered by the EEG activity crossing a threshold. However, EEG generates time-series data with meaningful detail at fine temporal resolution and with complex temporal dynamics. Human hearing has a much higher temporal resolution than human vision, and auditory displays do not require people to focus on a screen with their eyes open for extended periods of time – e.g. if they are engaged in some other task. Sonification of EEG could allow more rapid, contingent, salient and temporally detailed feedback. This could improve the efficiency of neurofeedback training and reduce the number and duration of sessions for successful neurofeedback.
The same two deliberately simple sonification techniques were used in all three experiments of this research: Amplitude Modulation (AM) sonification, which maps the fluctuations in the power of the EEG to the volume of a pure tone; and Frequency Modulation (FM) sonification, which uses the changes in the EEG power to modify the frequency. Measures included a listening task; the NASA Task Load Index, a measure of how much work it was to do the task; pre- and post-session measures of mood; and EEG.
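The two mappings described above could be sketched as follows. The carrier and base frequencies and the normalisation of EEG band power to [0, 1] are illustrative assumptions, not values taken from the thesis:

```python
import numpy as np

SAMPLE_RATE = 44100
CARRIER_HZ = 440.0  # illustrative carrier frequency

def am_sonify(power, duration=0.1):
    """AM sonification: EEG band power (normalised to [0, 1])
    controls the volume of a fixed-frequency pure tone."""
    t = np.linspace(0, duration, int(SAMPLE_RATE * duration), endpoint=False)
    return power * np.sin(2 * np.pi * CARRIER_HZ * t)

def fm_sonify(power, duration=0.1, base_hz=220.0, span_hz=440.0):
    """FM sonification: EEG band power shifts the tone's pitch
    between base_hz and base_hz + span_hz at constant volume."""
    t = np.linspace(0, duration, int(SAMPLE_RATE * duration), endpoint=False)
    freq = base_hz + power * span_hz
    return np.sin(2 * np.pi * freq * t)
```

In a real-time loop, each short buffer of samples would be generated from the latest EEG power estimate and streamed to the audio output.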
The first experiment used pre-recorded single-channel EEG, and participants were asked to listen to the sound of the sonified EEG and try to track the activity that they could hear by moving a slider on a computer screen using a computer mouse. This provided a quantitative assessment of how well people could perceive the sonified fluctuations in EEG level. The tracking accuracy scores were higher for the FM sonification, but self-assessments of task load rated the AM sonification as easier to track.
The second experiment used the same two sonifications in a real neurofeedback task using participants' own live EEG. Unbeknownst to the participants, the neurofeedback task was designed to improve mood. A pre-post questionnaire showed that participants changed their self-rated mood in the intended direction with the EEG training, but there was no statistically significant change in EEG. Again the FM sonification showed better performance, but AM was rated as less effortful. The performance of the sonifications in the tracking task in experiment 1 was found to predict their relative efficacy at blind self-rated mood modification in experiment 2.
The third experiment used both the tracking task from experiment 1 and the neurofeedback task from experiment 2, but with modified versions of the AM and FM sonifications to allow two-channel EEG sonification. This experiment introduced a physical slider, as opposed to a mouse, for the tracking task. Tracking accuracy increased, but this time no significant difference was found between the two sonification techniques on the tracking task. In the training task, once more the blind self-rated mood did improve in the intended direction with the EEG training, but since there was again no significant change in EEG, this cannot necessarily be attributed to the neurofeedback. There was only a slight difference between the two sonification techniques in the effort measure.
In this way, a prototype method has been devised and validated for the quantitative assessment of real-time EEG sonifications. Conventional evaluations of neurofeedback techniques are expensive and time-consuming. By contrast, this method potentially provides a rapid, objective and efficient way of evaluating the suitability of candidate sonifications for EEG neurofeedback.
Augmenting Sonic Experiences Through Haptic Feedback
Sonic experiences are usually considered as the result of auditory feedback alone. From a psychological standpoint, however, this is true only when a listener is kept isolated from concurrent stimuli targeting the other senses. Such stimuli, in fact, may either interfere with the sonic experience if they distract the listener, or conversely enhance it if they convey sensations coherent with what is being heard. This chapter is concerned with haptic augmentations having effects on auditory perception, for example how different vibrotactile cues provided by an electronic musical instrument may affect its perceived sound quality or the playing experience. Results from different experiments are reviewed, showing that the auditory and somatosensory channels together can produce constructive effects resulting in measurable perceptual enhancement. That may affect sonic dimensions ranging from basic auditory parameters, such as the perceived intensity of frequency components, up to more complex perceptions which contribute to forming our ecology of everyday or musical sounds.