2,943 research outputs found

    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State Vowel Categorization

    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models. Funding: National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624).

    Engineering data compendium. Human perception and performance. User's guide

    The concept underlying the Engineering Data Compendium was the product of a research and development program (Integrated Perceptual Information for Designers project) aimed at facilitating the application of basic research findings in human performance to the design of military crew systems. The principal objective was to develop a workable strategy for: (1) identifying and distilling information of potential value to system design from the existing research literature, and (2) presenting this technical information in a way that would aid its accessibility, interpretability, and applicability by systems designers. The present four volumes of the Engineering Data Compendium represent the first implementation of this strategy. This is the first volume, the User's Guide, containing a description of the program and instructions for its use.

    A psychology literature study on modality related issues for multimodal presentation in crisis management

    The motivation of this psychology literature study is to obtain modality-related guidelines for real-time information presentation in crisis management environments. The crisis management task is usually accompanied by time urgency, risk, uncertainty, and high information density. Decision makers (crisis managers) might undergo cognitive overload and tend to show biases in their performance. Therefore, the ongoing crisis event needs to be presented in a manner that enhances perception, assists diagnosis, and prevents cognitive overload. To this end, this study looked into modality effects on perception, cognitive load, working memory, learning, and attention. Selected topics include working memory, dual-coding theory, cognitive load theory, multimedia learning, and attention. The findings are several modality usage guidelines which may lead to more efficient use of the user’s cognitive capacity and enhance information perception.

    Dissociable Mechanisms of Concurrent Speech Identification in Noise at Cortical and Subcortical Levels

    When two vowels with different fundamental frequencies (F0s) are presented concurrently, listeners often hear two voices producing different vowels on different pitches. Parsing of this simultaneous speech can also be affected by the signal-to-noise ratio (SNR) in the auditory scene. The extraction and interaction of F0 and SNR cues may occur at multiple levels of the auditory system. The major aims of this dissertation are to elucidate the neural mechanisms and time course of concurrent speech perception in clean and in degraded listening conditions and its behavioral correlates. In two complementary experiments, electrical brain activity (EEG) was recorded at cortical (EEG Study #1) and subcortical (FFR Study #2) levels while participants heard double-vowel stimuli whose F0s differed by zero and four semitones (STs), presented in either clean or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in identifying both vowels for larger F0 separations (i.e., 4 ST; with pitch cues), and this F0-benefit was more pronounced at more favorable SNRs. Time-frequency analysis of cortical EEG oscillations (i.e., brain rhythms) revealed a dynamic time course for concurrent speech processing that depended on both extrinsic (SNR) and intrinsic (pitch) acoustic factors. Early high-frequency activity reflected pre-perceptual encoding of acoustic features (~200 ms) and the quality (i.e., SNR) of the speech signal (~250-350 ms), whereas later-evolving low-frequency rhythms (~400-500 ms) reflected post-perceptual, cognitive operations that covaried with listening effort and task demands. Analysis of subcortical responses indicated that while FFRs provided a high-fidelity representation of double-vowel stimuli and the spectro-temporal nonlinear properties of the peripheral auditory system, FFR activity largely reflected the neural encoding of stimulus features (exogenous coding) rather than perceptual outcomes, but timbre (F1) could predict the speed in noise conditions. Taken together, results of this dissertation suggest that subcortical auditory processing reflects mostly exogenous (acoustic) feature encoding, in stark contrast to cortical activity, which reflects perceptual and cognitive aspects of concurrent speech perception. By studying multiple brain indices underlying an identical task, these studies provide a more comprehensive window into the hierarchy of brain mechanisms and time course of concurrent speech processing.

    Saccade frequency response to visual cues during gait in Parkinson's disease: the selective role of attention

    Gait impairment is a core feature of Parkinson's disease (PD) with implications for falls risk. Visual cues improve gait in PD, but the underlying mechanisms are unclear. Evidence suggests that attention and vision play an important role; however, the relative contribution from each is unclear. Measurement of visual exploration (specifically saccade frequency) during gait allows for real-time measurement of attention and vision. Understanding how visual cues influence visual exploration may allow inferences about the mechanisms underlying the response, which could help to develop effective therapeutics. This study aimed to examine saccade frequency during gait in response to a visual cue in PD and older adults and investigate the roles of attention and vision in visual cue response in PD. A mobile eye-tracker measured saccade frequency during gait in 55 people with PD and 32 age-matched controls. Participants walked in a straight line with and without a visual cue (50 cm transverse lines) presented under single-task and dual-task (concurrent digit span recall) conditions. Saccade frequency was reduced when walking in PD compared to controls; however, visual cues ameliorated the saccadic deficit. Visual cues significantly increased saccade frequency in both PD and controls under both single-task and dual-task conditions. Attention rather than visual function was central to saccade frequency and gait response to visual cues in PD. In conclusion, this study highlights the impact of visual cues on visual exploration when walking and the important role of attention in PD. Understanding these complex features will help inform intervention development.

    Age-Related Differences in Multimodal Information Processing and Their Implications for Adaptive Display Design.

    In many data-rich, safety-critical environments, such as driving and aviation, multimodal displays (i.e., displays that present information in visual, auditory, and tactile form) are employed to support operators in dividing their attention across numerous tasks and sources of information. However, limitations of this approach are not well understood. Specifically, most research on the effectiveness of multimodal interfaces has examined the processing of only two concurrent signals in different modalities, primarily in vision and hearing. Also, nearly all studies to date have involved young participants only. The goals of this dissertation were therefore to (1) determine the extent to which people can notice and process three unrelated concurrent signals in vision, hearing and touch, (2) examine how well aging modulates this ability, and (3) develop countermeasures to overcome observed performance limitations. Adults aged 65+ years were of particular interest because they represent the fastest growing segment of the U.S. population, are known to suffer from various declines in sensory abilities, and experience difficulties with divided attention. Response times and incorrect response rates to singles, pairs, and triplets of visual, auditory, and tactile stimuli were significantly higher for older adults, compared to younger participants. In particular, elderly participants often failed to notice the tactile signal when all three cues were combined. They also frequently falsely reported the presence of a visual cue when presented with a combination of auditory and tactile cues. These performance breakdowns were observed both in the absence and presence of a concurrent visual/manual (driving) task. Also, performance on the driving task suffered the most for older adult participants and with the combined visual-auditory-tactile stimulation. Introducing a half-second delay between two stimuli significantly increased response accuracy for older adults. 
    This work adds to the knowledge base in multimodal information processing, the perceptual and attentional abilities and limitations of the elderly, and adaptive display design. From an applied perspective, these results can inform the design of multimodal displays and enable aging drivers to cope with increasingly data-rich in-vehicle technologies. The findings are expected to generalize and thus contribute to improved overall public safety in a wide range of complex environments. PhD, Industrial and Operations Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133203/1/bjpitts_1.pd

    Aerospace Medicine and Biology: A continuing bibliography with indexes (supplement 314)

    This bibliography lists 139 reports, articles, and other documents introduced into the NASA scientific and technical information system in August 1988.

    Multimodal Human-Machine Interface For Haptic-Controlled Excavators

    The goal of this research is to develop a human-excavator interface for the haptic-controlled excavator that makes use of multiple human sensing modalities (visual, auditory, haptic) and efficiently integrates these modalities to ensure an intuitive, efficient interface that is easy to learn and use and is responsive to operator commands. Two empirical studies were conducted to investigate conflict in the haptic-controlled excavator interface and identify the level of force feedback for best operator performance.