9 research outputs found

    Comparing human and automatic speech recognition in a perceptual restoration experiment

    Speech that has been distorted by introducing spectral or temporal gaps is still perceived as continuous and complete by human listeners, so long as the gaps are filled with additive noise of sufficient intensity. When such perceptual restoration occurs, the speech is also more intelligible than when no noise has been added in the gaps. This observation has motivated so-called 'missing data' systems for automatic speech recognition (ASR), but there have been few attempts to determine whether such systems are a good model of perceptual restoration in human listeners. Accordingly, the current paper evaluates missing data ASR in a perceptual restoration task. We evaluated two systems that use a new approach to bounded marginalisation in the cepstral domain, and a bounded conditional mean imputation method. Both methods model available speech information as a clean-speech posterior distribution that is subsequently passed to an ASR system. The proposed missing data ASR systems were evaluated using distorted speech, in which spectro-temporal gaps were optionally filled with additive noise. Speech recognition performance of the proposed systems was compared against a baseline ASR system, and with human speech recognition performance on the same task. We conclude that missing data methods improve speech recognition performance in a manner that is consistent with perceptual restoration in human listeners.

    Components of Auditory Closure

    Auditory closure (AC) is an aspect of auditory processing that is crucial for understanding speech in background noise. It is a set of abilities that allows listeners to understand speech in the absence of important information, both spectral and temporal. AC is evaluated using monaural low-redundancy speech tasks: low-pass filtered words (LPFW), time-compressed words (TCW), and words-in-noise (WiN). Although not previously used, phonemic restoration with words (PhRW) is also a speech task that has been proposed as a measure of AC. In the present study, the four AC tasks listed above were used to evaluate AC skills in 50 adult females with normal hearing. Pairwise correlations showed no significant relationships among LPFW, TCW, and WiN. As a result, these three tasks were considered to be independent components of AC that represented the AC abilities of spectral reconstruction, temporal resolution, and auditory induction, respectively. Multiple linear regression analysis with LPFW, TCW, and WiN as variables revealed that PhRW is accomplished using temporal resolution. The findings of this study show that no single task of AC is representative of the entire process and that further research is warranted to more completely define the skills that make AC possible.

    Age Effects on Perceptual Organization of Speech in Realistic Environments

    Communication often occurs in environments where background sounds fluctuate and mask portions of the intended message. Listeners use envelope and periodicity cues to group together audible glimpses of speech and fill in missing information. When the background contains other talkers, listeners also use focused attention to select the appropriate target talker and ignore competing talkers. Whereas older adults are known to experience significantly more difficulty with these challenging tasks than younger adults, the sources of these difficulties remain unclear. In this project, three related experiments explored the effects of aging on several aspects of speech understanding in realistic listening environments. Experiments 1 and 2 determined the extent to which aging affects the benefit of envelope and periodicity cues for recognition of short glimpses of speech, phonemic restoration of missing speech segments, and/or segregation of glimpses with a competing talker. Experiment 3 investigated effects of age on the ability to focus attention on an expected voice in a two-talker environment. Twenty younger adults and 20 older adults with normal hearing participated in all three experiments and also completed a battery of cognitive measures to examine contributions from specific cognitive abilities to speech recognition. Keyword recognition and cognitive data were analyzed with an item-level logistic regression based on a generalized linear mixed model. Results indicated that older adults were poorer than younger adults at glimpsing short segments of speech but were able to use envelope and periodicity cues to facilitate phonemic restoration and speech segregation. Whereas older adults performed more poorly than younger adults overall, these groups did not differ in their ability to focus attention on an expected voice. Across all three experiments, older adults were poorer than younger adults at recognizing speech from a female talker both in quiet and with a competing talker. Results of cognitive tasks indicated that faster processing speed and better visual-linguistic closure were predictive of better speech understanding. Taken together, these results suggest that age-related declines in speech recognition may be partially explained by difficulty grouping short glimpses of speech into a coherent message, which may be particularly difficult for older adults when the talker is female.

    On countermeasures of worm attacks over the Internet

    Worm attacks have long been considered dangerous threats to the Internet because they can infect a large number of computers and consequently cause large-scale service disruptions and damage. Thus, research on modeling worm attacks and defenses against them has become vital to the field of computer and network security. This dissertation systematically studies two classes of countermeasures against worm attacks: traffic-based and non-traffic-based countermeasures. Traffic-based countermeasures are those whose means are limited to monitoring, collecting, and analyzing the traffic generated by worm attacks. Non-traffic-based countermeasures do not have such limitations. For the traffic-based countermeasures, we first consider a worm attack that adopts feedback loop-control mechanisms, which make its overall propagation traffic resemble background non-worm traffic and thereby circumvent detection. We also develop a novel spectrum-based scheme to achieve highly effective detection performance against such attacks. We then consider worm attacks that send probing traffic in a stealthy manner to obtain the location infrastructure of a defense system, and we introduce an information-theoretic framework to obtain the limitations of such attacks and develop corresponding countermeasures. For the non-traffic-based countermeasures, we first consider new, unseen worm attacks and develop a countermeasure based on mining the dynamic signature of worm programs’ run-time execution. We then consider a generic worm attack that dynamically changes its propagation patterns and develop integrated countermeasures based on the attacker’s contradictory objectives. Lastly, we consider a real-world system setting with multiple incoming worm attacks that collaborate by sharing the history of their interactions with the defender, and we develop a generic countermeasure based on establishing the defender’s reputation for toughness in its repeated interactions with multiple incoming attackers to optimize long-term defense performance. This dissertation research has broad impact on Internet worm research because the work is fundamental, practical, and extensible. Our framework can be used by researchers to understand key features of other new forms of worm attacks and to develop countermeasures against them.

    The perceptual restoration of music in young children

    EThOS - Electronic Theses Online Service (United Kingdom)