
    Master of Science

    Although many studies have examined acoustic and sociolinguistic differences between male and female speech, the relationship between talker speaking style and perceived gender has not yet been explored. The present study attempts to determine whether clear speech, a style adopted by talkers who perceive some barrier to effective communication, shifts perceptions of femininity for male and female talkers.

    Perceptual learning of dysarthric speech

    Perceptual learning, when applied to speech, describes experience-evoked adjustments to the cognitive-perceptual processes required for recognising spoken language. It provides the theoretical basis for improved understanding of a speech signal that is initially difficult to perceive. Reduced intelligibility is a frequent and debilitating symptom of dysarthria, a speech disorder associated with neurological disease or injury. The current thesis investigated perceptual learning of dysarthric speech, by jointly considering intelligibility improvements and associated learning mechanisms for listeners familiarised with the neurologically degraded signal. Moderate hypokinetic dysarthria was employed as the test case in the three phases of this programme of research. The initial research phase established strong empirical evidence of improved recognition of dysarthric speech following a familiarisation experience. Sixty normal hearing listeners were randomly assigned to one of three groups and familiarised with passage readings under the following conditions: (1) neurologically intact speech (control) (n = 20), (2) dysarthric speech (passive familiarisation) (n = 20), and (3) dysarthric speech coupled with written information (explicit familiarisation) (n = 20). Subsequent phrase transcription analysis revealed that the intelligibility scores of both groups familiarised with dysarthric speech were significantly higher than those of the control group. Furthermore, performance gains were superior, in both size and longevity, when the familiarisation conditions were explicit. A condition discrepancy in segmentation strategies, in which attention towards syllabic stress contrast cues increased following explicit familiarisation but decreased following passive familiarisation, indicated that the performance differences reflected more than simply the magnitude of benefit. 
Thus, it was speculated that the learning that occurred with passive familiarisation may be qualitatively different to that which occurred with explicit familiarisation. The second phase of the research programme followed up on the initial findings and examined whether the key variable behind the use of particular segmentation strategies was simply the presence or absence of written information during familiarisation. Forty normal hearing listeners were randomly assigned to one of two groups and were familiarised with experimental phrases under either passive (n = 20) or explicit (n = 20) learning conditions. Subsequent phrase transcription analysis revealed that regardless of condition, all listeners utilised syllabic stress contrast cues to segment speech following familiarisation with phrases that emphasised this prosodic perception cue. Furthermore, the study revealed that, in addition to familiarisation condition, intelligibility gains were dependent on the type of the familiarisation stimuli employed. Taken together, the first two research phases demonstrated that perceptual learning of dysarthric speech is influenced by the information afforded within the familiarisation procedure. The final research phase examined the role of indexical information in perceptual learning of dysarthric speech. Forty normal hearing listeners were randomly assigned to one of two groups and were familiarised with dysarthric speech via a training task that emphasised either the linguistic (word identification) (n = 20) or indexical (speaker identification) (n = 20) properties of the signal. Intelligibility gains for listeners trained to identify indexical information paralleled those achieved by listeners trained to identify linguistic information. Similarly, underlying error patterns were also comparable between the two training groups. 
Thus, phase three revealed that both indexical and linguistic features of the dysarthric signal are learnable, and can be used to promote subsequent processing of dysarthric speech. In summary, this thesis has demonstrated that listeners can learn to better understand neurologically degraded speech. Furthermore, it has offered insight into how the information afforded by the specific familiarisation procedure is differentially leveraged to improve perceptual performance during subsequent encounters with the dysarthric signal. Thus, this programme of research affords preliminary evidence towards the development of a theoretical framework that exploits perceptual learning for the treatment of dysarthria.
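The phrase transcription analyses above reduce to an intelligibility score per listener. As a minimal sketch, assuming a simple position-independent percent-words-correct measure (the thesis's actual scoring rules are not specified in this abstract), with made-up phrases:

```python
def words_correct(target: str, transcript: str) -> float:
    """Proportion of target words found in the listener's transcript.

    Position-independent matching: each transcript word can credit at
    most one target word. An illustrative scoring rule only.
    """
    target_words = target.lower().split()
    remaining = transcript.lower().split()
    hits = 0
    for w in target_words:
        if w in remaining:
            hits += 1
            remaining.remove(w)
    return hits / len(target_words)

# intelligibility for a listener = mean score across test phrases
phrases = [("the boat sailed at dawn", "the boat failed at dawn")]
score = sum(words_correct(t, r) for t, r in phrases) / len(phrases)
```

Intelligibility gains would then be pre- to post-familiarisation differences in this score, compared across the three familiarisation groups.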

    Ultra-high-speed imaging of bubbles interacting with cells and tissue

    Ultrasound contrast microbubbles are exploited in molecular imaging, where bubbles are directed to target cells and where their high-scattering cross section to ultrasound allows for the detection of pathologies at a molecular level. In therapeutic applications vibrating bubbles close to cells may alter the permeability of cell membranes, and these systems are therefore highly interesting for drug and gene delivery applications using ultrasound. In a more extreme regime bubbles are driven through shock waves to sonoporate or kill cells through intense stresses or jets following inertial bubble collapse. Here, we elucidate some of the underlying mechanisms using the 25-Mfps camera Brandaris128, resolving the bubble dynamics and its interactions with cells. We quantify acoustic microstreaming around oscillating bubbles close to rigid walls and evaluate the shear stresses on nonadherent cells. In a study on the fluid dynamical interaction of cavitation bubbles with adherent cells, we find that the nonspherical collapse of bubbles is responsible for cell detachment. We also visualized the dynamics of vibrating microbubbles in contact with endothelial cells followed by fluorescent imaging of the transport of propidium iodide, used as a membrane integrity probe, into these cells showing a direct correlation between cell deformation and cell membrane permeability.
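The microbubble dynamics resolved by high-speed imaging are conventionally modelled with the Rayleigh-Plesset equation. Below is a minimal forward-Euler integration sketch; every parameter value (bubble radius, driving frequency, pressure amplitude) is an illustrative assumption, not the study's actual conditions:

```python
import math

# Forward-Euler integration of the Rayleigh-Plesset equation, the standard
# model for the radial dynamics of an ultrasound-driven microbubble.
rho = 998.0        # water density, kg/m^3
p0 = 101325.0      # ambient pressure, Pa
sigma = 0.072      # surface tension, N/m
mu = 1.0e-3        # dynamic viscosity of water, Pa*s
kappa = 1.07       # polytropic exponent of the gas core
R0 = 2.0e-6        # equilibrium bubble radius, m
f = 1.0e6          # driving frequency, Hz
pa = 20000.0       # driving pressure amplitude, Pa

pg0 = p0 + 2 * sigma / R0  # gas pressure at equilibrium

def radial_acceleration(R, Rdot, t):
    """Second time derivative of R from the Rayleigh-Plesset equation."""
    p_gas = pg0 * (R0 / R) ** (3 * kappa)
    p_drive = -pa * math.sin(2 * math.pi * f * t)
    rhs = p_gas - p0 - p_drive - 4 * mu * Rdot / R - 2 * sigma / R
    return (rhs / rho - 1.5 * Rdot ** 2) / R

R, Rdot, dt = R0, 0.0, 1.0e-10
radii = []
for step in range(20000):  # two microseconds of driving
    Rdot += radial_acceleration(R, Rdot, step * dt) * dt
    R += Rdot * dt
    radii.append(R)
```

At these mild driving amplitudes the bubble oscillates around its equilibrium radius; the violent shock-driven collapses described above sit in a strongly nonlinear regime where a stiffer integrator would be needed.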

    Representations of native and foreign talkers in brain and behaviour

    Human listeners possess good speaker recognition abilities, and are capable of discriminating and identifying speakers from a range of spoken utterances. However, voice recognition can be enhanced when a listener is capable of understanding the speech produced by a talker. A well-established demonstration of this is known as the "Language-Familiarity" Effect (LFE) for voice recognition. This effect manifests as an impairment for voice recognition in foreign language speech conditions, as contrasted with recognition of talkers who are speaking in a listener's mother tongue, and has been repeatedly demonstrated across a range of different tasks and languages. The LFE has previously been conceptualized as an analogue to the even better-known "Other-Race" Effect (ORE) for face recognition, where own-race faces are better remembered than other-race faces. An influential theoretical model of the ORE posits that faces are represented in a multidimensional "face-space", whose dimensions are shaped by perceptual experience and code for features which are diagnostic for face individuation (Valentine, 1991). Over the course of an individual's perceptual experience, these dimensions might become attuned for own-race face recognition; as a consequence, the dimensions will be sub-optimal for other-race recognition, leading to the illusion of increased similarity among different other-race faces relative to own-race faces, what has been termed the "they-all-look-alike" effect. The idea of a complementary "voice-space" has already been posited in the auditory domain, and might serve as a useful model for the LFE. Speakers might be individuated on the basis of diagnostic dimensions which might code for important voice-acoustical attributes. 
However, these dimensions might also be shaped according to linguistic experience, and voice individuation (and recognition) might be optimised when listeners can take advantage of both general voice acoustics and stored representations of their native language to tell speakers apart. The face-space hypothesis represents a plausible model for the ORE, and evidence for it has accrued through computational modelling and neuroimaging work. Conversely, however, at present it merely serves as a descriptive model for the LFE. In this thesis, I combine behavioural testing and neuroimaging studies using functional Magnetic Resonance Imaging (fMRI) to probe the nature of the representations of native and foreign speakers. Chapter 1 provides a general overview of voice processing with an emphasis on voice recognition. Subsequently, I provide a review of relevant literature pertaining to the LFE, and introduce a brief comparison to the ORE for faces in the context of the Valentine (1991) similarity model, ending with a description of the aims of the thesis. In Chapter 2, I present the results of a behavioural experiment where native English and Mandarin speaking listeners rated all pairwise combinations of a series of English- and Mandarin-speaking voices. Crucially, the LFE does not appear to be dependent on full comprehension of the linguistic message, as young infants can better tell apart speakers in their native language than in a foreign language before their speech comprehension abilities are fully mature. This suggests that exposure to the sound-structure characteristic of infants' nascent mother tongue might be sufficient to enhance native language speaker discrimination, in the absence of full comprehension. Therefore, to examine a counterpart in adults, speech stimuli were subjected to time-reversal, a process which precludes lexical and semantic access but which leaves intact certain phonemic properties of the original speech signal. 
Both the English and Mandarin listeners rated pairs of native-language voices as sounding more dissimilar than foreign voices, suggesting that the language-specific sound-structure elements remaining in the reversed speech enabled an enhanced individuation of native voices. Next, in Chapter 3, I aimed to probe the neural basis of this enhanced individuation in an fMRI experiment which was intended to capture dissimilarities among paired cerebral responses to unintelligible native and foreign speakers. Here, I did not find a direct correlate of the behavioural effect, but did find that local patterns of response estimates in the bilateral superior temporal cortex (STC) appear to "discriminate" the different language categories in both English and Mandarin listeners. Specifically, when the pairwise dissimilarity in brain responses to different speakers was collected, relatively high dissimilarity was observed for pairs consisting of a response to an English speaker and a Mandarin speaker, whereas relatively low dissimilarity was observed for pairs consisting of two English or two Mandarin speakers. In Chapter 4, I report what is, to my knowledge, the first explicit examination of the neural basis for the LFE in intelligible speech. A monolingual sample of English speakers participated in an fMRI experiment where they listened to the voices of English and Mandarin speakers. Importantly, speech stimuli in both language conditions were matched in inter-speaker acoustical variability. Combined response patterns from bilateral voice-sensitive temporal lobe regions enabled a learning algorithm to decode the identities of the voices who elicited the responses, but, crucially, only in the native speech (English) condition. Interestingly, native-language speaker decoding was also achieved from a left-hemisphere voice-sensitive region alone, but not a right-hemisphere region. 
This putative leftward bias might reflect a higher discriminability of native-language talkers in the brain, via an enhanced ability to individuate voices on the basis of indexical variation around stored speech-sound representations. Finally, in Chapter 5, I conclude with a general discussion of the foregoing results, their implications for an analogous conception of the LFE and ORE, and some strands of thought for future investigation.
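The pairwise-dissimilarity analysis described for Chapter 3 follows the general logic of representational similarity analysis. A minimal sketch, assuming correlation distance between voxel response patterns and purely hypothetical random data in place of the fMRI estimates:

```python
import random

def correlation_distance(a, b):
    """1 minus the Pearson correlation between two response patterns."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return 1.0 - cov / (sa * sb)

# hypothetical voxel response patterns, one per speaker (made-up data)
random.seed(0)
patterns = [[random.gauss(0, 1) for _ in range(50)] for _ in range(6)]
languages = ["en", "en", "en", "zh", "zh", "zh"]

# sort every speaker pair into within- or between-language dissimilarity
within, between = [], []
for i in range(len(patterns)):
    for j in range(i + 1, len(patterns)):
        d = correlation_distance(patterns[i], patterns[j])
        (within if languages[i] == languages[j] else between).append(d)
```

On real response estimates, mean between-language dissimilarity exceeding mean within-language dissimilarity would correspond to the reported "discrimination" of language categories in the STC.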

    Exploring the effects of accent on cognitive processes: behavioral and electrophysiological insights

    Previous research has found that speaker accent can have an impact on a range of offline and online cognitive processes (Baus, Bas, Calabria, & Costa, 2017; McAleer, Todorov, & Belin, 2014; Stevenage, Clarke, & McNeill, 2012; Sporer, 2001). Indeed, previous studies show that there are differences in native and non-native speech processing (Lev-Ari, 2018). Processing foreign-accented speech requires the listener to adapt to an extra range of variability, suggesting that there may be an increase in the amount of attentional and cognitive resources that are needed to successfully interpret the speech signal of a foreign-accented speaker. However, less is known about the differences between processing native and dialectal accents. Is dialectal processing more similar to foreign or native speech? To address this, two theories have been proposed (Clarke & Garrett, 2004; Floccia et al., 2009). Previous studies have contributed to the plausibility of both hypotheses and, importantly for the purposes of this project, previous electroencephalography experiments exploring the question have mainly used sentences as material. More studies are needed to elucidate whether foreign accent is processed uniquely from all types of native speech (both native and dialectal accents) or whether dialectal accent is treated differently from native accent, despite both being native speech variations. Accordingly, the central aim of this dissertation is to further investigate processing mechanisms of speech accent across different levels of linguistic analysis using evidence from both behavioral and electrophysiological experiments. An additional aim of this project was to look at the effects of accent on information retention. In addition to fluctuations in attentional demands, it seems that non-native accent can lead to differences in the depth of listeners' memory encoding (Atkinson et al., 2005). 
This project further aimed to study how changing the accent of the information delivered may affect how well people remember the information received. Three experiments were carried out to investigate accent processing; results and future directions are discussed.

    A comprehensive review of intonation: Psychoacoustics modeling of prosodic prominence

    Bolinger (1978:475), one of the foremost authorities on prosody of a generation ago, said that "Intonation is a half-tamed savage. To understand the tamed or linguistically harnessed half of him, one has to make friends with the wild half." This review provides a brief explanation for the tamed and untamed halves of intonation. It is argued here that the pitch-centered approach that has been used for several decades is responsible for why one half of intonation remains untamed. To tame intonation completely, a holistic acoustic approach is required that takes intensity and duration as seriously as it does pitch. Speech is a three-dimensional physical entity in which all three correlates work independently and interdependently. Consequently, a methodology that addresses intonation comprehensively is more likely to yield better results. Psychoacoustics seems to be well positioned for this task. Nearly 100 years of experimentation have led to the discovery of Just Noticeable Difference (JND) thresholds that can be summoned to help tame intonation completely. The framework discussed here expands the analytical resources and facilitates an optimal description of intonation. It calculates and ranks the relative functional load (RFL) of pitch, intensity, and duration, and uses the results to compute the melodicity score of utterances. The findings replicate, based on JNDs, how the naked ear perceives intonation on a four-point Likert melodicity scale.
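The JND-based idea behind the RFL-and-melodicity computation can be sketched as follows: for each acoustic correlate, count how often syllable-to-syllable changes exceed that correlate's JND. The JND values and the combination rule below are illustrative assumptions, not the article's actual formulas:

```python
def relative_functional_load(values, jnd):
    """Fraction of successive changes large enough for listeners to notice."""
    changes = [abs(b - a) for a, b in zip(values, values[1:])]
    noticeable = sum(1 for c in changes if c >= jnd)
    return noticeable / len(changes)

# per-syllable measurements for one utterance (made-up numbers)
pitch_hz = [120, 135, 128, 150, 110]
intensity_db = [62, 63, 67, 64, 60]
duration_ms = [180, 220, 190, 260, 200]

rfl = {
    "pitch": relative_functional_load(pitch_hz, jnd=3.0),         # ~3 Hz assumed JND
    "intensity": relative_functional_load(intensity_db, jnd=1.0),  # ~1 dB assumed JND
    "duration": relative_functional_load(duration_ms, jnd=20.0),   # ~20 ms assumed JND
}

# a simple melodicity proxy: mean of the three loads mapped onto a
# four-point scale, echoing the Likert melodicity scale described above
melodicity = 1 + 3 * sum(rfl.values()) / len(rfl)
```

Ranking the three RFL values would then identify which correlate carries the most perceptually noticeable prominence work in a given utterance.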

    Lexical effects in talker identification

    Adult listeners more accurately identify talkers speaking a known language than a foreign language (Thompson, 1987), a phenomenon known as the language-familiarity effect (Perrachione & Wong, 2007). Two experiments explored how knowledge of a language facilitates talker identification. In Experiment 1, participants identified talkers in three conditions: (a) a foreign-language speech condition featuring unfamiliar sound patterns and no known words; (b) a nonsense speech condition featuring all the familiar sound patterns of their native language, such as familiar phonemes, prosody, and syllable structure, but no actual words; and (c) a native-language condition with all the familiar components of a language, including words. In Experiment 2, participants again identified speakers in familiar and unfamiliar languages. In both languages, listeners identified speakers in a condition in which no word was ever repeated, and in a condition featuring repeated words. The results suggest that access to familiar, meaningful spoken words confers an advantage beyond access to familiar sounds, syllables, and prosody, particularly when words are repeated. Together, Experiments 1 and 2 support integrated models of voice and language processing systems, and indicate that access to meaningful words is a crucial component of the language-familiarity effect in talker identification.

    Influence of supportive context and stimulus variability on rapid adaptation to non-native speech

    Older listeners, particularly those with age-related hearing loss, report a high level of difficulty in perception of non-native speech when queried in clinical settings. In an increasingly global society, addressing these challenges is an important component of providing auditory care and rehabilitation to this population. Prior literature shows that younger listeners can quickly adapt to both unfamiliar and challenging auditory stimuli, improving their perception over a short period of exposure. Prior work has suggested that a protocol including higher variability of the speech materials may be most beneficial for learning; variability within the stimuli may serve to provide listeners with a larger range of acoustic information to map onto higher level lexical representations. However, there is also evidence that increased acoustic variability is not beneficial for all listeners. Listeners also benefit from the presence of semantic context during speech recognition tasks. It is less clear, however, whether older listeners derive more benefit than younger listeners from supportive context; some studies find increased benefit for older listeners, while others find that the context benefit is similar in magnitude across age groups. This project comprises a series of experiments utilizing behavioral and electrophysiologic measures designed to examine the contributions of acoustic variability and semantic context in relation to speech recognition during the course of rapid adaptation to non-native English speech. Experiment 1 examined the effects of increasing stimulus variability on behavioral measures of rapid adaptation. The results of the study indicated that stimulus variability impacted overall levels of recognition, but did not affect rate of adaptation. This was confirmed in Experiment 2, which also showed that degree of semantic context influenced rate of adaptation, but not overall performance levels. 
In Experiment 3, younger and older normal-hearing adults showed similar rates of adaptation to a non-native talker regardless of context level, though talker accent and context level interacted to the detriment of older listeners' speech recognition. When cortical responses were examined, younger and older normal-hearing listeners showed similar predictive processing effects for both native and non-native speech.
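The distinction drawn above between overall recognition level and rate of adaptation can be illustrated with a simple curve fit. A minimal sketch, assuming a logarithmic learning curve over exposure blocks (the experiments' actual statistical models are not specified here):

```python
import math

def fit_adaptation(scores):
    """Least-squares fit of score = level + rate * log(block index).

    'level' stands in for overall recognition performance and 'rate' for
    the speed of adaptation, mirroring the dissertation's distinction
    between the two. A simplified illustrative model.
    """
    x = [math.log(i + 1) for i in range(len(scores))]
    n = len(scores)
    mx = sum(x) / n
    my = sum(scores) / n
    rate = sum((xi - mx) * (yi - my) for xi, yi in zip(x, scores)) / \
           sum((xi - mx) ** 2 for xi in x)
    level = my - rate * mx
    return level, rate

# e.g. transcription accuracy over six exposure blocks (made-up data)
blocks = [0.52, 0.61, 0.66, 0.70, 0.71, 0.73]
level, rate = fit_adaptation(blocks)
```

Under this framing, stimulus variability shifting `level` but not `rate`, and semantic context shifting `rate` but not `level`, would correspond to the dissociation reported across Experiments 1 and 2.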