184 research outputs found

    The phonetics of speech breathing: pauses, physiology, acoustics, and perception

    Speech is made up of a continuous stream of speech sounds that is interrupted by pauses and breathing. As phoneticians are primarily interested in describing the segments of the speech stream, pauses and breathing are often neglected in phonetic studies, even though they are vital for speech. The present work contributes to a more detailed view of both pausing and speech breathing, with a special focus on the latter and the resulting breath noises, investigating their acoustic, physiological, and perceptual aspects. We present an overview of how a selection of corpora annotate pauses and pause-internal particles, as well as a recording setup that can be used for further studies on speech breathing. For pauses, this work emphasized their optionality and variability under different tempos, as well as the temporal composition of silence and breath noise in breath pauses. For breath noises, we first focused on acoustic and physiological characteristics: we explored the alignment of the onsets and offsets of audible breath noises with the start and end of expansion of both rib cage and abdomen. Further, we found similarities between speech breath noises and aspiration phases of /k/, and evidence that breath noises may be produced with a more open and slightly more front place of articulation than realizations of schwa. We found positive correlations between acoustic and physiological parameters, suggesting that when speakers inhaled faster, the resulting breath noises were more intense and produced more anteriorly in the mouth. Inspecting the entire spectrum of speech breath noises, we found relatively flat spectra with several weak peaks. These peaks largely overlapped with resonances reported for inhalations produced with a central vocal tract configuration. We used 3D-printed vocal tract models representing four vowels and four fricatives to simulate in- and exhalations by reversing airflow direction.
We found that airflow direction did not have a general effect across all models, but only affected those with high-tongue configurations, as opposed to those that were more open. Then, we compared inhalations produced with the schwa model to human inhalations in an attempt to approximate the vocal tract configuration in speech breathing. There were some similarities; however, several complexities of human speech breathing that were not captured in the models complicated the comparisons. In two perception studies, we investigated how much information listeners could auditorily extract from breath noises. First, we tested how well listeners could categorize breath noises into six types, based on airflow direction and airway usage, e.g. oral inhalation. Around two thirds of all answers were correct. Second, we investigated how well breath noises could be used to discriminate between speakers and to extract coarse information on speaker characteristics, such as age (old/young) and sex (female/male). We found that listeners were able to tell whether two breath noises came from the same or different speakers in around two thirds of all cases. When hearing a single breath noise, listeners classified sex correctly in around 64% of cases and age in 50% of cases, suggesting that sex was more perceivable than age in breath noises.
Funding: Deutsche Forschungsgemeinschaft (DFG), project number 418659027: "Pause-internal phonetic particles in speech communication".

    Audible aspects of speech preparation

    Noises made before the acoustic onset of speech are typically ignored, yet may reveal aspects of speech production planning and be relevant to discourse turn-taking. We quantify the nature and timing of such noises, using an experimental method designed to elicit naturalistic yet controlled speech initiation data. Speakers listened to speech input, then spoke when prompt material became visible onscreen. They generally inhaled audibly before uttering a short sentence, but not before a single word. In both tasks, articulatory movements caused acoustic spikes due to weak click-like articulatory separations or stronger clicks via an ingressive, lingual airstream. The acoustic onset of the sentences was delayed relative to the words. This does not appear to be planned, but seems a side-effect of the longer duration of inhalation.

    The effect of study abroad experience and working memory on Chinese-English consecutive interpreting performance

    This thesis investigates how study abroad experience (SAE) and working memory (WM) influence interpreting performance. Using a second language (L2) is cognitively demanding because it involves activation of a new language and the inhibition of the first language (L1). This is a general issue for all bilinguals, who have to suppress or control whichever language is currently not in use. As a special group of bilinguals, interpreters are expected to switch efficiently between the two languages by analysing input sound signals, extracting meaning, transforming, storing and retrieving the message in the input language, and then retrieving the lexicon in the target language that will be appropriate for expressing that message, (re)formulating it and finally conveying it in the target language. Moreover, some or all of these operations take place in parallel, and this multi-tasking heavily taxes interpreters’ WM. The quality of interpreting performance is known to correlate with several variables, such as language proficiency, duration of training, and interpreting experience. One factor that has received little research attention is the effect of overseas experience: Does studying in a target-language environment benefit interpreting performance? Language learners, including interpreting students, are often advised to study abroad, but the benefits of this experience, especially for interpreters, are not well understood. Taking an interdisciplinary approach, the present thesis examines the relationship between SAE, WM and interpreting performance. The main research questions examine whether students with SAE outperform those without such an experience in consecutive interpreting (CI), and how WM may be involved. The results show that students with SAE surpassed their non-SAE counterparts in word translation efficiency, L2 fluency and L2 grammatical accuracy. A similar trend was observed in study abroad participants’ overall CI performance from L2 to L1.
It is worth noting that this tendency was independent of participants’ WM. Concerning WM, the results indicate that it was strongly correlated with interpreters’ bidirectional CI performance. That is, a larger WM could help achieve a better CI output in both language directions. Taken together, these findings suggest that two factors significantly influence CI performance, namely, prolonged and effective overseas study, and larger available WM resources. This research illustrates the importance of SAE and WM in interpreting, and sheds light on the relationships between language context, cognitive resources and interpreting performance. A better understanding of these relationships may have implications for future interpreting training and practice.

    The production and perception of peripheral geminate/singleton coronal stop contrasts in Arabic

    Gemination is typologically common word-medially but is rare at the periphery of the word (word-initially and -finally). In line with this observation, prior research on production and perception of gemination has focused primarily on medial gemination. Much less is known about the production and perception of peripheral gemination. This PhD thesis reports on comprehensive articulatory, acoustic and perceptual investigations of geminate-singleton contrasts according to the position of the contrast in the word and in the utterance. The production component of the project investigated the articulatory and acoustic features of medial and peripheral gemination of voiced and voiceless coronal stops in Modern Standard Arabic and regional Arabic vernacular dialects, as produced by speakers from two geographically distant countries, Morocco and Lebanon. The perceptual experiment investigated how standard and dialectal Arabic gemination contrasts in each word position were categorised and discriminated by three groups of non-native listeners, each differing in their native language experience with gemination at different word positions. The first experiment used ultrasound and acoustic recordings to address the extent to which word-initial gemination in Moroccan and Lebanese dialectal Arabic is maintained, as well as the articulatory and acoustic variability of the contrast according to the position of the gemination contrast in the utterance (initial vs. medial) and between the two dialects. The second experiment compared the production of word-medial and -final gemination in Modern Standard Arabic as produced by Moroccan and Lebanese speakers.
The aim of the perceptual experiment was to disentangle the contribution of phonological and phonetic effects of the listeners’ native languages on the categorisation and discrimination of non-lexical Moroccan gemination by three groups of non-native listeners varying in their phonological (native Lebanese group and heritage Lebanese group, for whom Moroccan is unintelligible, i.e., a non-native language) and phonetic-only (native English group) experience with gemination across the three word positions. The findings in this thesis make important contributions to the understanding of positional and dialectal effects on the production and perception of gemination contrasts, going beyond medial gemination (which was mainly included as a control) and illuminating in particular the typologically rare peripheral gemination.

    Combined brain language connectivity and intraoperative neurophysiologic techniques in awake craniotomy for eloquent-area brain tumor resection

    Speech processing can be disturbed by primary brain tumors (PBT). Improvement of presurgical planning techniques decreases the neurological morbidity associated with tumor resection during awake craniotomy. The aims of this work were: 1. To perform Diffusion Kurtosis Imaging based tractography (DKI-tract) in the detection of brain tracts involved in language; 2. To investigate which factors contribute to functional magnetic resonance imaging (fMRI) maps in predicting eloquent language regional reorganization; 3. To determine the technical aspects of accelerometric (ACC) recording of speech during surgery. DKI-tracts were streamlined using a 1.5T magnetic resonance scanner. Number of tracts and fiber pathways were compared between DKI and standard Diffusion Tensor Imaging (DTI) in healthy subjects (HS) and PBT patients. fMRI data were acquired using task-specific and resting-state paradigms during language and motor tasks. After testing the influence of intraoperative fMRI on the number of direct cortical stimulation (DCS) stimuli, graph-theory measures were extracted and analyzed. Regarding speech recording, ACC signals were recorded after evaluating neck positions and filter bandwidths. To test this method, language disturbances were recorded in patients with dysphonia and after applying DCS in the inferior frontal gyrus. In contrast, HS reaction time was recorded during speech execution. DKI-tract showed an increased number of arcuate fascicle tracts in PBT patients. Fewer spurious tracts were identified with DKI-tract. Intraoperative fMRI combined with DCS required a similar number of stimuli to DCS alone. Increased local centrality accompanied ipsilateral and contralateral language reorganization. ACC recordings showed minor artifact contamination when placed at the suprasternal notch using a 20-200 Hz filter bandwidth. Patients with dysphonia showed decreased amplitude and frequency in comparison with HS.
ACC detected an additional 11% of disturbances after DCS, and a shortening of latency in the presence of a loud stimulus during speech execution. This work improved current knowledge on presurgical planning techniques based on brain structural and functional neuroimaging connectivity, and speech recording.
Human language function can be affected by the presence of brain tumors (BT). Improved presurgical planning techniques reduce the iatrogenic neurological morbidity associated with their surgical treatment. The aims of this work are: 1. To test the reliability of tractography estimated by diffusion kurtosis (DKI-tract) for the brain tracts involved in language; 2. To identify the factors that contribute to language mapping by functional magnetic resonance imaging (fMRI) in predicting neuroplasticity; 3. To identify technical aspects of language recording by accelerometry (ACC). DKI-tract was estimated after brain MRI at 1.5T. The number and course of the fibers were assessed. fMRI was acquired during language tasks, motor tasks, and at rest. The influence of fMRI activation maps on the number of stimuli delivered during intraoperative direct cortical stimulation (DCS) was tested. Connectivity measures were extracted from brain regions. The position and filtering of the ACC signal were studied after vocalization of words. The ACC signal obtained in volunteers was compared with that of dysphonic patients, after stimulation of the inferior frontal gyrus, and after the addition of a disturbing sound stimulus during vocalization. DKI-tract estimated a high number of arcuate fasciculus tracts with fewer false negatives. Intraoperative fMRI language maps did not influence DCS. Centrality measures increased after ipsilateral and contralateral neuroplasticity. The suprasternal position and ACC signal filtering between 20 and 200 Hz showed less contaminating noise. This method identified decreased frequency and amplitude in patients with dysphonia, 11% additional language errors after stimulation, and a decrease in latency in the presence of the disturbing sound signal. This work promoted the use of new techniques in the presurgical planning of patients with brain tumors and language impairment through the study of structural and functional connectivity and language recording.

    Factors in the identification and treatment of stuttering.

    A large number of children with a diagnosis of stuttering will recover, often without formal treatment. This recovery pattern highlights the importance of a clear, early diagnosis and has implications for therapeutic practice. This thesis investigated three factors that could assist speech and language therapists in their diagnosis and treatment of children who stutter (CWS). Those factors were social, motor and speech skills. A pilot study investigating a fourth factor, communication attitude, is reported as an appendix. All factors were investigated from the perspective of the EXPLAN model of fluency failure. EXPLAN suggests that a combination of speech timing and phonological difficulty is an important source of fluency failures. The investigation into the social skills of CWS indicated that there is a trend for CWS to hold a lower social position than that of age-matched controls. CWS were more likely to be bullied at school than their peers. The relationship between stuttering severity and social status was not significant. The motor skills study, using a battery of tests of cerebellar function (Dow & Moruzzi, 1958), indicated that CWS showed a deficit in performance on balance/posture tests at a young age and on complex movement tasks in their teenage years when compared to age-matched controls. These differences are discussed in relation to auditory and cerebellar function. The fluency of a group of CWS was examined using phonological word analysis (Au-Yeung & Howell, 1998). Five children were producing predominantly part-word repetitions at initial assessment. Four of these children had persisted in their stutter when followed up three years later. Results suggest that information regarding motor skills and linguistic analysis of speech may be useful in the diagnosis and treatment of CWS. The results of the experimental work are discussed in relation to their theoretical and clinical significance.

    IberSPEECH 2020: XI Jornadas en Tecnología del Habla and VII Iberian SLTech

    IberSPEECH2020 is a two-day event, bringing together the best researchers and practitioners in speech and language technologies in Iberian languages to promote interaction and discussion. The organizing committee has planned a wide variety of scientific and social activities, including technical paper presentations, keynote lectures, project presentations, laboratory activities, recent PhD theses, discussion panels, a round table, and awards for the best theses and papers. The program of IberSPEECH2020 includes a total of 32 contributions, distributed among 5 oral sessions, a PhD session, and a projects session. To ensure the quality of all the contributions, each submitted paper was reviewed by three members of the scientific review committee. All the papers in the conference will be accessible through the International Speech Communication Association (ISCA) Online Archive. Paper selection was based on the scores and comments provided by the scientific review committee, which includes 73 researchers from different institutions (mainly from Spain and Portugal, but also from France, Germany, Brazil, Iran, Greece, Hungary, the Czech Republic, Ukraine, and Slovenia). Furthermore, extended versions of selected papers will be published as a special issue of the Journal of Applied Sciences, “IberSPEECH 2020: Speech and Language Technologies for Iberian Languages”, published by MDPI with fully open access. In addition to regular paper sessions, the IberSPEECH2020 scientific program features the ALBAYZIN evaluation challenge session. Red Española de Tecnologías del Habla. Universidad de Valladoli

    Semantic radical consistency and character transparency effects in Chinese: an ERP study

    BACKGROUND: This event-related potential (ERP) study aims to investigate the representation and temporal dynamics of Chinese orthography-to-semantics mappings by simultaneously manipulating character transparency and semantic radical consistency. Character components, referred to as radicals, make up the building blocks used dur...