47 research outputs found

    NORMALIZACIJA VOKALSKIH FORMANATA U HRVATSKOME I SRPSKOME JEZIKU

    The aim of this study was to compare the results of a traditional formant analysis of vowels with the results of normalization systems, using Croatian and Serbian speech as an example. The participants were male native speakers of Croatian and Serbian (N=92). Traditional formant analyses express differences among the analysed groups of speakers that are caused by linguistic and sociolinguistic, but also physiological, factors. Because vowel formant values are influenced by many factors, including idiosyncratic physiological characteristics of the vocal tract, normalization approaches remove the inter-speaker variability that is caused by physiological differences. The dialectal, inter-linguistic and/or sociolinguistic differences among the speakers are thereby isolated in a scientifically more objective way. The results show that, within each language, normalized formant values of all vowels are more tightly grouped and more centralized (especially for the vowels [a] and [i]) than the non-normalized values. The contrastive analysis shows differences in frontness/backness and openness/closeness for all vowels: relative to the Serbian vowels, Croatian [i], [o] and [u] are more closed and front, [a] is more closed and back, and [e] is more open and front. This study exemplifies the advantage of normalization systems in the interpretation of acoustic results.
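
    The abstract above does not name the normalization procedure that was used. As a minimal sketch of the general idea only, the Python below applies a Lobanov-style z-score (an assumption chosen for illustration, not necessarily the authors' method) to F1 and F2 within each speaker. The speakers, vowels and formant values are made up; speaker B's values are simply 1.2 times speaker A's, standing in for a purely physiological difference.

```python
from statistics import mean, stdev


def lobanov_normalize(tokens):
    """Z-score F1 and F2 within each speaker (Lobanov-style normalization).

    tokens: list of dicts with keys 'speaker', 'vowel', 'F1', 'F2' (Hz).
    Returns new dicts in which F1 and F2 are per-speaker z-scores.
    """
    # Group the tokens by speaker so statistics are computed per speaker.
    by_speaker = {}
    for token in tokens:
        by_speaker.setdefault(token["speaker"], []).append(token)

    normalized = []
    for rows in by_speaker.values():
        # Per-speaker mean and sample standard deviation of each formant.
        stats = {}
        for formant in ("F1", "F2"):
            values = [row[formant] for row in rows]
            stats[formant] = (mean(values), stdev(values))
        for row in rows:
            out = dict(row)
            for formant in ("F1", "F2"):
                mu, sd = stats[formant]
                out[formant] = (row[formant] - mu) / sd
            normalized.append(out)
    return normalized


# Hypothetical data: speaker B is speaker A scaled by 1.2 in every formant.
tokens = [
    {"speaker": "A", "vowel": "a", "F1": 700, "F2": 1200},
    {"speaker": "A", "vowel": "i", "F1": 300, "F2": 2200},
    {"speaker": "A", "vowel": "u", "F1": 320, "F2": 800},
    {"speaker": "B", "vowel": "a", "F1": 840, "F2": 1440},
    {"speaker": "B", "vowel": "i", "F1": 360, "F2": 2640},
    {"speaker": "B", "vowel": "u", "F1": 384, "F2": 960},
]

normalized = lobanov_normalize(tokens)
for row in normalized:
    print(row["speaker"], row["vowel"], round(row["F1"], 3), round(row["F2"], 3))
```

    In this toy case the two speakers end up with matching normalized values for each vowel, because a uniform scaling of all formants is removed by the per-speaker z-score. Differences that survive such a step in real data are the ones a study like this would attribute to language or dialect rather than to anatomy.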

    A Comparative Study of Spectral Peaks Versus Global Spectral Shape as Invariant Acoustic Cues for Vowels

    The primary objective of this study was to compare two sets of vowel spectral features, formants and global spectral shape parameters, as invariant acoustic cues to vowel identity. Both automatic vowel recognition experiments and perceptual experiments were performed to evaluate these two feature sets. First, these features were compared using the static spectrum sampled in the middle of each steady-state vowel versus features based on dynamic spectra. Second, the role of dynamic and contextual information was investigated in terms of improvements in automatic vowel classification rates. Third, several speaker normalizing methods were examined for each of the feature sets. Finally, perceptual experiments were performed to determine whether vowel perception is more correlated with formants or global spectral shape. Results of the automatic vowel classification experiments indicate that global spectral shape features contain more information than do formants. For both feature sets, dynamic features are superior to static features. Spectral features spanning a time interval beginning with the start of the on-glide region of the acoustic vowel segment and ending at the end of the off-glide region of the acoustic vowel segment are required for maximum vowel recognition accuracy. Speaker normalization of both static and dynamic features can also be used to improve the automatic vowel recognition accuracy. Results of the perceptual experiments with synthesized vowel segments indicate that if formants are kept fixed, global spectral shape can, at least for some conditions, be modified such that the synthetic speech token will be perceived according to spectral shape cues rather than formant cues. This result implies that overall spectral shape may be more important perceptually than the spectral prominences represented by the formants. The results of this research contribute to a fundamental understanding of the information-encoding process in speech. 
The signal processing techniques used and the acoustic features found in this study can also be used to improve the preprocessing of acoustic signals in the front end of automatic speech recognition systems.

    Envelhecimento vocal: estudo acústico-articulatório das alterações de fala com a idade

    Background: Although the aging process causes specific alterations in the speech organs, knowledge about the effects of age on speech production is still dispersed and incomplete. Objective: To provide a broader view of the age-related segmental and suprasegmental speech changes in European Portuguese (EP), considering new aspects besides static acoustic features, such as dynamic and articulatory data. Method: Two databases of speech data from adult native speakers of EP, obtained through standardized recording and segmentation procedures, were created: i) an acoustic database containing all EP oral vowels produced in similar contexts (read speech), together with a sample of semi-spontaneous speech (image description), collected from a large sample of adults aged between 35 and 97; and ii) an articulatory database (ultrasound (US) tongue images synchronized with the speech signal) covering all EP oral vowels produced in similar contexts (pseudowords and isolated), collected from young ([21-35]) and older ([55-73]) adults. Results: Based on the curated databases, various aspects of aging speech were analyzed. Acoustically, aging speech is characterized by: 1) longer vowels (in both genders); 2) a tendency for F0 to decrease in women and to increase slightly in men; 3) lower vowel formant frequencies in females; 4) a significant reduction of the vowel acoustic space in men; 5) vowels with a steeper F1 trajectory slope (in both genders); 6) shorter descriptions with longer pause time for males; 7) faster speech and articulation rates for females; and 8) lower HNR for females in semi-spontaneous speech. In addition, a decrease in total speech duration is associated with non-severe depression symptoms and with age. Older adults tended to present more depressive symptoms, which could affect the amount of speech produced. Concerning the articulatory data, the tongue tends to be higher and more advanced with aging for almost all vowels, meaning that the vowel articulatory space tends to be higher, more advanced, and larger in older females. Conclusion: This study provides new information on aging speech for a language other than English. The results corroborate that speech changes with age and that the patterns differ between genders, and they also suggest that speakers might develop specific articulatory adjustments with aging.
    Programa Doutoral em Gerontologia e Geriatri

    Vowel normalisation : an interface between acoustic and linguistic descriptions of speaker characteristics in Australian English

    This thesis examines existing normalisation procedures against the background of a theoretical model of inter-speaker formant variability, which describes observed formant differences in three major categories: phonetic variation, non-uniform variation, and uniform variation. A new normalisation strategy based on this model is proposed, which involves removing the uniform and non-uniform components of inter-speaker variation in order to isolate phonetic variation. The nature of this non-uniformity is subject to empirical investigation. Following this strategy, the method adopted in this thesis is first to acquire a phonetically stable vowel database, which is then screened for phonetic variation through a rigorous phonetic control procedure. The resulting data, now considered to be phonetically homogeneous, are used to explore two essential domains of inter-speaker variability that contribute to the design of a future normalisation procedure: (1) By applying uniform transformations using a variety of published scaling parameters, the most effective uniform scaling parameters are identified. (2) Non-uniform inter-speaker variation patterns are analysed and compared with the published results of Fant (1975). A major discovery is that non-uniform inter-speaker variation patterns obtained from phonetically controlled data are grossly different from those observed by Fant. The present database comprises 594 vowels in the /h_d/ word context (11 phonemic monophthongs x 9 speakers x 6 repetitions), and the speakers include 4 adult females, 3 adult males and 2 children (male).

    Modelling phonologization: vowel reduction and epenthesis in Lunigiana dialects

    Building upon wave-theoretic assumptions, this dissertation provides a formal description of the relationship between diatopic/diachronic micro-variation and phonologization. In particular, an analysis is performed of the phonetic/phonological properties of unstressed vowel reduction and vowel insertion in two Northern Italian dialects: Carrarese and Pontremolese. These dialects are argued to represent two frozen stages of these processes’ diffusion, Carrarese representing the diachronic stage Pontremolese has already gone through. Indeed, Pontremolese displays non-etymological vocoids that show the phonetic and phonological characteristics of epenthetic vowels and that, crucially, can be considered the phonologized correlates of Carrarese’s intrusive vocoids. These, in turn, should be rather considered articulatory/perceptually driven vowel-like releases. A formal account of this diatopic, diachronic and grammatical relationship is given that supports a modular grammar architecture, in which phonetics and phonology constitute two autonomous modules. Within such an architecture, the lateral forces (government and licensing) developed by standard Government Phonology are translated into violable constraints and inserted in a BiPhon grammar. In this optimality-theoretic grammar, the phonetics-phonology interface is managed by a set of cue constraints that map acoustic dimensions (formant structures) onto phonological primitives (elements). Furthermore, to integrate morphological information in the phonological forms, the Coloured Containment Theory is resorted to.
    Language Use in Past and Presen

    Modelling phonologization: vowel reduction and epenthesis in Lunigiana dialects

    Within a linguistic continuum, the further from the irradiation centre, the later a language is affected by a change; the later a language is reached by a change, the milder the outcomes. Building upon these wave-theoretic assumptions, this dissertation provides a formal description of the relationship between diatopic/diachronic micro-variation and phonologization. In particular, an analysis is performed of the phonetic/phonological properties of unstressed vowel reduction and vowel insertion in two Northern Italian dialects: Carrarese and Pontremolese. These dialects are argued to represent two frozen stages of these processes’ diffusion, Carrarese representing the diachronic stage Pontremolese has already gone through. Indeed, Pontremolese displays non-etymological vocoids that show the phonetic and phonological characteristics of epenthetic vowels and that, crucially, can be considered the phonologized correlates of Carrarese’s intrusive vocoids. These, in turn, should be rather considered articulatory/perceptually driven vowel-like releases. A formal account of this diatopic, diachronic and grammatical relationship is given that supports a modular grammar architecture, in which phonetics and phonology thus constitute two autonomous modules. Within such an architecture, the lateral forces (government and licensing) developed by standard Government Phonology are translated into violable constraints and inserted in a BiPhon grammar. In this optimality-theoretic grammar, the phonetics-phonology interface is managed by a set of cue constraints that map acoustic dimensions (formant structures) onto phonological primitives (elements). Furthermore, to integrate morphological information in the phonological forms, the Coloured Containment Theory is resorted to. This dissertation is of relevance to anyone interested in diatopic/diachronic micro-variation, phonologization, phonological theory and Italian dialectology.

    Combining research methods for an experimental study of West Central Bavarian vowels in adults and children

    The overall goal of this thesis was to systematically measure the defining vowel characteristics of the West Central Bavarian (WCB) dialect for an acoustically based analysis of the Bavarian vowel system. At the same time, it investigated to what extent these characteristics are being preserved across generations, and whether a sound change in progress can be observed in which young speakers show more characteristics of Standard German (SG) than old speakers on some Bavarian vowel attributes. To address these aims, we made acoustic recordings of WCB-speaking adults and WCB-speaking primary school children, which were then compared with each other in an apparent-time analysis. For a more accurate view of changes in progress, we combined this apparent-time comparison with longitudinal data from the WCB children, obtained at annual intervals spanning three years. The acoustic data were supplemented by articulatory data from ultrasound recordings of a subset of the same WCB-speaking children at two timepoints one year apart. Analyses of the acoustic data revealed both adult/child and longitudinal changes in the direction of the standard, in the children’s tendency towards a merger of two open vowels and a collapse of a long/short consonant contrast, neither of which exists in SG. There was some evidence that children, in comparison with adults, were beginning to develop both tenseness and rounding contrasts, which occur in SG but not in WCB. There were no observed changes to the pattern of opening and closing diphthongs, which differ markedly between the two varieties. Likewise, no changes were observed in the WCB front vowel that resulted historically from /l/-vocalization, for which articulatory data from a subset of the children were related to the acoustic measures. The general conclusion is that WCB change is most likely to occur as a consequence of exaggerating phonetic variation that already happens to be in the direction of the standard; internal factors motivated by general principles of vowel change might therefore play a more decisive role in inducing a shift than external factors such as dialect contact.

    Short-term accommodation of Hong Kong English speakers towards native English accents and the effect of language attitudes

    Accommodation, also known as convergence, refers to a process whereby a speaker changes the way he or she speaks to be more similar to another speaker. This dissertation focuses on two themes: language attitudes and short-term accommodation. A study using the matched-guise method is conducted to examine Hong Kong people’s attitudes towards British English, American English and Hong Kong English (henceforth HKE). Results suggest that, after the handover, British English is still rated as the most prestigious English variety in Hong Kong. HKE is also found to have a high level of acceptance in terms of social attractiveness. For short-term accommodation, two studies are conducted to investigate the phonetic convergence of HKE speakers towards native English accents, and the effect of language attitudes on convergence. In Study 2, a group of HKE speakers complete separate map tasks with a Received Pronunciation speaker and a General American English speaker. Their pronunciations of the THOUGHT vowel, the PATH vowel, rhoticity, fricative /z/ and fricative /θ/ are examined before, during and after the map tasks. The results suggest that the HKE speakers produce more fricative [z] and converge on rhoticity after exposure to the native accents. However, divergence is found on the PATH vowel and fricative /θ/, and maintenance is found on the THOUGHT vowel. These findings suggest that the HKE speakers tend to converge on the linguistic features which are more salient to them. Study 3 examines the effect of language attitudes on speech convergence, and no correlation is found between language attitudes and the HKE speakers’ convergence on rhoticity. Finally, the hybrid exemplar-based model is proposed to explain the complex results of the three studies. It provides a framework for speech accommodation which covers speech perception and production, and includes social factors as important elements in the model.

    Caractéristiques acoustiques des voyelles fermées tendues, relâchées et allongées en français québécois

    Honour roll of the Faculté des études supérieures et postdoctorales, 2013-2014.
    This study aims to describe acoustically the tense, lax and lengthened variants of the close vowels /i y u/ in Quebec French, which, under stress, occur in open syllables, in closed syllables, and in syllables closed by a lengthening consonant, respectively. A total of 1350 tokens extracted from the speech of 30 speakers from Rouyn-Noranda, Saguenay and Quebec City were analysed. Their duration was measured, and the fundamental frequency and the centre frequencies of the first three formants (F1, F2, F3) were then estimated at 25, 50 and 75% of that duration. Tense variants exhibit the lowest F1 values and lax variants the highest, with the lengthened variants falling in between. Over the course of the vowel, the tense and lengthened variants become tenser, whereas the lax variants centralize. The lengthened variants show the largest trajectories in the F1/F2 plane.
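
    The measurement scheme in the abstract above (estimates at 25, 50 and 75% of each vowel's duration) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the millisecond time base, the linear interpolation and the F1 track are all hypothetical and are not taken from the study.

```python
def sample_times(onset_ms, offset_ms, fractions=(0.25, 0.50, 0.75)):
    """Return the time points at the given fractions of a vowel's duration."""
    duration = offset_ms - onset_ms
    return [onset_ms + fraction * duration for fraction in fractions]


def interpolate(track, t):
    """Linearly interpolate a track [(time_ms, hz), ...] (sorted by time) at t."""
    for (t0, v0), (t1, v1) in zip(track, track[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("time lies outside the track")


# Made-up F1 track for one vowel lasting from 1000 ms to 1200 ms.
f1_track = [(1000, 300.0), (1100, 340.0), (1200, 360.0)]

times = sample_times(1000, 1200)
f1_samples = [interpolate(f1_track, t) for t in times]
print(times)       # [1050.0, 1100.0, 1150.0]
print(f1_samples)  # [320.0, 340.0, 350.0]
```

    Taking three points per vowel rather than a single midpoint value is what allows a trajectory in the F1/F2 plane to be described, as the abstract does for the lengthened variants.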

    Caractéristiques acoustiques des voyelles fermées tendues, relâchées et allongées en français québécois

    This study aims to describe acoustically the tense, lax and lengthened variants of the close vowels /i y u/ in Quebec French, which, under stress, occur in open syllables, in closed syllables, and in syllables closed by a lengthening consonant, respectively. A total of 1350 tokens extracted from the speech of 30 speakers from Rouyn-Noranda, Saguenay and Quebec City were analysed. Their duration was measured, and the fundamental frequency and the centre frequencies of the first three formants (F1, F2, F3) were then estimated at 25, 50 and 75% of that duration. Tense variants exhibit the lowest F1 and lax variants the highest, with the lengthened variants falling in between. Over the course of the vowel, the tense and lengthened variants become tenser, whereas the lax variants centralize. The lengthened variants show the largest trajectories in an F1/F2 plot.