6 research outputs found

    Word informativeness and automatic pitch accent modeling

    Get PDF
    To appear in Proc. of EMNLP/VLC, 1999. In intonational phonology and speech synthesis research, it has been suggested that the relative informativeness of a word can be used to predict pitch prominence. The more information conveyed by a word, the more likely it will be accented. But there are others who express doubts about such a correlation. In this paper, we provide some empirical evidence to support the existence of such a correlation by employing two widely accepted measures of informativeness. Our experiments show that there is a positive correlation between the informativeness of a word and its pitch accent assignment. They also show that informativeness enables statistically significant improvements in pitch accent prediction. The computation of word informativeness is inexpensive and can be incorporated into speech synthesis systems easily.

    The Pitch Range of Italians and Americans. A Comparative Study

    Get PDF
    Linguistic experiments have investigated the nature of F0 span and level in cross-linguistic comparisons. However, only few studies have focused on the elaboration of a general-agreed methodology that may provide a unifying approach to the analysis of pitch range (Ladd, 1996; Patterson and Ladd, 1999; Daly and Warren, 2001; Bishop and Keating, 2010; Mennen et al. 2012). Pitch variation is used in different languages to convey different linguistic and paralinguistic meanings that may range from the expression of sentence modality to the marking of emotional and attitudinal nuances (Grice and Baumann, 2007). A number of factors have to be taken into consideration when determining the existence of measurable and reliable differences in pitch values. Daly and Warren (2001) demonstrated the importance of some independent variables such as language, age, body size, speaker sex (female vs. male), socio-cultural background, regional accents, speech task (read sentences vs. spontaneous dialogues), sentence type (questions vs. statements) and measure scales (Hertz, semitones, ERB etc.). Coherently with the model proposed by Mennen et al. (2012), my analysis of pitch range is based on the investigation of LTD (long-term distributional) and linguistic measures. LTD measures deal with the F0 distribution within a speaker’s contour (e.g. F0 minimum, F0 maximum, F0 mean, F0 median, standard deviation, F0 span) while linguistic measures are linked to specific targets within the contour, such as peaks and valleys (e.g. high and low landmarks) and preserve the temporal sequences of pitch contours. This investigation analyzed the characteristics of pitch range production and perception in English sentences uttered by Americans and Italians. Four experiments were conducted to examine different phenomena: i) the contrast between measures of F0 level and span in utterances produced by Americans and Italians (experiments 1-2); ii) the contrast between the pitch range produced by males and females in L1 and L2 (experiment 1); iii) the F0 patterns in different sentence types, that is, yes-no questions, wh-questions, and exclamations (experiment 2); iv) listeners’ evaluations of pitch span in terms of ±interesting, ±excited, ±credible, ±friendly ratings of different sentence types (experiments 3-4); v) the correlation between pitch span of the sentences and the evaluations given by American and Italian listeners (experiment 3); vi) the listeners’ evaluations of pitch span values in manipulated stimuli, whose F0 span was re-synthesized under three conditions: narrow span, original span, and wide span (experiment 4); vii) the different evaluations given to the sentences by male and female listeners. The results of this investigation supported the following generalizations. First, pitch span more than level was found to be a cue for non-nativeness, because L2 speakers of English used a narrower span, compared to the native norm. What is more, the experimental data in the production studies indicated that the mode of sentences was better captured by F0 span than level. Second, the Italian learners of English were influenced by their L1 and transferred L1 pitch range variation into their L2. The English sentences produced by the Italians had overall higher pitch levels and narrower pitch span than those produced by the Americans. In addition, the Italians used overall higher pitch levels when speaking Italian and lower levels when speaking English. Conversely, their pitch span was generally higher in English and lower in Italian. When comparing productions in English, the Italian females used higher F0 levels than the American females; vice versa, the Italian males showed slightly lower F0 levels than the American males. Third, there was a systematic relation between pitch span values and the listeners’ evaluations of the sentences. The two groups of listeners (the Americans and the Italians) rated the stimuli with larger pitch span as more interesting, exciting and credible than the stimuli with narrower pitch span. Thus, the listeners relied on the perceived pitch span to differentiate among the stimuli. Fourth, both the American and the Italian speakers were considered more friendly when the pitch span of their sentences was widened (wide span manipulation) and less friendly when the pitch span was narrowed (narrow span manipulation). This happened in all the stimuli regardless of the native language of the speakers (American vs. Italian)

    The semantic transparency of English compound nouns

    Get PDF
    What is semantic transparency, why is it important, and which factors play a role in its assessment? This work approaches these questions by investigating English compound nouns. The first part of the book gives an overview of semantic transparency in the analysis of compound nouns, discussing its role in models of morphological processing and differentiating it from related notions. After a chapter on the semantic analysis of complex nominals, it closes with a chapter on previous attempts to model semantic transparency. The second part introduces new empirical work on semantic transparency, introducing two different sets of statistical models for compound transparency. In particular, two semantic factors were explored: the semantic relations holding between compound constituents and the role of different readings of the constituents and the whole compound, operationalized in terms of meaning shifts and in terms of the distribution of specifc readings across constituent families. All semantic annotations used in the book are freely available

    The semantic transparency of English compound nouns

    Get PDF
    What is semantic transparency, why is it important, and which factors play a role in its assessment? This work approaches these questions by investigating English compound nouns. The first part of the book gives an overview of semantic transparency in the analysis of compound nouns, discussing its role in models of morphological processing and differentiating it from related notions. After a chapter on the semantic analysis of complex nominals, it closes with a chapter on previous attempts to model semantic transparency. The second part introduces new empirical work on semantic transparency, introducing two different sets of statistical models for compound transparency. In particular, two semantic factors were explored: the semantic relations holding between compound constituents and the role of different readings of the constituents and the whole compound, operationalized in terms of meaning shifts and in terms of the distribution of specifc readings across constituent families

    Information structure and the prosodic structure of English : a probabilistic relationship

    Get PDF
    This work concerns how information structure is signalled prosodically in English, that is, how prosodic prominence and phrasing are used to indicate the salience and organisation of information in relation to a discourse model. It has been standardly held that information structure is primarily signalled by the distribution of pitch accents within syntax structure, as well as intonation event type. However, we argue that these claims underestimate the importance, and richness, of metrical prosodic structure and its role in signalling information structure. We advance a new theory, that information structure is a strong constraint on the mapping of words onto metrical prosodic structure. We show that focus (kontrast) aligns with nuclear prominence, while other accents are not usually directly 'meaningful'. Information units (theme/rheme) try to align with prosodic phrases. This mapping is probabilistic, so it is also influenced by lexical and syntactic effects, as well as rhythmical constraints and other features including emphasis. Rather than being directly signalled by the prosody, the likelihood of each information structure interpretation is mediated by all these properties. We demonstrate that this theory resolves problematic facts about accent distribution in earlier accounts and makes syntactic focus projection rules unnecessary. Previous theories have claimed that contrastive accents are marked by a categorically distinct accent type to other focal accents (e.g. L+H* v H*). We show this distinction in fact involves two separate semantic properties: contrastiveness and theme/rheme status. Contrastiveness is marked by increased prominence in general. Themes are distinguished from rhemes by relative prominence, i.e. the rheme kontrast aligns with nuclear prominence at the level of phrasing that includes both theme and rheme units. In a series of production and perception experiments, we directly test our theory against previous accounts, showing that the only consistent cue to the distinction between theme and rheme nuclear accents is relative pitch height. This height difference accords with our understanding of the marking of nuclear prominence: theme peaks are only lower than rheme peaks in rheme-theme order, consistent with post-nuclear lowering; in theme-rheme order, the last of equal peaks is perceived as nuclear. The rest of the thesis involves analysis of a portion of the Switchboard corpus which we have annotated with substantial new layers of semantic (kontrast) and prosodic features, which are described. This work is an essentially novel approach to testing discourse semantics theories in speech. Using multiple regression analysis, we demonstrate distributional properties of the corpus consistent with our claims. Plain and nuclear accents are best distinguished by phrasal features, showing the strong constraint of phrase structure on the perception of prominence. Nuclear accents can be reliably predicted by semantic/syntactic features, particularly kontrast, while other accents cannot. Plain accents can only be identified well by acoustic features, showing their appearance is linked to rhythmical and low-level semantic features. We further show that kontrast is not only more likely in nuclear position, but also if a word is more structurally or acoustically prominent than expected given its syntactic/information status properties. Consistent with our claim that nuclear accents are distinctive, we show that pre-, post- and nuclear accents have different acoustic profiles; and that the acoustic correlates of increased prominence vary by accent type, i.e. pre-nuclear or nuclear. Finally, we demonstrate the efficacy of our theory compared to previous accounts using examples from the corpus
    corecore