Comparing parameterizations of pitch register and its discontinuities at prosodic boundaries for Hungarian
We examined how well prosodic boundary strength can be captured by two declination stylization methods as well as by four different representations of pitch register. In the stylization proposed by Lieberman et al. (1985), base- and topline are fitted to peaks and valleys of the pitch contour, whereas in Reichel & Mády (2013) these lines are fitted to medians below and above certain pitch percentiles. From each of the stylizations, four feature pools were induced representing different aspects of register discontinuity at word boundaries: discontinuities related to the base-, mid-, and topline, as well as to the range between base- and topline. Concerning stylization, the median-based fitting approach turned out to be more robust with respect to declination line crossing errors and yielded base-, topline- and range-related discontinuity characteristics with higher correlations to perceived boundary strength. Concerning register representation, for the peak/valley fitting approach the base- and topline patterns showed weaker correspondences to boundary strength than the other feature pools. We furthermore trained generalized linear regression models for boundary strength prediction on each feature pool. It turned out that neither the stylization method nor the register representation had a significant influence on the overall good prediction performance.
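The median-based fitting idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window length, the percentile thresholds (20 and 80), the function names `fit_line`, `register_lines` and `register_at`, and the definition of the midline as the mean of base- and topline are all assumptions made for illustration.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def register_lines(times, f0, win=5, lo_pct=20, hi_pct=80):
    """Fit base- and topline to per-window medians of low and high F0 values.

    win, lo_pct and hi_pct are illustrative defaults, not values from the paper.
    Returns ((base_slope, base_intercept), (top_slope, top_intercept)).
    """
    t_pts, base_pts, top_pts = [], [], []
    for start in range(0, len(f0) - win + 1, win):
        chunk = sorted(f0[start:start + win])
        # Number of samples at or below lo_pct / at or above hi_pct (at least one).
        k_lo = max(1, len(chunk) * lo_pct // 100)
        k_hi = max(1, len(chunk) * (100 - hi_pct) // 100)
        base_pts.append(median(chunk[:k_lo]))
        top_pts.append(median(chunk[-k_hi:]))
        t_pts.append(median(times[start:start + win]))
    return fit_line(t_pts, base_pts), fit_line(t_pts, top_pts)


def register_at(t, base, top):
    """Evaluate base-, mid-, topline and range at time t (midline = mean, assumed)."""
    b = base[0] * t + base[1]
    u = top[0] * t + top[1]
    return {"base": b, "mid": (b + u) / 2, "top": u, "range": u - b}
```

Under these assumptions, a discontinuity feature at a word boundary could be obtained by fitting the lines separately before and after the boundary and comparing the `register_at` values at the boundary time.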
Modeling Accentual Phrase Intonation in Slovak and Hungarian
According to Jun and Fletcher (2014), languages with fixed lexical stress towards the edge of the word often include accentual phrases (AP) as a structural prosodic unit between the Prosodic Word (PrWd) and the Intermediate Phrase (ip). APs also tend to show a stable recurrent F0 pattern in various contexts. Slovak and Hungarian both have fixed word-initial lexical stress, and we test the hypothesis that APs are consistently marked with stable F0 contours, which is a precondition for their relevance in the intonational phonologies of the two languages. We employ linear and second-order polynomial stylizations of F0 throughout putative APs and intonation phrases (IPs) in a corpus of spontaneous utterances in Slovak and Hungarian from collaborative dialogues. The results show that these putative APs have consistent F0 contour patterns that are differentiated from the IP pattern in both languages: the Hungarian ones fall, while the Slovak ones rise before they fall.
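As a rough illustration of linear and second-order polynomial F0 stylization, the following sketch fits a least-squares polynomial to an F0 contour over time normalized to [-1, 1]. The time normalization, the function names `polyfit` and `stylize`, and the solver are assumptions for illustration; the abstract does not specify these details.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...], lowest order first.
    """
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(a[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (b[i] - s) / a[i][i]
    return coef


def stylize(f0, degree):
    """Fit a polynomial of the given degree to an F0 contour (at least 2 samples).

    Time is normalized to [-1, 1] (an assumption), so contours of different
    durations yield comparable coefficients.
    """
    n = len(f0)
    ts = [2 * i / (n - 1) - 1 for i in range(n)]
    return polyfit(ts, f0, degree)
```

With this parameterization, a negative linear coefficient corresponds to a falling contour, and a negative quadratic coefficient corresponds to a rise-fall shape.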
Veracity Computing from Lexical Cues and Perceived Certainty Trends
We present a data-driven method for determining the veracity of a set of
rumorous claims on social media data. Tweets from different sources pertaining
to a rumor are processed on three levels: first, factuality values are assigned
to each tweet based on four textual cue categories relevant for our journalism
use case; these amalgamate speaker support in terms of polarity and commitment
in terms of certainty and speculation. Next, the proportions of these lexical
cues are utilized as predictors for tweet certainty in a generalized linear
regression model. Subsequently, lexical cue proportions, predicted certainty,
as well as their time course characteristics are used to compute veracity for
each rumor in terms of the identity of the rumor-resolving tweet and its binary
resolution value judgment. The system operates without access to
extralinguistic resources. Evaluated on the data portion for which hand-labeled
examples were available, it achieves .74 F1-score on identifying rumor
resolving tweets and .76 F1-score on predicting if a rumor is resolved as true
or false.
Comment: to appear in: Proc. 2nd Workshop on Noisy User-generated Text, Osaka,
Japan, 201
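The first processing level of the veracity system, lexical cue proportions per tweet, can be sketched as follows. The category names and word lists in `CUES` are hypothetical placeholders rather than the lexicons used in the paper, and the tokenizer is deliberately simplistic.

```python
import re

# Hypothetical cue lexicons: category names and words are placeholders only.
CUES = {
    "support": {"confirmed", "true", "yes"},
    "denial": {"false", "fake", "not"},
    "certainty": {"definitely", "officially", "certainly"},
    "speculation": {"maybe", "reportedly", "allegedly"},
}


def cue_proportions(tweet):
    """Return, per cue category, the share of tokens that match its lexicon."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    if not tokens:
        return {name: 0.0 for name in CUES}
    return {
        name: sum(tok in words for tok in tokens) / len(tokens)
        for name, words in CUES.items()
    }
```

Proportions of this kind could then serve as predictors in a regression model for tweet certainty, as the abstract describes.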
Speech Recognition
Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems, and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes.