47 research outputs found

Speech vocoding for laboratory phonology

Author: Benus Stefan
Cernak Milos
Lazaridis Alexandros
Publication venue: Elsevier BV
Publication date: 19/05/2015

Using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing, and in broader terms, for exploring relations between the abstract and physical structures of a speech signal. Our goal is to make a step towards bridging phonology and speech processing and to contribute to the program of Laboratory Phonology. We show three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems and an experimental phonological parametric text-to-speech (TTS) system. The featural representations of the following three phonological systems are considered in this work: (i) Government Phonology (GP), (ii) the Sound Pattern of English (SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded speech, we conclude that the latter achieves slightly better results than the former. However, GP - the most compact phonological speech representation - performs comparably to the systems with a higher number of phonological features. The parametric TTS based on phonological speech representation, and trained from an unlabelled audiobook in an unsupervised manner, achieves intelligibility of 85% of the state-of-the-art parametric speech synthesis. We envision that the presented approach paves the way for researchers in both fields to form meaningful hypotheses that are explicitly testable using the concepts developed and exemplified in this paper. On the one hand, laboratory phonologists might test the applied concepts of their theoretical models, and on the other hand, the speech processing community may utilize the concepts developed for the theoretical phonological models for improvements of the current state-of-the-art applications.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Gestural coordination and the distribution of English 'geminates'

Author: Benus Stefan
Gafos Adamantios
Smorodinsky Iris
Publication venue: ScholarlyCommons
Publication date: 01/01/2004

ScholarlyCommons@Penn

The Prosody of Backchannels in American English

Author: Benus Stefan
Gravano Agustin
Hirschberg Julia Bell
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2007

We examine prosodic and contextual factors characterizing the backchannel function of single affirmative words. Data is drawn from collaborative task-oriented dialogues between speakers of Standard American English. Despite high lexical variability, backchannels are prosodically well defined: they have higher pitch and intensity and greater pitch slope than affirmative words expressing other pragmatic functions. Additionally, we identify phrase-final rising pitch as a salient trigger for backchanneling.

Columbia University Academic Commons

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

Pauses in Deceptive Speech

Author: Benus Stefan
Enos Frank
Hirschberg Julia Bell
Shriberg Elizabeth
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2006

We use a corpus of spontaneous interview speech to investigate the relationship between the distributional and prosodic characteristics of silent and filled pauses and the intent of an interviewee to deceive an interviewer. Our data suggest that the use of pauses correlates more with truthful than with deceptive speech, and that prosodic features extracted from filled pauses themselves as well as features describing contextual prosodic information in the vicinity of filled pauses may facilitate the detection of deceit in speech.

CiteSeerX

Columbia University Academic Commons

On the Role of Context and Prosody in the Interpretation of ‘Okay’

Author: Benus Stefan
Chavez Hector
Gravano Agustin
Hirschberg Julia Bell
Wilcox Lauren
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2007

We examine the effect of contextual and acoustic cues in the disambiguation of three discourse-pragmatic functions of the word okay. Results of a perception study show that contextual cues are stronger predictors of discourse function than acoustic cues. However, acoustic features capturing the pitch excursion at the right edge of okay feature prominently in disambiguation, whether other contextual cues are present or not.

CiteSeerX

Columbia University Academic Commons

Modeling Accentual Phrase Intonation in Slovak and Hungarian

Author: Benus Stefan
Mády Katalin
Reichel Uwe D.
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/2014

According to Jun and Fletcher (2014), languages with fixed lexical stress towards the edge of the word often include accentual phrases (AP) as a structural prosodic unit between the Prosodic Word (PrWd) and the Intermediate Phrase (ip). APs also tend to show a stable recurrent F0 pattern in various contexts. Slovak and Hungarian both have fixed word-initial lexical stress, and we test the hypothesis that APs are consistently marked with stable F0 contours, which is a precondition for their relevance in the intonational phonologies of the two languages. We employ linear and second-order polynomial stylizations of F0 throughout putative APs and intonation phrases (IPs) in a corpus of spontaneous utterances in Slovak and Hungarian from collaborative dialogues. The results show that these putative APs have consistent F0 contour patterns that are differentiated from the IP pattern in both languages: the Hungarian ones fall, while the Slovak ones rise before they fall.

Acoustic profiles for prosodic headedness and constituency

Author: Benus Stefan
Mády Katalin
Reichel Uwe D.
Publication venue: International Speech Communication Association
Publication date: 01/01/2018

Crossref

Repository of the Academy's Library

Characterizing second language fluency with global wavelet spectrum

Author: Benus Stefan
Kallio Heini
Suni Antti
Šimko Juraj
Publication venue: Australasian Speech Science and Technology Association Inc.
Publication date: 01/01/2019

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Harmony is myopic: Reply to Walker 2010

Author: Archangeli Diana
Benus Stefan
Bloomfield Leonard
Blumenfeld Lev
Cole Jennifer
Gafos Adamantios I
Kaun Abigail
McCarthy John J
Milligan Marianne
Nevins Andrew
Rhodes Russell
Sebeok Thomas A
Steriade Donca
Walker Rachel
Wendell Kimper
Wilson Colin
Publication venue: MIT Press - Journals
Publication date: 01/03/2012

Crossref

The University of Manchester - Institutional Repository