14 research outputs found

Prosodic Event Recognition using Convolutional Neural Networks with Context Information

Author: Stehwien Sabrina
Vu Ngoc Thang
Publication date: 02/06/2017

This paper demonstrates the potential of convolutional neural networks (CNN) for detecting and classifying prosodic events on words, specifically pitch accents and phrase boundary tones, from frame-based acoustic features. Typical approaches use not only feature representations of the word in question but also its surrounding context. We show that adding position features indicating the current word benefits the CNN. In addition, this paper discusses the generalization from a speaker-dependent modelling approach to a speaker-independent setup. The proposed method is simple and efficient and yields strong results not only in speaker-dependent but also in speaker-independent cases. Comment: Interspeech 2017, 4 pages, 1 figure
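
As a rough illustration of the "position features indicating the current word" mentioned in the abstract, the sketch below appends one extra value to every frame of a context window: 1.0 for frames belonging to the current word and 0.0 for context frames. The binary encoding, the function name `add_position_feature`, and the toy frame values are assumptions made for illustration; the abstract does not specify how the authors encode position.

```python
def add_position_feature(frames, word_start, word_end):
    """Append a position indicator to each frame of a context window.

    frames: list of per-frame acoustic feature lists (toy values here).
    word_start, word_end: frame indices of the current word, end exclusive.
    The 0/1 encoding is an assumed, hypothetical choice.
    """
    return [
        frame + [1.0 if word_start <= i < word_end else 0.0]
        for i, frame in enumerate(frames)
    ]


# Four frames with two acoustic features each; the current word spans frames 1-2.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
augmented = add_position_feature(frames, word_start=1, word_end=3)
```

The augmented matrix has one more column than the input, so a network reading it can tell the word in question apart from its neighbours.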

arXiv.org e-Print Archive

Crossref

Exploring complex vowels as phrase break correlates in a corpus of English speech with ProPOSEL, a prosody and POS English lexicon

Author: Atwell E
Brierley C
Publication date: 01/01/2009

Real-world knowledge of syntax is seen as integral to the machine learning task of phrase break prediction but there is a deficiency of a priori knowledge of prosody in both rule-based and data-driven classifiers. Speech recognition has established that pauses affect vowel duration in preceding words. Based on the observation that complex vowels occur at rhythmic junctures in poetry, we run significance tests on a sample of transcribed, contemporary British English speech and find a statistically significant correlation between complex vowels and phrase breaks. The experiment depends on automatic text annotation via ProPOSEL, a prosody and part-of-speech English lexicon. Copyright © 2009 ISCA

White Rose Research Online

Leeds Beckett Repository

Gépi beszéd természetességének növelése automatikus, beszédjel alapú hangsúlycímkéző algoritmussal

Author: Beke András
Olaszy Gábor
Szaszák György
Tóth Bálint
Publication date: 01/01/2016

Making machine-generated speech sound as natural as possible remains an important research area. Among many other factors, prosody strongly influences how natural speech sounds, so a precisely annotated corpus, from which accurate generative models can be built by machine learning, is a basic requirement. Labelling a corpus by hand is costly and time-consuming, even for prosodic units and stress alone. Moreover, international experience confirms that the judgement of expert labellers is subjective: the overlap between stress annotations produced by different experts rarely exceeds 80%. For these reasons, automatic labelling procedures are often used. Stress labelling is most often performed on the text transcript, which, however, is less accurate than human annotation. As an alternative, this work implements a stress labelling algorithm based on the speech signal. To verify the resulting stress labels, we train six HMM-TTS systems (three male and three female) and compare them using subjective listening tests (CMOS).

University of Szeged

Complex vowels as boundary correlates in a multi-speaker corpus of spontaneous English speech

Author: Atwell ES
Brierley C
Publication date: 01/01/2010

We have found empirical evidence of a correlation in English between words containing complex vowels (diphthongs and triphthongs) and ‘gold-standard’ phrase break annotations in datasets as apparently different as seventeenth-century verse and a Reith lecture transcript on economics from the late twentieth century. Spontaneous speech in the form of BBC radio news reportage from the 1980s again exhibits this statistically significant correlation for five out of ten speakers, leading to speculation as to why speakers should fall into two distinct groups. The experiment depends on the automatic annotation of text with a priori knowledge from ProPOSEL, a prosody and part-of-speech English lexicon.

White Rose Research Online

Sound Pattern Matching for Automatic Prosodic Event Detection

Author: Asaei Afsaneh
Bourlard Hervé
Cernak Milos
Garner Philip N.
Honnet Pierre-Edouard Jean Charles
Publication venue: Idiap
Publication date: 19/04/2016

Prosody in speech is manifested by variations of loudness, exaggeration of pitch, and specific phonetic variations of prosodic segments. For example, stressed and unstressed syllables differ in place or manner of articulation, vowels in unstressed syllables may have a more central articulation, and vowel reduction may occur when a vowel changes from a stressed to an unstressed position. In this paper, we characterize the sound patterns using phonological posteriors to capture the phonetic variations in a concise manner. The phonological posteriors quantify the posterior probabilities of the phonological features given the input speech acoustics, and they are obtained using a deep neural network (DNN). Built on the assumption that there are unique sound patterns in different prosodic segments, we devise a sound pattern matching (SPM) method based on a 1-nearest-neighbour classifier. In this work, we focus on automatic detection of prosodic stress placed on words, also called emphasized words. We evaluate the SPM method on English and French data with emphasized words. Word emphasis detection also works very well in cross-lingual tests, that is, using a French classifier on English data, and vice versa.
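
The 1-nearest-neighbour matching described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the Euclidean distance, the function name `spm_classify`, the three-dimensional toy "posterior" vectors, and the labels are all hypothetical, since the abstract does not state the distance measure or the feature dimensionality.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spm_classify(query, templates):
    """Return the label of the stored pattern closest to `query` (1-NN)."""
    best_label, best_dist = None, float("inf")
    for pattern, label in templates:
        dist = euclidean(query, pattern)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy posterior vectors over three hypothetical phonological features.
templates = [
    ([0.9, 0.1, 0.8], "emphasized"),
    ([0.2, 0.7, 0.1], "neutral"),
]
label = spm_classify([0.8, 0.2, 0.7], templates)
```

Because a 1-nearest-neighbour classifier only stores labelled patterns and compares against them, templates collected in one language can be applied unchanged to vectors from another, provided both use the same feature set.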

Infoscience - École polytechnique fédérale de Lausanne

Sound Pattern Matching for Automatic Prosodic Event Detection

Author: Asaei Afsaneh
Bourlard Hervé
Cernak Milos
Garner Philip N.
Honnet Pierre-Edouard
Publication date: 19/06/2016

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Acoustic identification of sentence accent in speakers with dysarthria: cross-population validation and severity related patterns

Author: De Bodt Marc
Hernandez-Diaz Huici Maria Esperanza
Kairuz Hernandez-Diaz Hector A.
Lowit Anja
Mendoza Ramos Viviana
Van den Steen Leen
Van Nuffelen Gwen
Publication venue: MDPI AG
Publication date: 01/01/2021

Dysprosody is a hallmark of dysarthria, which can affect the intelligibility and naturalness of speech. This includes sentence accent, which helps to draw listeners’ attention to important information in the message. Although some studies have investigated this feature, we currently lack properly validated automated procedures that can distinguish between subtle performance differences observed across speakers with dysarthria. This study aims for cross-population validation of a set of acoustic features that have previously been shown to correlate with sentence accent. In addition, the impact of dysarthria severity levels on sentence accent production is investigated. Two groups of adults were analysed (Dutch and English speakers). Fifty-eight participants with dysarthria and 30 healthy control participants (HCP) produced sentences with varying accent positions. All speech samples were evaluated perceptually and analysed acoustically with an algorithm that extracts ten meaningful prosodic features and allows a classification between accented and unaccented syllables based on a linear combination of these parameters. The data were statistically analysed using discriminant analysis. Within the Dutch and English dysarthric populations, the algorithm correctly identified 82.8% and 91.9% of the accented target syllables, respectively, indicating that the capacity to discriminate between accented and unaccented syllables in a sentence is consistent with perceptual impressions. Moreover, different strategies for accent production across dysarthria severity levels could be demonstrated, which is an important step toward a better understanding of the nature of the deficit and the automatic classification of dysarthria severity using prosodic features.

University of Strathclyde Institutional Repository

Directory of Open Access Journals

Institutional Repository Universiteit Antwerpen

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Automatic Prosodic Event Detection Using Acoustic, Lexical, and Syntactic Evidence

Author: Sankaranarayanan Ananthakrishnan
Shrikanth S. Narayanan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Euskararen bariazioa eta bariazioaren irakaskuntza - II

Author: Ensunza Aldamizetxebarria Ariane
Iglesias Chaves Aitor
Romero Andonegui Asier
Publication venue: Servicio Editorial de la Universidad del País Vasco/Euskal Herriko Unibertsitatearen Argitalpen Zerbitzua
Publication date: 01/01/2016

144 p. Contents:
- Introduction (Iglesias, A.; Romero, A.; Ensunza, A.)
- Transcribing intonation with ETI_TOBI (Elvira-García, W.)
- Diatech tresna informatikoaren bertsio berria (Aurrekoetxea, G.; Iglesias, A.; Santander, G.; Usobiaga, I.)
- Euskararen bariazio geo-morfologiaren azterketa (Videgain, X.; Aurrekoetxea, G.)
- Subject pronoun expression in Basque: description and pedagogical implications (Sainz-Maza, L.; Rodríguez, I.)
- Bariazio prosodikoaren eragina testu irakurrien ulermenean (Etxebarria, A.; Romero, A.; Gaminde, I.; Garay, U.)
- Dos décadas de dialectometría entonativa (Roseano, P.)
- Euskaldun berrien eta zaharren ulermena jarreren inguruan (Asensio, N.; Barrios, H.; Lázaro, A.; Sáez, I.)
- Hizkuntza-aldaketaren inguruko jarrera eta prestigioaz (Ensunza, A.)
- Keinuen eta bokalizazioen arteko harremanak komunikazio-funtzio goiztiarretan: ELAN softwarea erabiliz (Romero, A.; Etxebarria, A.; De Pablo, I.; Sanz, A.)
- Hizkuntza aldakortasuna Larrabetzuko aditz morfologian (Etxebarria, A.; Gaminde, I.; Olalde, A.; Gaminde, U.)
- Bizkaiko aditz laguntzaileen bilakaeraren azterketaz (Gaminde, I.; Romero, A.; Etxebarria, A.; Eguskiza, N.)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación

by

Author: Mohammed Ehsan Hoque
Pattie Maes
Rosalind W. Picard

CiteSeerX