51 research outputs found

Estimating Speaking Rate by Means of Rhythmicity Parameters

Authors: Heinrich Christian; Schiel Florian
Publication date: 01/01/2011

In this paper we present a speech rate estimator based on so-called rhythmicity features derived from a modified version of the short-time energy envelope. To evaluate the new method, it is compared to a traditional speech rate estimator on the basis of semi-automatic segmentation. Speech material from the Alcohol Language Corpus (ALC), covering intoxicated and sober speech in different speech styles, provides a statistically sound foundation for testing. The proposed measure clearly correlates with the semi-automatically determined speech rate and seems to be robust across speech styles and speaker states.

CiteSeerX

Open Access LMU

Finding the Most Uniform Changes in Vowel Polygon Caused by Psychological Stress

Authors: Sigmund M.; Stanek M.
Publication venue: Brno University of Technology
Publication date: 01/06/2015

Vowel polygons, specifically their parameters, are chosen as the criterion for capturing differences between a speaker's normal state and the same speaker's speech under real psychological stress. All results were obtained experimentally with software created for vowel polygon analysis, applied to the ExamStress database. Six selected methods based on cross-correlation of different features were classified by the coefficient of variation, and for each individual vowel polygon the efficiency coefficient marking the most significant and uniform differences between stressed and normal speech was calculated. The best method for observing the resulting differences was the one that takes the mean of the cross-correlation values obtained for the difference area value together with the vector length and angle parameter pairs. In general, the best results for stress detection are achieved by the /i/-/o/-/u/ and /a/-/i/-/o/ vowel triangles in formant planes containing the fifth formant F5 combined with other formants.

Directory of Open Access Journals

Digital library of Brno University of Technology

A Speech Feature Vector based on its Maximum Phase Component

Authors: Feely Stephen; O'Kelly Jackie; Lysaght Thomas; Timoney Joseph
Publication date: 26/06/2001

This paper examines the performance of a vowel classification scheme using a new form of feature vector derived from a decomposition of the speech segment into Maximum Phase and Minimum Phase components. The approach is first justified in terms of its perceptual relevance, followed by a signal processing scheme to obtain the components. The form of the feature vector is then discussed. Lastly, experimental work compares the performance of this new feature vector under a variety of distortion conditions with the currently popular choice of Mel-Frequency Cepstral Coefficients.

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Probabilistic Analysis of Pronunciation with "MAUS"

Authors: Alexiadou Artemis; Kipp Andreas; Schiel Florian
Publication date: 01/01/1997

Open Access LMU

A Parallel Recurrent Neural Network for Language Modeling with POS Tags

Authors: Guo Yuhang; Huang Heyan; Shi Shumin; Su Chao; Wu Hao
Publication venue: the National University (Philippines)
Publication date: 01/01/2017

Waseda University Repository

Language Model Adaptation for Statistical Machine Translation with Structured Query Models

Authors: Eck Matthias; Vogel Stephan; Zhao Bing
Publication venue: Association for Computational Linguistics
Publication date: 03/01/2024

We explore unsupervised language model adaptation techniques for Statistical Machine Translation. The hypotheses from the machine translation output are converted into queries at different levels of representation power and used to extract similar sentences from a very large monolingual text collection. Specific language models are then built from the retrieved data and interpolated with a general background model. Experiments show significant improvements when translating with these adapted language models.
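
The interpolation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes plain unigram models and a fixed linear interpolation weight `lam`; the abstract specifies neither the model order nor the weighting scheme, and the function names and toy sentences below are hypothetical.

```python
from collections import Counter


def unigram_model(sentences):
    """Maximum-likelihood unigram probabilities from whitespace-tokenized sentences."""
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def interpolate(adapted, background, lam):
    """Linear interpolation: lam * adapted + (1 - lam) * background.

    lam is an assumed, hand-picked weight in [0, 1]; in practice it would be tuned.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    vocab = set(adapted) | set(background)
    return {
        w: lam * adapted.get(w, 0.0) + (1.0 - lam) * background.get(w, 0.0)
        for w in vocab
    }


# Toy data (hypothetical): a general corpus and sentences retrieved for one query.
background = unigram_model(["the cat sat", "the dog ran"])
adapted = unigram_model(["the cat"])
mixed = interpolate(adapted, background, lam=0.5)
```

In this toy setup, words that occur in the retrieved sentences gain probability mass relative to the background model, while the mixture still sums to one.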

KITopen

Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing

Authors: Delgado Héctor; Evans Nicholas; Kinnunen Tomi; Lee Kong Aik; Nautsch Andreas; Sahidullah Md; Todisco Massimiliano; Wang Xin; Yamagishi Junichi
Publication date: 11/06/2021

Whether it be for results summarization or the analysis of classifier fusion, some means to compare different classifiers can often provide illuminating insight into their behaviour, (dis)similarity or complementarity. We propose a simple method to derive a 2D representation from detection scores produced by an arbitrary set of binary classifiers in response to a common dataset. Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores and with close relation to receiver operating characteristic (ROC) and detection error trade-off (DET) analyses. While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems. The former are produced by a Gaussian mixture model system trained with VoxCeleb data whereas the latter stem from submissions to the ASVspoof 2019 challenge. Comment: Accepted to Interspeech 2021. Example code available at https://github.com/asvspoof-challenge/classifier-adjacenc
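
As a rough illustration of comparing classifiers through rank correlation, the sketch below computes pairwise Kendall tau-a values between score lists and turns them into distances. This is an assumption-laden stand-in, not the authors' method: the abstract does not say which rank correlation is used or how the 2D layout is obtained, the embedding step is omitted here, and all names and scores are hypothetical.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau-a between two equal-length score lists (tied pairs count as zero)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


def rank_distances(scores):
    """Map each classifier pair to a distance in [0, 1]: 0 = same ranking, 1 = reversed."""
    return {
        (p, q): (1.0 - kendall_tau(scores[p], scores[q])) / 2.0
        for p in scores
        for q in scores
    }


# Hypothetical detection scores of three classifiers on the same four trials.
scores = {
    "A": [0.1, 0.4, 0.35, 0.8],
    "B": [1.0, 4.0, 3.0, 9.0],   # different scale, same ranking as A
    "C": [0.9, 0.2, 0.3, 0.1],   # reversed ranking relative to A
}
distances = rank_distances(scores)
```

Because only the ordering of scores matters, classifiers A and B end up at distance zero despite their different score scales. A distance matrix of this kind could then be passed to a dimensionality-reduction step to obtain a 2D layout.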

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Optimizing spectral feature based text-Independent speaker recognition

Author: Kinnunen Tomi H.
Publication venue: University of Joensuu

UEF Electronic Publications

English-Latvian SMT: the challenge of translating into a free word order language

Authors: Bralitis Edgar; Khalilov Maxim; Pretkalnina Lauma; Rodríguez Fonollosa José Adrián; Skadina Inguna
Publication date: 01/01/2010

This paper presents a comparative study of two approaches to statistical machine translation (SMT) and their application to English-to-Latvian translation, which is still an open research line in the field of automatic translation. We consider a state-of-the-art phrase-based SMT system and an alternative N-gram-based SMT system. The major differences between these two approaches lie in the distinct representations of bilingual units, which are the components of the bilingual model driving the translation process, and in the statistical modeling of the translation context. Because Latvian has rather free word order, it poses additional difficulties for the translation process. We contrast different reordering models and investigate how well they deal with the word ordering issue. Moving beyond the automatic scores of translation quality that are classically presented in MT research papers, we contribute a manual error analysis of MT system output that helps to shed light on the advantages and disadvantages of the SMT systems under consideration and to identify the most prominent sources of errors typical for both SMT systems.

UPCommons. Portal del coneixement obert de la UPC

International Migration, Integration and Social Cohesion online publications