
    The role of input frequency in the acquisition of Spanish phonology as an L1: a corpus-based study

    This study presents the phonological system exhibited by children (n=59) aged 3;0 to 6;0 and focuses on the role of input frequency. Using a spontaneous child speech corpus of Spanish (CHIEDE) as a data source, together with computational processing techniques (including an automatic phonological transcriber), data relating to the phonological level were retrieved. This resulted in a phonological inventory of Spanish-speaking children, ordered by frequency of use, which may serve as a model for research on typical and atypical child language development. Additionally, the stability of the participants' phonological systems was studied by calculating the variability displayed by the different age groups, and the outcomes were compared with other similar corpora. Comparison of the children's phonological inventory with the adult one shows a relationship between frequency of use in adult speech and the order in which phonemes are acquired.
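    A minimal sketch of how a frequency-ordered phoneme inventory can be computed once utterances have been reduced to phoneme sequences. The toy data and the phoneme_inventory helper below are illustrative assumptions, not the CHIEDE pipeline or the authors' automatic transcriber.

        # Build a frequency-ordered phoneme inventory from phonologically
        # transcribed utterances (each utterance is a list of phoneme symbols).
        # Toy data only; the real study derives sequences with an automatic
        # phonological transcriber applied to the CHIEDE corpus.
        from collections import Counter

        def phoneme_inventory(transcribed_utterances):
            """Return (phoneme, relative_frequency) pairs, most frequent first."""
            counts = Counter()
            for phonemes in transcribed_utterances:
                counts.update(phonemes)
            total = sum(counts.values())
            return [(ph, n / total) for ph, n in counts.most_common()]

        # Hypothetical child and adult samples for illustration
        child_speech = [["m", "a", "m", "a"], ["p", "a", "p", "a"]]
        adult_speech = [["m", "a", "s", "a"], ["k", "a", "s", "a"]]

        child_inv = phoneme_inventory(child_speech)
        adult_inv = phoneme_inventory(adult_speech)
        print(child_inv)  # e.g. [('a', 0.5), ('m', 0.25), ('p', 0.25)]

    Comparing the two rankings (e.g. by rank correlation) is one simple way to relate frequency of use in adult speech to the order of acquisition in child speech.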

    Dealing with linguistic mismatches for automatic speech recognition

    Recent breakthroughs in automatic speech recognition (ASR) have resulted in a word error rate (WER) on par with human transcribers on the English Switchboard benchmark. However, dealing with linguistic mismatches between the training and testing data remains a significant, unsolved challenge. In the monolingual setting, it is well known that the performance of ASR systems degrades significantly when they are presented with speech from speakers whose accents, dialects, and speaking styles differ from those encountered during system training. In the multilingual setting, ASR systems trained on a source language perform even worse when tested on another target language because of mismatches in the number of phonemes, in lexical ambiguity, and in the power of the phonotactic constraints provided by phone-level n-grams. To address these linguistic mismatches in current ASR systems, my dissertation investigates both knowledge-gnostic and knowledge-agnostic solutions. In the first part, classic theories from acoustics and articulatory phonetics that can be transferred across a dialect continuum, from local dialects to a standardized language, are revisited. Experiments demonstrate the potential of acoustic correlates in the vicinity of landmarks to bridge mismatches across different local or global varieties in a dialect continuum. In the second part, we design an end-to-end acoustic modeling approach based on the connectionist temporal classification loss and propose to link the training of acoustics and accent in a manner similar to the learning process in human speech perception. This joint model not only performed well on ASR with multiple accents but also boosted the accuracy of accent identification in comparison to separately trained models.
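    A hedged sketch of the kind of joint objective the abstract describes: a CTC acoustic model trained together with an utterance-level accent classifier. The architecture, dimensions, and loss weighting below are assumptions chosen for illustration (PyTorch), not the dissertation's actual model.

        # Joint CTC acoustic model + accent identification head (multi-task).
        # All sizes and the 0.3 loss weight are illustrative assumptions.
        import torch
        import torch.nn as nn

        class JointCtcAccentModel(nn.Module):
            def __init__(self, n_feats=80, hidden=256, n_phones=40, n_accents=5):
                super().__init__()
                self.encoder = nn.LSTM(n_feats, hidden, num_layers=2,
                                       batch_first=True, bidirectional=True)
                self.ctc_head = nn.Linear(2 * hidden, n_phones + 1)   # +1 for CTC blank
                self.accent_head = nn.Linear(2 * hidden, n_accents)   # utterance-level accent

            def forward(self, feats):                  # feats: (batch, time, n_feats)
                enc, _ = self.encoder(feats)
                ctc_logits = self.ctc_head(enc)        # per-frame phone posteriors
                accent_logits = self.accent_head(enc.mean(dim=1))  # pooled over time
                return ctc_logits, accent_logits

        model = JointCtcAccentModel()
        ctc_loss = nn.CTCLoss(blank=40)
        ce_loss = nn.CrossEntropyLoss()

        feats = torch.randn(2, 100, 80)                # 2 utterances, 100 frames each
        targets = torch.randint(0, 40, (2, 20))        # phone label sequences
        input_lens = torch.full((2,), 100, dtype=torch.long)
        target_lens = torch.full((2,), 20, dtype=torch.long)
        accents = torch.tensor([0, 3])                 # accent labels

        ctc_logits, accent_logits = model(feats)
        log_probs = ctc_logits.log_softmax(-1).transpose(0, 1)  # (time, batch, classes)
        loss = ctc_loss(log_probs, targets, input_lens, target_lens) \
               + 0.3 * ce_loss(accent_logits, accents)           # joint objective
        loss.backward()

    The shared encoder is what couples the two tasks: gradients from the accent head shape the same representations used for phone recognition.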

    Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge

    More than a decade has passed since research on automatic recognition of emotion from speech became a new field of research alongside its 'big brothers', speech and speaker recognition. This article attempts to provide a short overview of where we are today, how we got there, and what this can tell us about where to go next and how we could get there. In the first part, we address the basic phenomenon, reflecting on the last fifteen years and commenting on databases, modelling and annotation, the unit of analysis, and prototypicality. We then shift to automatic processing, including discussions of features, classification, robustness, evaluation, and implementation and system integration. From there we move to the first comparative challenge on emotion recognition from speech, the INTERSPEECH 2009 Emotion Challenge, organised by (part of) the authors, covering the Challenge's database, Sub-Challenges, participants and their approaches, the winners, and the fusion of results, through to the actual lessons learnt, before we finally address the ever-lasting problems and promising future attempts. (C) 2011 Elsevier B.V. All rights reserved. Schuller B., Batliner A., Steidl S., Seppi D., 'Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge', Speech Communication, vol. 53, no. 9-10, pp. 1062-1087, November 2011.
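    A minimal sketch of the generic pipeline this survey covers: frame-wise acoustic descriptors collapsed into utterance-level functionals and fed to a static classifier. The features, random toy data, and linear SVM below are placeholders for illustration, not the INTERSPEECH 2009 Challenge baseline.

        # Utterance-level functionals of low-level descriptors + static classifier.
        # Toy data only; real systems extract descriptors such as MFCCs or pitch
        # from audio and use much larger feature sets.
        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        def functionals(frame_features):
            """Collapse a (frames x descriptors) matrix into one utterance vector."""
            return np.concatenate([frame_features.mean(axis=0),
                                   frame_features.std(axis=0)])

        rng = np.random.default_rng(0)
        # 40 hypothetical utterances, each 100 frames x 13 descriptors
        X = np.stack([functionals(rng.normal(size=(100, 13))) for _ in range(40)])
        y = np.repeat([0, 1], 20)  # binary labels, e.g. negative vs. idle

        clf = SVC(kernel="linear")
        print(cross_val_score(clf, X, y, cv=5).mean())  # chance level on random data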

    Publications and talks 2003 by the members of the Faculty of Informatics
