58 research outputs found

    Towards an artificial laboratory for speech processing

    It is our belief that speech recognition algorithms can best be handled through the effective and efficient cooperation of multiple knowledge sources. For the development of various types of such algorithms, also called hybrid speech algorithms, and more generally for speech processing, we need advanced architectures and speech processing environments. There is also a need to manipulate speech knowledge through the use of abstract data structures, to process databases, and to help the modeling, simulation, and evaluation of automatic speech recognizers. To tackle these problems, we propose the new concept of an artificial laboratory for speech processing. Such a system simulates a real laboratory and allows analysis of data. It also provides a large range of computing facilities which can be used with ease to perform modeling and simulation. In this correspondence, the main concepts of the system are briefly described.

    On the use of an auditory model and phonetic knowledge for automatic speech recognition

    Including speech knowledge in automatic speech recognition (ASR) systems is a good way to improve the performance of recognizers. In this paper, we propose the ORION system, which deals with speaker-independent ASR for isolated words. ORION is a two-pass hybrid system which uses several types of knowledge: psychoacoustic, physiological, and phonetic. During the first pass, an auditory model, PLP (perceptually based linear prediction analysis), combines static and dynamic features to provide a set of parameters to the dynamic programming algorithm. After this stage, 98% recognition accuracy was obtained for a digit vocabulary with 12 templates per word. The introduction of phonetic knowledge in the second pass decreases the error rate by more than 60% (compared to the results of the first pass) for a confusable vocabulary (E-SET).
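
    The first pass described above feeds feature vectors to a dynamic programming algorithm that matches them against stored templates. As a minimal sketch of that general idea, assuming plain dynamic time warping (DTW) with a Euclidean local distance and a nearest-template decision (the abstract specifies none of these details, and the names `dtw_distance` and `recognize` are hypothetical), template matching could look like this:

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors.

    Assumption: Euclidean local distance and the basic three-way step
    pattern; the ORION abstract does not state which variant it uses.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the word whose closest template has the lowest DTW cost.

    templates: dict mapping a word to a list of template sequences
    (hypothetical layout, e.g. 12 templates per word).
    """
    return min(
        templates,
        key=lambda word: min(dtw_distance(utterance, t) for t in templates[word]),
    )
```

    With several templates per word, the decision above simply keeps the best-scoring template for each word; a real system would compute the feature vectors with PLP analysis rather than use toy values.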

    A corpus of audio-visual Lombard speech with frontal and profile views

    This paper presents a bi-view (front and side) audiovisual Lombard speech corpus, which is freely available for download. It contains 5400 utterances (2700 Lombard and 2700 plain reference utterances), produced by 54 talkers, with each utterance in the dataset following the same sentence format as the audiovisual “Grid” corpus [Cooke, Barker, Cunningham, and Shao (2006). J. Acoust. Soc. Am. 120(5), 2421–2424]. Analysis of this dataset confirms previous research, showing prominent acoustic, phonetic, and articulatory speech modifications in Lombard speech. In addition, gender differences are observed in the size of Lombard effect. Specifically, female talkers exhibit a greater increase in estimated vowel duration and a greater reduction in F2 frequency

    Defining freshwater as a natural resource: a framework linking water use to the area of protection natural resources

    © 2019, Springer-Verlag GmbH Germany, part of Springer Nature. Purpose: While many examples have shown unsustainable use of freshwater resources, existing LCIA methods for water use do not comprehensively address impacts to natural resources for future generations. This framework aims to (1) define freshwater resource as an item to protect within the Area of Protection (AoP) natural resources, (2) identify relevant impact pathways affecting freshwater resources, and (3) outline methodological choices for impact characterization model development. Methods: Considering the current scope of the AoP natural resources, the complex nature of freshwater resources and its important dimensions to safeguard safe future supply, a definition of freshwater resource is proposed, including water quality aspects. In order to clearly define what is to be protected, the freshwater resource is put in perspective through the lens of the three main safeguard subjects defined by Dewulf et al. (2015). In addition, an extensive literature review identifies a wide range of possible impact pathways to freshwater resources, establishing the link between different inventory elementary flows (water consumption, emissions, and land use) and their potential to cause long-term freshwater depletion or degradation. Results and discussion: Freshwater as a resource has a particular status in LCA resource assessment. First, it exists in the form of three types of resources: flow, fund, or stock. Then, in addition to being a resource for human economic activities (e.g., hydropower), it is above all a non-substitutable support for life that can be affected by both consumption (source function) and pollution (sink function). Therefore, both types of elementary flows (water consumption and emissions) should be linked to a damage indicator for freshwater as a resource. 
Land use is also identified as a potential stressor to freshwater resources by altering runoff, infiltration, and erosion processes as well as evapotranspiration. It is suggested to use the concept of recovery period to operationalize this framework: when the recovery period lasts longer than a given period of time, impacts are considered to be irreversible and fall into the concern of freshwater resources protection (i.e., affecting future generations), while short-term impacts affect the AoP ecosystem quality and human health directly. It is shown that it is relevant to include this concept in the impact assessment stage in order to discriminate the long-term from the short-term impacts, as some dynamic fate models already do. Conclusions: This framework provides a solid basis for the consistent development of future LCIA methods for freshwater resources, thereby capturing the potential long-term impacts that could warn decision makers about potential safe water supply issues in the future.

    Automatic Classification and Transcription of Telephone Speech in Radio Broadcast Data


    Les Araignées du Grand Erg occidental (Sahara algérien)

    Volume: 37; Start Page: 966; End Page: 97

    Etude acoustique du réflexe Lombard en vue de la reconnaissance de la parole produite en milieu bruité

    The aim of this study is to determine, for French and at the phonetic level, the acoustic differences between normal speech and speech produced in a noisy environment (that is, speech subject to the Lombard reflex). This work follows an earlier study that we carried out on American English. The same methodology was adopted for both languages, and the same acoustic parameters were extracted, which is justified by their similar use in the speech recognition systems studied for the two languages. The results of the first study, on American English [1], showed consistent trends in how many acoustic parameters change. In this paper, we present the results obtained for French. They confirm the trends observed in [1], and thus show a strong correlation between the acoustic modifications induced by the Lombard reflex in American English and in French.

    N-best based supervised and unsupervised adaptation for native and non-native speakers in cars

    In this paper, a new set of techniques exploiting N-best hypotheses in supervised and unsupervised adaptation is presented. These techniques combine statistics extracted from the N-best hypotheses with a weight derived from a likelihood ratio confidence measure. In the case of supervised adaptation, the knowledge of the correct string is used to perform N-best based corrective adaptation. Experiments run for continuous letter recognition recorded in a car environment show that weighting N-best sequences by a likelihood ratio confidence measure provides only marginal improvement as compared to 1-best unsupervised adaptation and N-best unsupervised adaptation with equal weighting. However, an N-best based supervised corrective adaptation method weighting correct letters positively and incorrect letters negatively resulted in a 13% decrease of the error rate as compared with supervised adaptation. The largest improvement was obtained for non-native speakers.
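
    The corrective weighting mentioned above (correct letters counted positively, incorrect letters negatively) can be illustrated with a toy sketch. This is not the paper's method: it assumes hypotheses aligned position by position with the reference string, confidence values supplied from outside, and a hypothetical helper name `corrective_weights`.

```python
from collections import defaultdict


def corrective_weights(reference, nbest, confidences):
    """Accumulate a signed, confidence-scaled weight per hypothesized letter.

    Assumptions (not from the abstract): every hypothesis is compared with
    the reference position by position, and `confidences` holds one
    precomputed confidence value per hypothesis.
    """
    totals = defaultdict(float)
    for hypothesis, confidence in zip(nbest, confidences):
        for ref_letter, hyp_letter in zip(reference, hypothesis):
            # Letters matching the reference get a positive weight,
            # mismatching letters a negative one.
            sign = 1.0 if hyp_letter == ref_letter else -1.0
            totals[hyp_letter] += sign * confidence
    return dict(totals)
```

    In an actual adaptation scheme, such signed weights would scale the statistics gathered for each letter model; how the likelihood ratio confidence measure is computed is left open here because the abstract does not define it.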