Using PSILOG, a new acquisition package to update FRANCIS

Author: Le Maguer Jacques
Publication venue: DEU
Publication date: 26/02/2009

SSOAR - Social Science Open Access Repository

The limits of the Mean Opinion Score for speech synthesis evaluation

Author: Harte Naomi; King Simon; Le Maguer Sébastien
Publication date: 01/03/2024

The release of WaveNet and Tacotron has forever transformed the speech synthesis landscape. Thanks to these game-changing innovations, the quality of synthetic speech has reached unprecedented levels. However, to measure this leap in quality, an overwhelming majority of studies still rely on the Absolute Category Rating (ACR) protocol and compare systems using its output: the Mean Opinion Score (MOS). This protocol is not without controversy, and as the current state-of-the-art synthesis systems now produce outputs remarkably close to human speech, it is now vital to determine how reliable this score is. To do so, we conducted a series of four experiments replicating and following the 2013 edition of the Blizzard Challenge. With these experiments, we asked four questions about the MOS: How stable is the MOS of a system across time? How do the scores of lower quality systems influence the MOS of higher quality systems? How does the introduction of modern technologies influence the scores of past systems? How does the MOS of modern technologies evolve in isolation? The results of our experiments are manifold. Firstly, we verify the superiority of modern technologies in comparison to historical synthesis. Then, we show that despite its origin as an absolute category rating, MOS is a relative score. While minimal variations are observed during the replication of the 2013-EH2 task, these variations can still lead to different conclusions for the intermediate systems. Our experiments also illustrate the sensitivity of MOS to the presence/absence of lower and higher anchors. Overall, our experiments suggest that we may have reached the end of a cul-de-sac by only evaluating the overall quality with MOS. We must embark on a new road and develop different evaluation protocols better suited to the analysis of modern speech synthesis technologies.

Edinburgh Research Explorer
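
As a rough illustration of the score discussed in the abstract above, the following sketch shows one common way a MOS could be computed from ACR ratings, together with an approximate 95% confidence interval. It is not taken from the paper: the 1-5 rating scale, the normal-approximation interval, the function name `mean_opinion_score`, and all ratings are assumptions made for this example.

```python
import math
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumes ACR ratings on a 1-5 scale and a normal approximation for the
    interval; both are assumptions of this sketch.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ACR ratings are expected on a 1-5 scale")
    mos = mean(ratings)
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical listener ratings for two hypothetical systems.
ratings_by_system = {
    "system_a": [4, 5, 4, 4, 3, 5, 4, 4],
    "system_b": [3, 3, 2, 4, 3, 3, 2, 3],
}

for name, ratings in ratings_by_system.items():
    mos, ci = mean_opinion_score(ratings)
    print(f"{name}: MOS = {mos:.2f} +/- {ci:.2f}")
```

Because each listener rates a system alongside the others in the same test, a figure computed this way only makes sense relative to that test's set of systems, which is the sensitivity the abstract describes.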

Liaison and pronunciation learning in end-to-end text-to-speech in French

Author: Le Maguer Sébastien; Richmond Korin; Taylor Jason
Publication venue: International Speech Communication Association
Publication date: 28/08/2021

Edinburgh Research Explorer

Valeria Piacentini Fiorani. « A Sasanian fleet or a maritime system? »

Author: Le Maguer-Gillon Sterenn
Publication venue: OpenEdition
Publication date: 26/11/2019

The author questions the presence of a Sasanian fleet in the Persian Gulf in the second half of the 7th century. To do so, she draws on Arabic sources from the 3rd/9th to the 7th/13th centuries, in particular the chronicles of the Arab conquest (Kutub al-Futūḥ) and a chronicle written in Persian that recounts the more easterly conquests, as far as Makran (Fatḥnāmah-i Sind). Examining the silences and implications of these accounts, the author demonstrates that the waters of the Persian Gulf are still..

OpenEdition

Recherche d'information médicale pour le patient Impact de ressources terminologiques

Author: Claveau Vincent; Grabar Natalia; Hamon Thierry; Le Maguer Sébastien
Publication venue: HAL CCSD
Publication date: 18/03/2015

National audience. ABSTRACT. The right of patients to access their clinical health record is granted by the code of Santé Publique. Yet this content remains difficult to understand. We propose an experiment in which we use queries defined by patients to find relevant documents. We use the Indri search engine, based on statistical language modeling, together with semantic resources. We focus on terminological variation (e.g. synonyms, abbreviations) to link the language of experts with that of patients. Various combinations of resources and Indri settings are explored, mostly based on query expansion. Our system shows up to 0.7660 P@10 and up to 0.6793 NDCG@10.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Paris 13

Hal-Diderot

HAL-Rennes 1

Claire Hardy-Guilbert, Hélène Renel, Axelle Rougeulle, Eric Vallet (éds.). Sur les chemins d’Onagre : Histoire et archéologie orientales : Hommage à Monik Kervran

Author: Le Maguer-Gillon Sterenn
Publication venue: OpenEdition
Publication date: 01/07/2020

Sur les chemins d’Onagre pays tribute to Monik Kervran, archaeologist and emeritus research director at the CNRS, who contributed greatly to advancing knowledge of the history of eastern Islam, particularly in Iran and the Gulf. The nineteen articles collected here reflect the diversity of the research themes she worked on. They cover both the Sasanian and the Islamic periods, and address varied subjects such as architecture, ceramics, ..

OpenEdition

Phonetic accommodation in interaction with a virtual language learning tutor: A Wizard-of-Oz study

Author: Gessinger Iona; Le Maguer Sébastien; Möbius Bernd; Raveh Eran; Steiner Ingmar
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2021

We present a Wizard-of-Oz experiment examining phonetic accommodation of human interlocutors in the context of human-computer interaction. Forty-two native speakers of German engaged in dynamic spoken interaction with a simulated virtual tutor for learning the German language called Mirabella. Mirabella was controlled by the experimenter and used either natural or hidden Markov model-based synthetic speech to communicate with the participants. In the course of four tasks, the participants’ accommodating behavior with respect to wh-question realization and allophonic variation in German was tested. The participants converged to Mirabella with respect to modified wh-question intonation, i.e., rising F0 contour and nuclear pitch accent on the interrogative pronoun, and the allophonic contrast [ɪç] vs. [ɪk] occurring in the word ending -ig. They did not accommodate to the allophonic contrast [ɛː] vs. [eː] as a realization of the long vowel -ä-. The results did not differ between the experimental groups that communicated with either the natural or the synthetic speech version of Mirabella. Testing the influence of the “Big Five” personality traits on the accommodating behavior revealed a tendency for neuroticism to influence the convergence of question intonation. On the level of individual speakers, we found considerable variation with respect to the degree and direction of accommodation. We conclude that phonetic accommodation on the level of local prosody and segmental pronunciation occurs in users of spoken dialog systems, which could be exploited in the context of computer-assisted language learning

Universaar

Évaluation expérimentale d'un système statistique de synthèse de la parole, HTS, pour la langue française

Author: Barbot Nelly; Boëffard Olivier; Le Maguer Sébastien
Publication date: 01/01/2013

The work presented in this thesis concerns text-to-speech (TTS) synthesis and, more particularly, statistical parametric speech synthesis for French. We analyse the impact of the linguistic contextual factors used to describe a speech signal on the modeling performed by the HTS statistical speech synthesis system. To conduct the experiments, two objective evaluation protocols are proposed. The first one uses Gaussian mixture models (GMMs) to represent the acoustic space produced by HTS for a given contextual feature set. By using a constant reference set of natural speech stimuli, the GMMs can be compared with one another, and hence so can the acoustic spaces generated by the different HTS configurations. The second objective evaluation that we propose is based on distances between paired acoustic frames of natural speech and synthetic speech generated by HTS, which allows the modeling to be assessed more locally. This second protocol complements the other analyses, notably by controlling the sets of data that are generated and evaluated. Results obtained by both protocols, and confirmed by subjective evaluations, show that using a large set of contextual factors does not necessarily improve the modeling and can be counterproductive for the quality of the synthesized speech.

OpenGrey Repository

RePaLi participation to CLEF eHealth IR challenge 2014: leveraging term variation

Author: Claveau Vincent; Grabar Natalia; Hamon Thierry; Le Maguer Sébastien
Publication venue: HAL CCSD
Publication date: 15/09/2014

International audience. This paper describes the participation of RePaLi, a team composed of members of IRISA, LIMSI and STL, in the biomedical information retrieval challenge proposed in the framework of CLEF eHealth. For this first participation, our approach relies on a state-of-the-art IR system called Indri, based on statistical language modeling, and on semantic resources. The purpose of the semantic resources and methods is to handle term variation such as synonyms, morpho-syntactic variants, abbreviations, and nested terms. Different combinations of resources and Indri settings are explored, mostly based on query expansion. For the submitted runs, our system achieves up to 67.40 P@10 and up to 67.93 NDCG@10.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Inserm

HAL-Rennes 1
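
The two retrieval abstracts above report results as P@10 and NDCG@10. The sketch below shows one standard way these measures could be computed for a single query. It is not the evaluation code used in those papers: the function names, the relevance grades, and the simplification of deriving the ideal ranking from the retrieved list alone (rather than from all judged documents for the query) are assumptions made for this example.

```python
import math


def precision_at_k(relevances, k=10):
    """Fraction of the top-k results that are relevant (grade > 0)."""
    return sum(1 for rel in relevances[:k] if rel > 0) / k


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain with a log2 rank discount."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so rank 0 gets log2(2) = 1
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the best ordering of the same grades.

    Simplification (assumption): the ideal ordering is built from the
    retrieved list only, not from every judged document for the query.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades for the ten documents returned for one query.
ranked_grades = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
print(f"P@10    = {precision_at_k(ranked_grades):.4f}")
print(f"NDCG@10 = {ndcg_at_k(ranked_grades):.4f}")
```

Scores reported for a whole test collection are typically the mean of such per-query values, expressed either as a fraction or as a percentage.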