Search CORE

28 research outputs found

User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis

Author: Adams Oliver
Alexis Michaud
Aplonova Katya
Besacier Laurent
Cox Christopher
Foley Ben
Galliot Benjamin
Guillaume Séverine
Hill Nathan W.
Jacques Guillaume
Lambourne Nicholas
Sanders-Dwyer Rahasya
Wiles Janet
Wisniewski Guillaume
Publication venue: 'University of Colorado at Boulder'
Publication date: 14/12/2020

This paper reports on progress integrating the speech recognition toolkit ESPnet into Elpis, a web front-end originally designed to provide access to the Kaldi automatic speech recognition toolkit. The goal of this work is to make end-to-end speech recognition models available to language workers via a user-friendly graphical interface. Encouraging results are reported on (i) development of an ESPnet recipe for use in Elpis, with preliminary results on data sets previously used for training acoustic models with the Persephone toolkit along with a new data set that had not previously been used in speech recognition, and (ii) incorporating ESPnet into Elpis along with UI enhancements and a CUDA-supported Docker file

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

HAL Descartes

SOAS Research Online

Ouvrir aux linguistes « de terrain » un accès à la transcription automatique

Author: Aplonova Katya
Besacier Laurent
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Michaud Alexis
Wisniewski Guillaume
Publication venue: 'INIST-CNRS'
Publication date: 01/01/2020

Automatic speech processing is now beginning to realize its strong potential for the urgent tasks of describing the world's rapidly declining linguistic diversity. The aim of the work described here is to put state-of-the-art automatic transcription tools within reach of practitioners of field linguistics (linguists and their collaborators). A user-friendly graphical interface, Elpis, provides access to Kaldi and ESPnet. The results are particularly encouraging. First, the development of an ESPnet recipe for use in Elpis gives excellent results, both on two data sets previously used to train acoustic models with the Persephone toolkit and on a new data set (the Japhug language). Second, the incorporation of ESPnet into Elpis comes with user interface improvements, easier installation through containerization (Docker), and the use of graphics processing units (CUDA), which speeds up model training

Hal - Université Grenoble Alpes

Intégration d'un système de reconnaissance neuronale des phonèmes et d'un modèle de langue simple : une chaîne de traitement pour les scénarios à faibles ressources

Author: Fily Maxime
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Michaud Alexis
Nguyễn Minh-Châu
Wisniewski Guillaume
Publication venue: HAL CCSD
Publication date: 31/03/2022

Recently, several works have shown that fine-tuning a multilingual model of speech representation (typically XLS-R) with very small amounts of annotated data allows for the development of phonemic transcription systems of sufficient quality to help field linguists in their efforts to document the languages of the world. In this work, we explain how the quality of these systems can be improved by a very simple method, namely integrating them with a language model. Our experiments on an endangered language, Japhug (Trans-Himalayan/Tibeto-Burman), show that this approach can significantly reduce the word error rate (WER), reaching the stage of automatic recognition of entire words

Hal - Université Grenoble Alpes

HAL Descartes

Vers des ressources électroniques interconnectées : Lexica, les dictionnaires de la collection Pangloss

Author: Bonnet Rémy
Buret Céline
François Alexandre
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Lahaussois Aimée
Michailovsky Boyd
Michaud Alexis
Publication venue: HAL CCSD
Publication date: 04/07/2017

This paper reports on progress in building online dictionaries, one step in the long-term effort to use new technologies to link together the outputs of field linguists: grammars, dictionaries, and text collections. In the future, dictionaries and grammars can be not only interconnected but also linked to the texts that form the core of linguistic data, as well as to audio and video recordings of spontaneous speech. Rather than fixing a language in print, the aim is now to open it up to new modes of navigation, exploiting the full potential of online corpora, including through statistical processing

Hal - Université Grenoble Alpes

Hal-Diderot

Deux corpus audio transcrits de langues rares (japhug et na) normalisés en vue d'expériences en traitement du signal

Author: Besacier Laurent
Fily Maxime
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Michaud Alexis
Nguyễn Minh-Châu
Rossato Solange
Wisniewski Guillaume
Publication venue: HAL CCSD
Publication date: 06/12/2021

Two audio corpora of minority languages of China (Japhug and Na), with transcriptions, are proposed as reference data sets for experiments in Natural Language Processing. The data, collected and transcribed in the course of immersion fieldwork, amount to a total of 1,907 minutes in Japhug and 209 minutes in Na. By making them available in an easily accessible and usable form, we hope to facilitate the development and deployment of state-of-the-art NLP tools for the full range of human languages. We present a tool for assembling datasets from the Pangloss Collection (an open archive) in a way that ensures full reproducibility of experiments conducted on these data

Hal - Université Grenoble Alpes

Spécialisation de modèles neuronaux pour la transcription phonémique : premiers pas vers la reconnaissance de mots pour les langues rares

Author: Fily Maxime
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Macaire Cécile
Michaud Alexis
Nguyễn Minh-Châu
Rossato Solange
Wisniewski Guillaume
Publication venue: HAL CCSD
Publication date: 06/12/2021

We describe the latest results we have obtained in the development of NLP (Natural Language Processing) tools to reduce the transcription and annotation workload of field linguists, as part of workflows to document and describe the world's languages. We show how a new deep learning approach based on the fine-tuning of a generic representation model makes it possible to significantly improve the quality of automatic phonemic transcription and, more significantly, to take a first step towards automatic word recognition for low-resource languages

Hal - Université Grenoble Alpes

Fine-tuning pre-trained models for Automatic Speech Recognition: experiments on a fieldwork corpus of Japhug (Trans-Himalayan family)

Author: Coavoux Maximin
Fily Maxime
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Macaire Cécile
Michaud Alexis
Nguyễn Minh-Châu
Rossato Solange
Wisniewski Guillaume
Publication venue: HAL CCSD
Publication date: 26/05/2022

This is a report on results obtained in the development of speech recognition tools intended to support linguistic documentation efforts. The test case is an extensive fieldwork corpus of Japhug, an endangered language of the Trans-Himalayan (Sino-Tibetan) family. The goal is to reduce the transcription workload of field linguists. The method used is a deep learning approach based on the language-specific tuning of a generic pre-trained representation model, XLS-R, using a Transformer architecture. We note difficulties in implementation, in terms of learning stability. But this approach brings significant improvements nonetheless. The quality of phonemic transcription is improved over earlier experiments; and most significantly, the new approach allows for reaching the stage of automatic word recognition. Subjective evaluation of the tool by the author of the training data confirms the usefulness of this approach

Hal - Université Grenoble Alpes

HAL Descartes

Fine-tuning pre-trained models for Automatic Speech Recognition: experiments on a fieldwork corpus of Japhug (Trans-Himalayan family)

Author: Coavoux Maximin
Fily Maxime
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Macaire Cécile
Michaud Alexis
Nguyễn Minh-Châu
Rossato Solange
Wisniewski Guillaume
Publication venue: HAL CCSD
Publication date: 26/05/2022

Hal - Université Grenoble Alpes

Les modèles pré-entraînés à l'épreuve des langues rares : expériences de reconnaissance de mots sur la langue japhug (sino-tibétain)

Author: Coavoux Maximin
Fily Maxime
Galliot Benjamin
Guillaume Séverine
Jacques Guillaume
Macaire Cécile
Michaud Alexis
Nguyễn Minh-Châu
Rossato Solange
Wisniewski Guillaume
Publication venue: HAL CCSD
Publication date: 13/06/2022

We describe in this work the latest results obtained in interdisciplinary work to support "fundamental language documentation" through the use of speech recognition tools. Specifically, the focus is on the development of a speech recognition system for Japhug, an endangered minority language of China. The practical goal is to reduce the transcription workload of field linguists. We show how a new deep learning approach based on the language-specific tuning of a generic pre-trained representation model, XLS-R, using a Transformer architecture, significantly improves the quality of phonemic transcription, in a setting where only a few hours of annotated data are available. Most significantly, this method allows for reaching the stage of automatic word recognition. Nevertheless, we note difficulties in implementation, in terms of learning stability. The question of the evaluation of the tool by field linguists is also addressed

Hal - Université Grenoble Alpes

A Stromal Immune Module Correlated with the Response to Neoadjuvant Chemotherapy, Prognosis and Lymphocyte Infiltration in HER2-Positive Breast Carcinoma Is Inversely Correlated with Hormonal Pathways

Author: Alice Pinheiro
Anne-Sophie Hamy
Benjamin Sadacca
Cecile Laurent
Fabien Reyal
Hélène Bonsang-Kitzis
Judith Abecassis
Marick Lae
Marion Galliot
Matahi Moarii
Publication date: 22/12/2016

Introduction: HER2-positive breast cancer (BC) is a heterogeneous group of aggressive breast cancers, the prognosis of which has greatly improved since the introduction of treatments targeting HER2. However, these tumors may display intrinsic or acquired resistance to treatment, and classifiers of HER2-positive tumors are required to improve the prediction of prognosis and to develop novel therapeutic interventions. Methods: We analyzed 2893 primary human breast cancer samples from 21 publicly available datasets and developed a six-metagene signature on a training set of 448 HER2-positive BC. We then used external public datasets to assess the ability of these metagenes to predict the response to chemotherapy (Ignatiadis dataset), and prognosis (METABRIC dataset). Results: We identified a six-metagene signature (138 genes) containing metagenes enriched in different gene ontologies. The gene clusters were named as follows: Immunity, Tumor suppressors/proliferation, Interferon, Signal transduction, Hormone/survival and Matrix clusters. In all datasets, the Immunity metagene was less strongly expressed in ER-positive than in ER-negative tumors, and was inversely correlated with the Hormone/survival metagene. Within the signature, multivariate analyses showed that strong expression of the “Immunity” metagene was associated with higher pCR rates after NAC (OR = 3.71 [1.28–11.91], p = 0.019) than weak expression, and with a better prognosis in HER2-positive/ER-negative breast cancers (HR = 0.58 [0.36–0.94], p = 0.026). Immunity metagene expression was associated with the presence of tumor-infiltrating lymphocytes (TILs). Conclusion: The identification of a predictive and prognostic immune module in HER2-positive BC confirms the need for clinical testing for immune checkpoint modulators and vaccines for this specific subtype. The inverse correlation between Immunity and hormone pathways opens research perspectives and deserves further investigation.

Directory of Open Access Journals

PubMed Central

FigShare