
    Orthographic Transcription: Which Enrichment is required for Phonetization?

    This paper addresses the enrichment of orthographic transcriptions with a view to automatic phonetization. Phonetization is the process of representing sounds with phonetic symbols. There are two general ways to build a phonetization process: rule-based systems (with rules inferred automatically or proposed by expert linguists) and dictionary-based solutions, which store as much phonological knowledge as possible in a lexicon. In both cases, phonetization starts from a manual transcription. Such a transcription follows conventions that vary with the context in which they were developed. This study focuses on three different enrichments of such a transcription. The evaluations compare phonetizations produced by automatic systems with a manually phonetized reference. The test corpus consists of three types of French speech: conversational speech, read speech and political debate. A specific algorithm is proposed for the rule-based system to handle the enrichments. The final system produced a phonetization that is about 95.2% correct (error rates from 3.7% to 5.6%, depending on the corpus).

    Recherche automatique d'hétéro-répétitions dans un dialogue oral spontané [Automatic detection of other-repetitions in a spontaneous spoken dialogue]

    Other-repetitions are a device in which a speaker reproduces what another speaker has just said. This paper proposes a solution for automatically detecting other-repetitions in French conversational dialogue. The first step of the proposed system finds all possible other-repetitions in the dialogue. The second step selects the other-repetitions to keep by combining rules with speaker statistics. This automatic detection, evaluated on a one-hour dialogue, shows good results with respect to the stated objectives: recall is 1, and precision is about 80%. The article also proposes defining criteria for other-repetitions that make it possible to search for them systematically in a spontaneous spoken dialogue.
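The recall and precision figures quoted in this abstract can be illustrated with a minimal sketch. The representation of a repetition as a `(start, end)` token span, the function name, and the example spans are all assumptions made for illustration; this is not the paper's evaluation code or data.

```python
def precision_recall(detected, reference):
    """Compare detected items against a manually annotated reference."""
    detected, reference = set(detected), set(reference)
    true_positives = len(detected & reference)
    # Precision: share of detected items that are in the reference.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of reference items that were detected.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical spans: every reference repetition is found, plus one false alarm.
reference = {(3, 5), (10, 11), (20, 24), (31, 32)}
detected = reference | {(40, 41)}

precision, recall = precision_recall(detected, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A system tuned this way favors recall: nothing in the reference is missed, at the cost of some false detections that lower precision.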

    A quantitative view of feedback lexical markers in conversational French

    This paper presents a quantitative description of the lexical items used for linguistic feedback in the Corpus of Interactional Data (CID). It includes the raw figures for feedback lexical items as well as more detailed figures on inter-individual variability. This work is a first step toward a broader analysis that covers more discourse situations and includes communicative function annotation.


    Orhion (Observatoire des organisations et des ressources humaines sous l'impact opérationnel du numérique, the observatory of organizations and human resources under the operational impact of digital technology), created in late 2009 at the Bibliothèque nationale de France (BnF), set out to observe and analyze the rollout of the BnF's major digital projects (mass digitization, web legal deposit, and Spar for digital preservation) and to contribute to defining the digital collection. Bringing together operational managers, Orhion first concentrated on the digital projects and their implementation, then turned to professions and how they are evolving in a digital context (digital cataloguing, "signalement", and digital stack management, "magasinage numérique"). Orhion is currently working on cross-functional roles. Here Orhion presents a review of its observations along two main lines: the cross-departmental implementation of projects at the BnF, its relation to the existing hierarchical organization and to the concrete, broader organization it requires; and the resulting impact on professional identities, as seen through the studies of digital cataloguing and digital stack management.

    Annotation automatique en syllabes d'un dialogue oral spontané [Automatic syllable annotation of a spontaneous spoken dialogue]

    This paper proposes a solution for automatically identifying syllable boundaries in the particular context of spontaneous speech. The main goal is to identify syllables from a continuous stream of phonemes. First, phoneme classes are defined so as to reduce the complexity of the problem as much as possible. Second, a small number of general rules are defined. Finally, a few exception rules adapt the approach to the specific context of spontaneous speech. The proposed system is evaluated and compares favorably to the only two other existing systems for French, with significant improvements. Keywords: syllable, phoneme, segmentation, rules. French abstract, translated: This article proposes a method for automatically identifying syllable boundaries in the particular context of spontaneous speech. The principle is to identify syllables from a stream of phonemes. First, we propose grouping the phonemes into classes. We then propose segmentation rules based on the sequences of classes encountered. This method was applied to the CID, a French conversational corpus. The evaluations show that our proposal is closer to a manual segmentation than the 3 tools that already exist.

    Polyethylene glycol and prevalence of colorectal adenomas : Population-based study of 1165 patients undergoing colonoscopy

    Background and aim: Dietary polyethylene glycol (PEG) is extraordinarily potent in the chemoprevention of experimental colon carcinogenesis. PEG is used to treat constipation in France and in the USA. French laxatives include Forlax® (PEG4000), Movicol® and Transipeg® (PEG3350), and Idrocol® (pluronic F68). This study tests the hypothesis that use of a PEG-based laxative might reduce the prevalence of colorectal tumors. Methods: In this population-based study, consecutive patients attending for routine total colonoscopy were enrolled over four months by the gastroenterologists of Indre-et-Loire. They were asked whether they had previously taken a laxative or an NSAID. Age, gender, previous polyps, family history of colorectal cancer, constipation and digestive symptoms were also recorded. Tumors found during colonoscopy were categorized histologically. Results: Records from 1165 patients fulfilled the inclusion criteria, 607 women and 498 men, mean age 58.3. Among those, 813 had no tumor, 329 had adenomas, and 23 had carcinomas. In univariate analysis, older age, male gender, lack of digestive symptoms, and previous polyps were more common in patients with colorectal tumors. In contrast, previous Forlax® intake was more common in tumor-free patients (odds ratio (OR) any use/no use, 0.52; 95% confidence interval, 0.27-0.94). More people used Forlax®, which contains a higher dose of PEG than the other PEG laxatives; the ORs for those other laxatives were below one but did not reach significance. In multivariate analysis, older age and male gender were associated with a higher risk of colorectal tumors, and NSAID use with a lower risk. Conclusion: Forlax® users had a halved risk of colorectal tumors in univariate analysis, which suggests that PEG may prevent carcinogenesis.
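For readers unfamiliar with the statistic reported in this abstract, the sketch below shows how an odds ratio and an approximate 95% confidence interval are commonly computed from a 2x2 table, using the standard log-scale (Woolf) approximation. The abstract does not give the underlying counts and does not say which interval method was used, so the function name and the counts below are purely hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate confidence interval for a 2x2 table.

    a: exposed, with tumor      b: exposed, tumor-free
    c: unexposed, with tumor    d: unexposed, tumor-free
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the study.
odds_ratio, lower, upper = odds_ratio_ci(a=10, b=40, c=30, d=60)
print(f"OR={odds_ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that lies entirely below one, as in the abstract's reported 0.27-0.94, is what marks the association as statistically significant at the 5% level.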

    Weighted-covariance factor fuzzy C-means clustering

    In this paper, we propose a factor-weighted fuzzy c-means clustering algorithm. The factor is based on the inverse of a covariance factor, which assesses the collinearity between the centers and the samples, and it also takes into account the compactness of the samples within clusters. The proposed algorithm can classify both spherical and non-spherical cluster structures, unlike the classical fuzzy c-means algorithm, which is suited only to spherical clusters. Compared with other algorithms designed for non-spherical cluster structures, such as the Gustafson-Kessel, Gath-Geva or adaptive Mahalanobis distance-based fuzzy c-means algorithms, the proposed algorithm gives better numerical results on artificial and well-known real data sets. Moreover, it can be used on high-dimensional data, unlike other algorithms that require computing the determinants of large matrices. An application to mid-infrared spectra acquired on maize roots and aerial parts of Miscanthus, for the classification of plant biomass, shows that the algorithm can be applied successfully to high-dimensional data.
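As background for this abstract, here is a minimal sketch of the classical fuzzy c-means algorithm that the paper uses as its point of comparison. It is not the weighted-covariance variant the paper proposes, whose factor is not specified in the abstract. The function name, parameter defaults and toy data are assumptions for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Classical fuzzy c-means with Euclidean distance (pure Python)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / total
                 for k in range(dim)]
            )
        # Membership update: proportional to distance ** (-2 / (m - 1)).
        new_u = []
        for p in points:
            dists = [math.dist(p, ctr) for ctr in centers]
            if any(d == 0.0 for d in dists):
                # The point coincides with one or more centers.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            new_u.append([v / total for v in row])
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Toy data: two compact, well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_c_means(points, c=2)
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

Because the membership update depends only on Euclidean distance to each center, this baseline implicitly favors roughly spherical clusters, which is the limitation the abstract refers to.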

    Frontosphenoidal synostosis: a rare cause of unilateral anterior plagiocephaly

    Introduction: When a child walks into the clinic with unilateral frontal flattening, we usually associate it with unilateral coronal synostosis. While the latter may be the most common cause of anterior plagiocephaly, it is not the only one. A patent coronal suture forces us to consider other etiologies, such as deformational plagiocephaly or synostosis of another suture. To understand the mechanisms underlying this malformation, the development and growth of the skull base must be considered. Materials and methods: There have been few reports in the literature of isolated frontosphenoidal suture fusion, and we report a series of five cases, as recognizing this entity is important for its treatment. Conclusion: Frontosphenoidal synostosis must be looked for when coronal synostosis is absent in a child with unilateral anterior plagiocephaly, and treated surgically.