323 research outputs found

    Memory-based morphological analysis

    Improving sequence segmentation learning by predicting trigrams

    Meta-Learning for Phonemic Annotation of Corpora

    We apply rule induction, classifier combination and meta-learning (stacked classifiers) to the problem of bootstrapping high-accuracy automatic annotation of corpora with pronunciation information. The task we address in this paper consists of generating phonemic representations reflecting the Flemish and Dutch pronunciations of a word on the basis of its orthographic representation (which in turn is based on the actual speech recordings). We compare several possible approaches to the text-to-pronunciation mapping task: memory-based learning, transformation-based learning, rule induction, maximum entropy modeling, combination of classifiers in stacked learning, and stacking of meta-learners. We are interested both in optimal accuracy and in obtaining insight into the linguistic regularities involved. As far as accuracy is concerned, an already high accuracy level for single classifiers (93% for Celex and 86% for Fonilex at word level) is boosted significantly, with additional error reductions of 31% and 38% respectively using combination of classifiers, and a further 5% using combination of meta-learners, bringing overall word-level accuracy to 96% for the Dutch variant and 92% for the Flemish variant. We also show that the application of machine learning methods indeed leads to increased insight into the linguistic regularities determining the variation between the two pronunciation variants studied. Comment: 8 pages
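The accuracy figures in this abstract are stated as relative error reductions, which can be checked with a little arithmetic. The sketch below is not from the paper: it assumes the "further 5%" is also a relative reduction applied to the error left after classifier combination, and it uses only the rounded percentages quoted above, so it reproduces the reported 96% and 92% only approximately. The function name is hypothetical.

```python
def apply_error_reduction(accuracy, relative_reduction):
    """Return the accuracy after cutting the error rate by a relative fraction."""
    error = 1.0 - accuracy
    return 1.0 - error * (1.0 - relative_reduction)


# Word-level accuracies of the single classifiers, as quoted in the abstract.
baseline = {"Celex": 0.93, "Fonilex": 0.86}
# Relative error reductions from classifier combination, as quoted.
combination = {"Celex": 0.31, "Fonilex": 0.38}
# Further reduction from combining meta-learners.
# Assumption: relative to the error remaining after classifier combination.
meta_reduction = 0.05

results = {}
for name in baseline:
    combined = apply_error_reduction(baseline[name], combination[name])
    final = apply_error_reduction(combined, meta_reduction)
    results[name] = (combined, final)
    print(f"{name}: combined {combined:.4f}, with meta-learners {final:.4f}")
```

With these rounded inputs the Fonilex (Flemish) figure lands close to the reported 92%, while the Celex (Dutch) figure comes out somewhat below the reported 96%, which is plausible if the abstract's percentages were rounded from more precise values.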

    Discrete versus Probabilistic Sequence Classifiers for Domain-specific Entity Chunking

    Internal podalic version of second twin: Improving feet identification using a simulation model.

    Podalic version and breech extraction require high obstetrical expertise. Identifying fetal extremities is the first crucial step for trainees. When this skill is not sufficiently developed, the inter-twin delivery interval increases and the whole manoeuvre can even be jeopardized. We present a model for simulating and training this specific skill, using an obstetrical mannequin and 3D-printed hands and feet. Five feet and five hands (five rights and five lefts of each one) were 3D printed after initial ultrasound acquisition of a near-term fetus. Each foot and hand was individually placed in a condom filled with 100 cc of water and closed with a knot. A Sophie's Mum Birth Simulator Version 4.0 from MODEL-med was placed on the edge of the table. Each hand and foot was inserted into the pelvic mannequin. An evaluation of the students' skills using this model was performed. A significant reduction in the global mean time to extract the first foot and all the feet was observed at a three-month interval. This model is an option to train and assess a crucial skill for version and breech extraction.

    Information for authors

    We describe TADPOLE, a modular memory-based morphosyntactic tagger and dependency parser for Dutch. Though primarily aimed at being accurate, the design of the system is also driven by optimizing speed and memory usage, using a trie-based approximation of k-nearest neighbor classification as the basis of each module. We perform an evaluation of its three main modules: a part-of-speech tagger, a morphological analyzer, and a dependency parser, trained on manually annotated material available for Dutch; the parser is additionally trained on automatically parsed data. A global analysis of the system shows that it is able to process text in linear time, at an estimated rate of close to 2,500 words per second, while maintaining sufficient accuracy.
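To illustrate what a trie-based approximation of k-nearest neighbor classification can look like, here is a minimal sketch. It is not TADPOLE's actual implementation: it assumes that features are already sorted from most to least informative, that every trie node stores the class counts of the training examples passing through it, and that classification backs off to the majority class of the deepest matching node. All names and the toy data are hypothetical.

```python
from collections import Counter


class TrieNode:
    """One trie node: children keyed by feature value, plus class counts."""

    def __init__(self):
        self.children = {}
        self.class_counts = Counter()

    def majority_class(self):
        return self.class_counts.most_common(1)[0][0]


def build_trie(examples):
    """Store each (features, label) example as a path of feature values."""
    root = TrieNode()
    for features, label in examples:
        node = root
        node.class_counts[label] += 1
        for value in features:
            node = node.children.setdefault(value, TrieNode())
            node.class_counts[label] += 1
    return root


def classify(root, features):
    """Follow matching feature values; stop at the first mismatch."""
    node = root
    for value in features:
        child = node.children.get(value)
        if child is None:
            break
        node = child
    return node.majority_class()


# Toy data: (word suffix, previous tag) -> tag. Purely illustrative.
training = [
    (("en", "det"), "N"),
    (("en", "adj"), "N"),
    (("en", "pron"), "V"),
    (("t", "pron"), "V"),
    (("t", "det"), "V"),
]
trie = build_trie(training)

print(classify(trie, ("en", "pron")))  # exact path stored in the trie
print(classify(trie, ("en", "num")))   # backs off to the "en" node
print(classify(trie, ("s", "det")))    # backs off to the root
```

In this sketch a lookup touches at most one node per feature, so classification cost does not grow with the number of stored examples, which is the kind of property that makes linear-time processing of text feasible.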