5,504 research outputs found

    Automatic treebank-based acquisition of Arabic LFG dependency structures

    A number of papers have reported on methods for the automatic acquisition of large-scale, probabilistic LFG-based grammatical resources from treebanks for English (Cahill et al., 2002; Cahill et al., 2004), German (Cahill et al., 2003), Chinese (Burke, 2004; Guo et al., 2007), Spanish (O’Donovan, 2004; Chrupala and van Genabith, 2006) and French (Schluter and van Genabith, 2008). Here, we extend the LFG grammar acquisition approach to Arabic and the Penn Arabic Treebank (ATB) (Maamouri and Bies, 2004), adapting and extending the methodology of Cahill et al. (2004), originally developed for English. Arabic is challenging because of its morphological richness and syntactic complexity. Currently 98% of ATB trees (without FRAG and X) produce a covering and connected f-structure. We conduct a qualitative evaluation of our annotation against a gold standard and achieve an f-score of 95%.
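The abstract above reports an f-score against a gold standard but does not say how it is computed. The sketch below shows one common way such a score is obtained, assuming (this is an assumption, not something the abstract states) that each f-structure is flattened into a set of (relation, head, dependent) triples. The function name `triple_f_score` and the example triples are hypothetical.

```python
def triple_f_score(gold, predicted):
    """Harmonic mean of precision and recall over two collections of triples.

    Assumption: each annotation is represented as a set of
    (relation, head, dependent) triples; this representation is illustrative.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)  # triples present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy annotations: two of the three triples agree.
gold = {("subj", "eat", "boy"), ("obj", "eat", "apple"), ("det", "boy", "the")}
pred = {("subj", "eat", "boy"), ("obj", "eat", "apple"), ("det", "apple", "the")}
score = triple_f_score(gold, pred)
```

With two of three triples matching on each side, precision and recall are both 2/3, so the f-score is also 2/3.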

    A Universal Part-of-Speech Tagset

    To facilitate future research in unsupervised induction of syntactic structure and to standardize best practices, we propose a tagset that consists of twelve universal part-of-speech categories. In addition to the tagset, we develop a mapping from 25 different treebank tagsets to this universal set. As a result, when combined with the original treebank data, this universal tagset and mapping produce a dataset consisting of common parts-of-speech for 22 different languages. We highlight the use of this resource via two experiments, including one that reports competitive accuracies for unsupervised grammar induction without gold standard part-of-speech tags.
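The mapping idea in the abstract above can be sketched as a lookup table from fine-grained treebank tags to the twelve universal categories. The excerpt below covers only a handful of Penn Treebank tags and is an illustrative assumption, not the published mapping file. The names `PTB_TO_UNIVERSAL` and `to_universal` are hypothetical.

```python
# The twelve universal categories ("." is punctuation, "X" is the catch-all).
UNIVERSAL_TAGS = {
    "NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
    "ADP", "NUM", "CONJ", "PRT", ".", "X",
}

# Illustrative excerpt only: a few Penn Treebank tags and their coarse class.
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "RB": "ADV", "PRP": "PRON",
    "DT": "DET", "IN": "ADP", "CD": "NUM",
    "CC": "CONJ", "RP": "PRT", ".": ".",
}


def to_universal(tagged_tokens, mapping=PTB_TO_UNIVERSAL):
    """Replace each fine-grained tag with its universal category.

    Tags missing from the mapping fall back to "X" (a design choice of
    this sketch, not necessarily of the published resource).
    """
    return [(word, mapping.get(tag, "X")) for word, tag in tagged_tokens]


sentence = [("The", "DT"), ("dog", "NN"), ("barks", "VBZ"), (".", ".")]
coarse = to_universal(sentence)
```

Because each treebank only needs its own lookup table, the same function can be reused across languages by passing a different `mapping`.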

    Modeling Global Syntactic Variation in English Using Dialect Classification

    This paper evaluates global-scale dialect identification for 14 national varieties of English as a means for studying syntactic variation. The paper makes three main contributions: (i) introducing data-driven language mapping as a method for selecting the inventory of national varieties to include in the task; (ii) producing a large and dynamic set of syntactic features using grammar induction rather than focusing on a few hand-selected features such as function words; and (iii) comparing models across both web corpora and social media corpora in order to measure the robustness of syntactic variation across registers.

    Statistical parsing of morphologically rich languages (SPMRL): what, how and whither

    The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at word-level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state of affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations.

    The Effect of Teaching by the Inductive Model on Achievement in the Arabic Language Subject for Tenth Grade Students in the Southern Mazar District

    The aim of the research is to identify the effect of teaching with the inductive model on achievement in the Arabic language subject among tenth grade students in the Southern Mazar District. The research used a quasi-experimental approach with an experimental group and a control group: the experimental group, to which the inductive model was applied, consisted of 28 students, and the control group, which was taught in the usual way, consisted of 27 students. A post-test of achievement was administered to both groups. The results showed that the average score of the experimental group on the achievement test was higher than that of the control group, and that the difference was statistically significant in favor of the experimental group. This indicates a positive effect of the inductive model on academic achievement in Arabic grammar for tenth grade students. Based on the results of the study, the researcher recommended that Arabic language teachers use the inductive model in teaching Arabic grammar, given its positive impact on students' achievement, and that the inductive model be employed in Arabic language curricula through activities that take students through the stages of the inductive model, which increases their academic achievement. Keywords: inductive model, achievement, tenth grade. DOI: 10.7176/JEP/13-31-01 Publication date: November 30th 202

    Unsupervised induction of Arabic root and pattern lexicons using machine learning

    We describe an approach to building a morphological analyser of Arabic by inducing a lexicon of root and pattern templates from an unannotated corpus. Using maximum entropy modelling, we capture orthographic features from surface words, and cluster the words based on the similarity of their possible roots or patterns. From these clusters, we extract root and pattern lexicons, which allow us to morphologically analyse words. Further enhancements are applied, adjusting for morpheme length and structure. Final root extraction accuracy of 87.2% is achieved. In contrast to previous work on unsupervised learning of Arabic morphology, our approach is applicable to naturally-written, unvowelled Arabic text.

    The Effect of using the integrated approach in teaching grammar in providing the secondary school students with the special products from the perspectives of their teachers

    This research aims to measure the effect of the integrative approach to teaching grammar on secondary level students' achievement of their specific outcomes, from their teachers' point of view. To answer the questions of the study, the study was applied to a population of 90 male and female teachers who take part in correcting the exams of the general secondary certificate. A 16-item questionnaire was used; most of its items were prepared by the Ministry of Education, with some amendments based on the opinion of the arbitrators until it became stable (appendix 1). The teachers' responses to the questionnaire were then treated statistically. The results showed no effect of the integrative approach in teaching grammar at the secondary level on achieving those specific outcomes, as indicated by the teachers (F = -30,952) at (0,00), means (1), which is less than the benchmark. The results also showed no effect of gender on the second question (F = 0,349 at 0,728 level), which is more than (a = 0.05). Based on the results, the researcher proposed a set of recommendations. Keywords: (Actual own, Grammar, Arabic teachers, Integration, Method)