2,711 research outputs found

    "A Framework for Descriptive Grammars"

    Book Reviews

    Reordering of Source Side for a Factored English to Manipuri SMT System

    Similar language pairs with massive parallel corpora are readily handled by large-scale systems using either Statistical Machine Translation (SMT) or Neural Machine Translation (NMT). Translation between low-resource language pairs with linguistic divergence has always been a challenge. We consider one such pair, English-Manipuri, which shows linguistic divergence and belongs to the low-resource category. For such language pairs, SMT is generally better regarded than NMT. However, SMT’s more prominent phrase-based model translates groupings of surface word forms treated as phrases. Without any linguistic knowledge, it therefore fails to learn a proper mapping between the source and target language symbols. Our model adopts a factored model of SMT (FSMT3*) with a part-of-speech (POS) tag as a factor to incorporate linguistic information about the languages, followed by hand-coded reordering. Reordering the source sentences makes them more similar to the target language, allowing a better mapping between source and target symbols. The reordering also converts long-distance reordering problems into monotone reordering, which SMT models handle better, thereby reducing the load at decoding time. Additionally, we find that adding POS feature data improves the system’s precision. Experimental results using automatic evaluation metrics show that our model improves over phrase-based and other factored models that use the lexicalised Moses reordering options. Our FSMT3* model increases the automatic scores of the translation results over the factored model with lexicalised phrase reordering (FSMT2) by 11.05% (Bilingual Evaluation Understudy), 5.46% (F1), 9.35% (Precision), and 2.56% (Recall), respectively.

    What we learn about language from Spoken Corpus Linguistics?

    Over the last few decades, Spoken Corpus Linguistics (SCL) has achieved a great deal in terms of the quantity and quality of work (O’Keeffe, McCarthy 2010). Enormous progress has been made in the last thirty years, and the growth of multimodal corpora stimulates sophisticated investigations into the relationship between the verbal and non-verbal components of spoken communication (Knight 2011). SCL is a vital field of research, able to provide essential data and tools for the advancement of language knowledge. In this article I focus on the contribution that SCL and the resulting data make to general linguistics. In § 2, I discuss the contribution that SCL makes to a better understanding of linguistic variation; in § 3, I show how SCL can improve the descriptive adequacy of grammar; finally, § 4 is dedicated to the contribution that speech data can make to a better knowledge of the grammaticality of languages. Throughout the article I mainly use data from Italian corpora, widely validated by comparison with data from corpora of other languages.

    The Effect of a Metalinguistic Approach to Sentence Combining on Written Expression in Eighth Grade Science for Students who Struggle with Literacy

    Recent data indicate that less than 50% of American secondary students are able to demonstrate grade-level proficiency in reading, writing, and science (National Center for Educational Statistics [NCES], 2007, 2011, 2012a, 2012b). Secondary students are expected to develop advanced literacy skills, especially in writing, in order to be ready for college and careers. They are expected to develop these skills within all academic subjects; in other words, they are expected to develop disciplinary literacy skills. The statistics are alarming overall, but they are particularly alarming in the area of science. Students need strong literacy skills, including written expression, to be prepared for employment opportunities in science fields, which currently are being filled by graduates of other industrialized nations who have a more advanced skill set. This loss of occupational opportunity threatens the ability of the U.S. to remain globally competitive in science innovation and advancement, which ultimately secures economic prosperity. Despite these concerns, little research has evaluated effective instructional methods for developing complex writing skills in academic disciplines such as science. To address this critical issue, the present study examined the effects of a metalinguistic approach to the writing intervention of sentence combining with eighth-grade students who struggle with literacy. The researcher conducted the study in a typical science classroom in an urban American school setting. The focus of the intervention was to increase students' metalinguistic awareness of science text and to improve written sentence complexity in science, as well as the written expression and determination of comparison and contrast of science content. The study employed a quasi-experimental design. The participants consisted of an experimental group (two classes), who received the treatment during typical science instruction, and a comparison group (three classes), who did not receive the treatment but participated in their typical science instruction. There were four participating teachers and 84 participating students. The researcher conducted the study over a period of seven weeks within regularly scheduled science classes. Twenty intervention sessions of 20 minutes each were conducted, totaling 400 minutes (approximately 6.7 hours). Hierarchical repeated measures ANOVA and hierarchical repeated measures MANOVA analyses revealed that the experimental group performed significantly better than the comparison group on their ability to determine similarities and differences (compare and contrast) related to science content, with a medium effect. The experimental group achieved a slightly higher marginal mean than the comparison group on their ability to combine sentences, with a small effect. Multiple statistical analyses revealed a trend of higher marginal means in favor of the experimental group over the comparison group on several measures of written sentence complexity on both the science compare and contrast writing prompt (small to medium effect) and the science expository essay (medium to large effect). One experimental class also demonstrated higher scores in overall sentence correctness on the science expository essay compared with all the other classes. These findings suggest that sentence combining, utilizing a metalinguistic approach, may hold promise as an effective writing intervention in a content area classroom for secondary students who struggle with literacy. Furthermore, the findings suggest that a metalinguistic approach to sentence combining can be successfully embedded within a content area class, which may result in increased concept knowledge and writing skills in that academic discipline. Implications for practice and future research directions are discussed.