    Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation

    A type description is a succinct noun compound that helps humans and machines quickly grasp the informative and distinctive information of an entity. Entities in most knowledge graphs (KGs) still lack such descriptions, which calls for automatic methods to supply them. However, existing generative methods either overlook grammatical structure or make factual mistakes in the generated text. To solve these problems, we propose a head-modifier template-based method that ensures the readability and data fidelity of generated type descriptions. We also propose a new dataset and two automatic metrics for this task. Experiments show that our method improves substantially over the baselines and achieves state-of-the-art performance on both datasets. Comment: ACL 201

    GLIMPSED: Improving natural language processing with gaze data

    The Effect of Text Authenticity on the Performance of Iranian EFL Students in a C-Test

    As part of growing efforts to understand the factors affecting C-Test performance, this study investigates the effect of text authenticity on the performance of Iranian EFL students in a C-Test. The C-Test is an integrative testing instrument that measures overall language competence, very much like the cloze test. In this study the rule of two has been applied: "the second half of every second word has been deleted, beginning with the second word of the second sentence; the first and last sentences are left intact" (Katona and Dornyei 1993: 35). The research involves 60 college students in their third year, majoring in English Literature at Ershad-Damavand College. The group was randomly selected using multi-stage sampling. Since the study set out to investigate the role of two different formats, i.e. authentic and inauthentic texts (the latter being text translated from Persian into English), two tailored C-Tests were constructed to measure and compare the performance of the participants. The two C-Tests, one built from an authentic text and the other from an inauthentic text, were administered to this homogenized group comprising 30 subjects. The findings suggest that authenticity affects learners' performance in C-Tests and that this variable should be controlled when devising a C-Test.
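
The rule of two quoted in the abstract above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study. The function names `mutilate` and `make_c_test` are hypothetical, and several details are assumptions that the abstract does not specify: sentences are split on `.`, `!` or `?`, the word count runs continuously across the middle sentences, odd-length words keep the shorter first half, and one-letter or hyphenated words are left intact.

```python
import re


def mutilate(word: str) -> str:
    """Keep the first half of a word and blank out the rest with underscores.

    Assumption: for odd-length words the larger half is deleted, and
    one-letter words are returned unchanged.
    """
    letters = len(word)
    if letters < 2:
        return word
    keep = letters // 2
    return word[:keep] + "_" * (letters - keep)


def make_c_test(text: str) -> str:
    """Apply the rule of two: damage every second word, starting with the
    second word of the second sentence; first and last sentences stay intact."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    if len(sentences) < 3:
        # Nothing lies between the first and last sentence, so nothing changes.
        return text.strip()

    out = [sentences[0]]
    position = 0  # assumption: running word count across the middle sentences
    for sentence in sentences[1:-1]:
        tokens = []
        for token in sentence.split():
            position += 1
            if position % 2 == 0:
                # Damage only the word itself, keeping surrounding punctuation.
                # Tokens with internal punctuation (e.g. hyphens) do not match
                # and are left intact.
                match = re.match(r"^(\W*)(\w+)(\W*)$", token)
                if match:
                    lead, core, trail = match.groups()
                    token = lead + mutilate(core) + trail
            tokens.append(token)
        out.append(" ".join(tokens))
    out.append(sentences[-1])
    return " ".join(out)


if __name__ == "__main__":
    sample = "The cat sat. A small dog barked loudly today. The end came."
    print(make_c_test(sample))
    # prints: The cat sat. A sm___ dog bar___ loudly to___. The end came.
```

A real test-construction tool would need a proper sentence tokenizer and an explicit policy for numbers, proper nouns and very short words, all of which are outside what the abstract describes.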

    Using distributional similarity to organise biomedical terminology

    We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of different measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy.
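
As a rough illustration of what a distributional similarity measure computes, the sketch below compares terms by the cosine of their context-count vectors. Cosine is only one common choice and is an assumption here: the abstract does not say which measures were used. The terms, the `(relation, word)` contexts and the counts are invented toy data, not taken from the GENIA corpus or from Pro3Gres output.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse context-count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: each term maps to counts of the syntactic
# contexts (relation, word) in which it was observed.
contexts = {
    "IL-2": Counter({("obj_of", "activate"): 3, ("subj_of", "bind"): 1}),
    "IL-4": Counter({("obj_of", "activate"): 2, ("subj_of", "bind"): 2}),
    "T cell": Counter({("subj_of", "proliferate"): 4}),
}

if __name__ == "__main__":
    for other in ("IL-4", "T cell"):
        score = cosine(contexts["IL-2"], contexts[other])
        print(f"IL-2 vs {other}: {score:.3f}")
```

Terms that share many contexts score close to 1, while terms with no shared contexts score 0, which is the intuition behind using such scores to group terms by semantic type.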

    Text reading in English as a second language: Evidence from the Multilingual Eye-Movements Corpus

    Research into second language (L2) reading is an exponentially growing field. Yet, it still has a relatively short supply of comparable, ecologically valid data from readers representing a variety of first languages (L1). This article addresses this need by presenting a new data resource called MECO L2 (Multilingual Eye Movements Corpus), a rich behavioral eye-tracking record of text reading in English as an L2 among 543 university student speakers of 12 different L1s. MECO L2 includes a test battery of component skills of reading and allows for a comparison of the participants’ reading performance in their L1 and L2. This data resource enables innovative large-scale cross-sample analyses of predictors of L2 reading fluency and comprehension. We first introduce the design and structure of the MECO L2 resource, along with reliability estimates and basic descriptive analyses. Then, we illustrate the utility of MECO L2 by quantifying contributions of four sources to variability in L2 reading proficiency proposed in prior literature: reading fluency and comprehension in L1, proficiency in L2 component skills of reading, extralinguistic factors, and the L1 of the readers. Major findings included (a) a fundamental contrast between the determinants of L2 reading fluency versus comprehension accuracy, and (b) high within-participant consistency in the real-time strategy of reading in L1 and L2. We conclude by reviewing the implications of these findings for theories of L2 acquisition and outline further directions in which the new data resource may support L2 reading research. This article is published in Studies in Second Language Acquisition, 45(1), 3-37.

    An investigative study of English vocabulary acquisition patterns in adult L2 tertiary learners with Chinese/Malay L1

    This study investigates patterns of second language (L2) learners’ vocabulary acquisition of English in pedagogical contexts, and develops a vocabulary acquisition model, specifically a pre-receptive to productive vocabulary (PR-PV) model, which analyses the patterns of inferencing strategies, the role of context in those strategies, and the influence of teaching explicit strategies on vocabulary development. Research in the area of vocabulary development is unclear on the interrelationships among various aspects of lexical competence, learning, and production processes in second language lexical acquisition. Models of vocabulary acquisition in English as a second language are scarce, and this lack often prompts L2 researchers to draw on first language vocabulary study models to correlate vocabulary developmental patterns. Research is also uncertain about how L2 learners respond to reading texts; however, it is quite clear that the receptive vocabulary of L2 learners is larger than their productive vocabulary. The study employed a mixed-method research approach, and the findings suggest that both content and context play significant roles in the extent to which L2 learners interact efficiently with reading texts. The findings may have pedagogical and theoretical implications for curriculum developers, instructors and policy makers in second language tertiary English learning contexts.