    Creating Lexical Resources in TEI P5 : a Schema for Multi-purpose Digital Dictionaries

    Although most of the relevant dictionary productions of the recent past have relied on digital data and methods, there is little consensus on formats and standards. The Institute for Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences has been conducting a number of varied lexicographic projects, both digitising print dictionaries and working on the creation of genuinely digital lexicographic data. This data was designed to serve varying purposes: machine-readability was only one. A second goal was interoperability with digital NLP tools. To achieve this end, a uniform encoding system applicable across all the projects was developed. The paper describes the constraints imposed on the content models of the various elements of the TEI dictionary module and provides arguments in favour of TEI P5 as an encoding system suited not only to representing digitised print dictionaries but also to NLP purposes.

    The construction of a linguistic linked data framework for bilingual lexicographic resources

    Little-known lexicographic resources can be of tremendous value to users once digitised. By extending the digitisation efforts for a lexicographic resource, converting the human-readable digital object to a state that is also machine-readable, structured data can be created that is semantically interoperable, thereby enabling the lexicographic resource to access, and be accessed by, other semantically interoperable resources. The purpose of this study is to formulate a process for converting a lexicographic resource in print form to a machine-readable bilingual lexicographic resource by applying linguistic linked data principles, using the English-Xhosa Dictionary for Nurses as a case study. This is accomplished by creating a linked data framework, in which data are expressed in the form of RDF triples and URIs, in a manner which allows for extensibility to a multilingual resource. Click languages with characters not typically represented by the Roman alphabet are also considered. The purpose of this linked data framework is to define each lexical entry as “historically dynamic”, instead of “ontologically static” (Rafferty, 2016:5). For a framework whose instances are in constant evolution, focus is thus given to the management of provenance and the generation of linked data from it. The output is an implementation framework which provides methodological guidelines for similar language resources in the interdisciplinary field of Library and Information Science.
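
As a rough illustration of what "data expressed in the form of RDF triples and URIs" can look like for a bilingual entry, the sketch below builds a few triples in plain Python and prints them as N-Triples. It is not the framework described in the abstract: the `http://example.org/` base URI, the `translation` predicate, and the word pair water/amanzi are assumptions chosen for illustration, and only `rdfs:label` is a standard vocabulary term.

```python
from typing import NamedTuple


class URI(NamedTuple):
    value: str


class Literal(NamedTuple):
    text: str
    lang: str  # BCP 47 language tag, e.g. "en" or "xh"


# Hypothetical namespace for this sketch; not taken from the study.
BASE = "http://example.org/lexicon/"
RDFS_LABEL = URI("http://www.w3.org/2000/01/rdf-schema#label")
TRANSLATION = URI(BASE + "translation")  # illustrative predicate only


def term(t):
    """Render one URI or language-tagged literal in N-Triples syntax."""
    if isinstance(t, URI):
        return f"<{t.value}>"
    escaped = t.text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{t.lang}'


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


# Each lexical entry gets its own URI, so other datasets can link to it.
entry_en = URI(BASE + "en/water")
entry_xh = URI(BASE + "xh/amanzi")

triples = [
    (entry_en, RDFS_LABEL, Literal("water", "en")),
    (entry_xh, RDFS_LABEL, Literal("amanzi", "xh")),
    (entry_en, TRANSLATION, entry_xh),
]

print(to_ntriples(triples))
```

Because every entry is identified by a URI rather than by its position in a file, adding a third language under this assumed layout would only mean adding more entries and more `translation` links, which is the kind of extensibility the abstract refers to.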

    A Comparative Study on the Effects of Using of TPRC and PLAN Strategies on Students’ Reading Comprehension at Language Development Center of UIN Suska Riau.

    The main objective of this study was to compare the ability of the students to understand texts by using TPRC strategy and PLAN strategy. The research was comparative experimental with quantitative method. The subject of the research was the students majoring in Accounting Level I at Language Development Center of UIN SUSKA Riau in the academic year of 2016/2017. The object of the research was the comparison of students' ability to understand English texts using TPRC strategy and PLAN strategy. Cluster sampling technique was employed to determine the sample of the study. Two classes consisting of 50 students were taken as the sample, which was divided into experimental group 1 and experimental group 2. A pre-test and a post-test were given to both experimental group 1 and experimental group 2. Independent t-test and paired sample t-test were used to analyze the data. The results showed that there was a significant difference in students' reading post-test scores between experimental group 1, taught using TPRC, and experimental group 2, taught using PLAN. The t-test yielded t = 2.91 with df = 48; the SD of experimental group 1 was 10.57 and that of experimental group 2 was 7.88. The two-tailed p-value was 0.005, which is less than 0.05 (p < 0.05). Therefore, the null hypothesis was rejected and the alternative hypothesis was accepted. It was also found that TPRC reading strategy contributed to the enhancement of students' reading comprehension by 32% and PLAN contributed as much as 58%. Therefore, it could be concluded that PLAN strategy gave better results than TPRC strategy.
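
The reported test statistic can be sanity-checked from the two numbers the abstract gives, t = 2.91 and df = 48. The sketch below is an independent check, not the authors' analysis: it integrates the Student t density numerically with Simpson's rule (standard library only) to obtain a two-tailed p-value, which should land close to the reported 0.005. The function names and the step count are choices made for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 - P(-|t| <= T <= |t|), via Simpson's rule.

    steps must be even for Simpson's rule.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t|)
    return 1 - 2 * central_half


p = two_tailed_p(2.91, 48)
print(f"two-tailed p for t = 2.91, df = 48: {p:.4f}")
print("significant at the 0.05 level:", p < 0.05)
```

The degrees of freedom are consistent with an independent-samples test on 50 students in two groups (50 − 2 = 48).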

    Getting More out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics.

    This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most widely used systems of its type with yearly download rates of tens of thousands and many active users in both academic and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life sciences and in medicine. First, in genome-wide association studies which have contributed to discovery of a head and neck cancer mutation association. Second, medical records analysis which has significantly increased the statistical power of treatment/outcome models in the UK’s largest psychiatric patient cohort. Third, richer constructs in drug-related searching. We also explore the ways in which the GATE family supports the various stages of the lifecycle present in our examples. We conclude that the deployment of text mining for document abstraction or rich search and navigation is best thought of as a process, and that with the right computational tools and data collection strategies this process can be made defined and repeatable. The GATE research programme is now 20 years old and has grown from its roots as a specialist development tool for text processing to become a rather comprehensive ecosystem, bringing together software developers, language engineers and research staff from diverse fields. GATE now has a strong claim to cover a uniquely wide range of the lifecycle of text analysis systems. It forms a focal point for the integration and reuse of advances that have been made by many people (the majority outside of the authors’ own group) who work in text processing for biomedicine and other areas. GATE is available online under GNU open source licences and runs on all major operating systems. Support is available from an active user and developer community and also on a commercial basis.