1,025 research outputs found

    Lexical typology : a programmatic sketch

    The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar.

    D5.3 Overview of Online Tutorials and Instruction Manuals

    UIDB/03213/2020 UIDP/03213/2020
    The ELEXIS Curriculum is an integrated set of training materials which contextualizes ELEXIS tools and services inside a broader, systematic pedagogic narrative. This means that the goal of the ELEXIS Curriculum is not simply to inform users about the functionalities of particular tools and services developed within the project, but to show how such tools and services are a) embedded in both lexicographic theory and practice; and b) representative of and contributing to the development of digital skills among lexicographers. The scope and rationale of the curriculum are described in more detail in the Deliverable D5.2 Guidelines for Producing ELEXIS Tutorials and Instruction Manuals. The goal of this deliverable, as stated in the project DOW, is to provide “a clear, structured overview of tutorials and instruction manuals developed within the project.”

    The EAGLES/ISLE initiative for setting standards: the Computational Lexicon Working Group for Multilingual Lexicons

    ISLE (International Standards for Language Engineering), a transatlantic standards oriented initiative under the Human Language Technology (HLT) programme, is a continuation of the long standing EAGLES (Expert Advisory Group for Language Engineering Standards) initiative, carried out by European and American groups within the EU-US International Research Co-operation, supported by NSF and EC. The objective is to support HLT R&D international and national projects, and HLT industry, by developing and promoting widely agreed and urgently demanded HLT standards and guidelines for infrastructural language resources, tools, and HLT products. ISLE targets the areas of multilingual computational lexicons (MCL), natural interaction and multimodality (NIMM), and evaluation. For MCL, ISLE is working to: extend EAGLES work on lexical semantics, necessary to establish inter-language links; design standards for multilingual lexicons; develop a prototype tool to implement lexicon guidelines; create EAGLES-conformant sample lexicons and tag corpora for validation purposes; develop standardised evaluation procedures for lexicons. For NIMM, a rapidly innovating domain urgently requiring early standardisation, ISLE work is targeted to develop guidelines for: creation of NIMM data resources; interpretative annotation of NIMM data, including spoken dialogue; annotation of discourse phenomena. For evaluation, ISLE is working on: quality models for machine translation systems; maintenance of previous guidelines - in an ISO based framework. We concentrate in the paper on the Computational Lexicon Working Group, describing in detail the proposals of guidelines for the "Multilingual ISLE Lexical Entry" (MILE). We highlight some methodological principles applied in previous EAGLES, and followed in defining MILE. We also provide a description of the EU SIMPLE semantic lexicons built on the basis of previous EAGLES recommendations. 
Their importance is given by the fact that these lexicons are now enlarged to real-size lexicons within National Projects in 8 EU countries, thus building a really large infrastructural platform of harmonised lexicons in Europe. We will stress the relevance of standardised language resources also for the humanities applications. Numerous theories, approaches, systems are taken into account in ISLE, as any recommendation for harmonisation must build on the major contemporary approaches. Results will be widely disseminated, after validation in collaboration with EU and US HLT R&D projects, and industry. EAGLES work towards de facto standards has already allowed the field of Language Resources to establish broad consensus on key issues for some well-established areas - and will allow similar consensus to be achieved for other important areas through the ISLE project - providing thus a key opportunity for further consolidation and a basis for technological advance. EAGLES previous results in many areas have in fact already become de facto widely adopted standards, and EAGLES itself is a well-known trademark and a point of reference for HLT projects. Hosted by the Scholarly Text and Imaging Service (SETIS), the University of Sydney Library, and the Research Institute for Humanities and Social Sciences (RIHSS), the University of Sydney.

    Angļu-latviešu leksikogrāfiskās tradīcijas kritiska analīze

    The English-Latvian lexicographic tradition starts in 1924 with the publication of the first English-Latvian dictionary; approximately twenty-eight dictionaries of various sizes and structural complexity have been compiled in the course of the tradition. At present, English-Latvian lexicography is ruled by a stable and well-established tradition, which determines the features of the dictionaries' mega-, macro- and microstructure. However, even though the volume of the lexicographic material is ample, the dictionaries are often compiled using obsolete methods and outdated lexicographic evidence. The aim of the study is to review the stages of development of the English-Latvian lexicographic tradition, considering the various extra-linguistic factors that have influenced it; to single out the typical features of English-Latvian dictionaries traced throughout the tradition at the levels of their mega-, macro- and microstructure; to pinpoint the problematic aspects of English-Latvian lexicography; and to offer theoretically grounded solutions, applied in international lexicographic practice, for improving the quality of English-Latvian dictionaries.

    Collocations in Portuguese: A corpus-based approach to lexical patterns

    Collocations and, more generally, multiword expressions have been extensively studied for the English language, and a large set of resources is available in terms of linguistic description and tools for language learning. By contrast, combinatorial resources for Portuguese are scarce, although specific types of collocations, such as light verb constructions, nominal compounds and proverbs, have been the topic of many studies. This chapter reviews different theoretical perspectives on multiword expressions and collocations in Portuguese and presents in more detail the results of the COMBINA-PT project, a corpus-based approach to the study of collocations.

    Corpus-based translation research: its development and implications for general, literary and Bible translation

    Corpus-based translation research emerged in the late 1990s as a new area of research in the discipline of translation studies. It is informed by a specific area of linguistics known as corpus linguistics which involves the analysis of large corpora of authentic running text by means of computer software. Within linguistics, this methodology has revolutionised lexicographic practices and methods of language teaching. In translation studies this kind of research involves using computerised corpora to study translated text, not in terms of its equivalence to source texts, but as a valid object of study in its own right. Corpus-based research in translation is concerned with revealing both the universal and the specific features of translation, through the interplay of theoretical constructs and hypotheses, variety of data, novel descriptive categories and a rigorous, flexible methodology, which can be applied to inductive and deductive research, as well as product- and process-oriented studies. In this article an overview is given of the research that has led to the formation of a new subdiscipline in translation studies, called Corpus-based Translation Studies or CTS. I also demonstrate how CTS tools and techniques can be used for the analysis of general and literary translations and therefore also for Bible translations. (Acta Theologica, Supplementum 2, 2002: 70-106)

    Litavsko-engleska terminološka baza kibernetičke sigurnosti: načela strukturiranja i prikupljanja podataka

    The aim of the paper is to present compilation and structuring principles, scope and development possibilities of the bilingual Lithuanian-English cybersecurity termbase. The paper discusses different approaches to terminology management, the best practices of which have been used to collect cybersecurity terminology and compile the termbase. Data collection has been mainly based on semasiological and corpus-driven approaches involving creation of deep learning systems trained to extract terminology from the cybersecurity corpora. To achieve systematicity and comprehensiveness of the dataset, the onomasiological and corpus-based approaches have also been incorporated in the data collection process. The termbase design decisions (its macrostructure and microstructure) have been based on onomasiological principles, while term variation has been handled by applying the descriptive approach. The termbase has been developed in the open-source cloud-based terminological management platform Terminologue. To ensure interoperability, the termbase has been exported into the TBX format and deposited into the CLARIN-LT repository. The paper also discusses possibilities of publishing terminological data as linguistic linked open data and linking it with other terminological resources and cybersecurity ontologies. The termbase is expected to be useful for cybersecurity specialists, translators, terminographers, lexicographers and the general public, as well as to contribute to the development of the Lithuanian cybersecurity terminology.

    On semantic differences: a multivariate corpus-based study of the semantic field of inchoativity in translated and non-translated Dutch

    This dissertation places the study of semantic differences in translation compared to non-translation at the centre of its concerns. To date, much research in Corpus-based Translation Studies has focused on lexical and grammatical phenomena in an attempt to reveal presumed general tendencies of translation. On the semantic level, these general tendencies have rarely been investigated. Therefore, the goal of this study is to explore whether universal tendencies of translation also exist on the semantic level, thereby connecting the framework of translation universals to semantics.