1,025 research outputs found
Lexical typology : a programmatic sketch
The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar.
D5.3 Overview of Online Tutorials and Instruction Manuals
UIDB/03213/2020
UIDP/03213/2020
The ELEXIS Curriculum is an integrated set of training materials which contextualizes ELEXIS tools and services inside a broader, systematic pedagogic narrative. This means that the goal of the ELEXIS Curriculum is not simply to inform users about the functionalities of particular tools and services developed within the project, but to show how such tools and services are a) embedded in both lexicographic theory and practice; and b) representative of and contributing to the development of digital skills among lexicographers. The scope and rationale of the curriculum are described in more detail in the Deliverable D5.2 Guidelines for Producing ELEXIS Tutorials and Instruction Manuals. The goal of this deliverable, as stated in the project DOW, is to provide “a clear, structured overview of tutorials and instruction manuals developed within the project.”
The EAGLES/ISLE initiative for setting standards: the Computational Lexicon Working Group for Multilingual Lexicons
ISLE (International Standards for Language Engineering), a transatlantic standards oriented initiative under the Human Language Technology (HLT) programme, is a continuation of the long standing EAGLES (Expert Advisory Group for Language Engineering Standards) initiative, carried out by European and American groups within the EU-US International Research Co-operation, supported by NSF and EC. The objective is to support HLT R&D international and national projects, and HLT industry, by developing and promoting widely agreed and urgently demanded HLT standards and guidelines for infrastructural language resources, tools, and HLT products. ISLE targets the areas of multilingual computational lexicons (MCL), natural interaction and multimodality (NIMM), and evaluation. For MCL, ISLE is working to: extend EAGLES work on lexical semantics, necessary to establish inter-language links; design standards for multilingual lexicons; develop a prototype tool to implement lexicon guidelines; create EAGLES-conformant sample lexicons and tag corpora for validation purposes; develop standardised evaluation procedures for lexicons. For NIMM, a rapidly innovating domain urgently requiring early standardisation, ISLE work is targeted to develop guidelines for: creation of NIMM data resources; interpretative annotation of NIMM data, including spoken dialogue; annotation of discourse phenomena. For evaluation, ISLE is working on: quality models for machine translation systems; maintenance of previous guidelines - in an ISO based framework. We concentrate in the paper on the Computational Lexicon Working Group, describing in detail the proposals of guidelines for the "Multilingual ISLE Lexical Entry" (MILE). We highlight some methodological principles applied in previous EAGLES, and followed in defining MILE. We also provide a description of the EU SIMPLE semantic lexicons built on the basis of previous EAGLES recommendations. 
Their importance lies in the fact that these lexicons are now being enlarged to real-size lexicons within National Projects in 8 EU countries, thus building a large infrastructural platform of harmonised lexicons in Europe. We will also stress the relevance of standardised language resources for humanities applications. Numerous theories, approaches and systems are taken into account in ISLE, as any recommendation for harmonisation must build on the major contemporary approaches. Results will be widely disseminated, after validation in collaboration with EU and US HLT R&D projects and industry. EAGLES work towards de facto standards has already allowed the field of Language Resources to establish broad consensus on key issues for some well-established areas, and will allow similar consensus to be achieved for other important areas through the ISLE project, thus providing a key opportunity for further consolidation and a basis for technological advance. Previous EAGLES results in many areas have in fact already become widely adopted de facto standards, and EAGLES itself is a well-known trademark and a point of reference for HLT projects.
Hosted by the Scholarly Text and Imaging Service (SETIS), the University of Sydney Library, and the Research Institute for Humanities and Social Sciences (RIHSS), the University of Sydney
Angļu-latviešu leksikogrāfiskās tradīcijas kritiska analīze (A critical analysis of the English-Latvian lexicographic tradition)
The English-Latvian lexicographic tradition began in 1924 with the publication of the first English-Latvian dictionary; approximately twenty-eight dictionaries of various sizes and structural complexity have been compiled since then. At present, English-Latvian lexicography is governed by a stable and well-established tradition that determines the features of the dictionaries' mega-, macro- and microstructure. However, even though the volume of lexicographic material is ample, the dictionaries are often compiled using obsolete methods and outdated lexicographic evidence.
The aim of the study is to review the stages of development of the English-Latvian lexicographic tradition, considering the various extra-linguistic factors that have influenced it; to single out the typical features of English-Latvian dictionaries at the levels of their mega-, macro- and microstructure, as traced throughout the tradition; to pinpoint the problematic aspects of English-Latvian lexicography; and to offer theoretically grounded solutions, applied in lexicographic practice worldwide, for improving the quality of English-Latvian dictionaries.
Collocations in Portuguese: A corpus-based approach to lexical patterns
Collocations and, more generally, multiword expressions have been extensively studied for the English language, and a large set of resources is available in terms of linguistic description and tools for language learning. By contrast, combinatorial resources for Portuguese are scarce, although specific types of collocations, such as light verb constructions, nominal compounds and proverbs, have been the topic of many studies. This chapter reviews different theoretical perspectives on multiword expressions and collocations in Portuguese and presents in more detail the results of the COMBINA-PT project, a corpus-based approach to the study of collocations.
Corpus-based translation research: its development and implications for general, literary and Bible translation
Corpus-based translation research emerged in the late 1990s as a new area of research in the discipline of translation studies. It is informed by a specific area of linguistics known as corpus linguistics which involves the analysis of large corpora of authentic running text by means of computer software. Within linguistics, this methodology has revolutionised lexicographic practices and methods of language teaching. In translation studies this kind of research involves using computerised corpora to study translated text, not in terms of its equivalence to source texts, but as a valid object of study in its own right. Corpus-based research in translation is concerned with revealing both the universal and the specific features of translation, through the interplay of theoretical constructs and hypotheses, variety of data, novel descriptive categories and a rigorous, flexible methodology, which can be applied to inductive and deductive research, as well as product- and process-oriented studies. In this article an overview is given of the research that has led to the formation of a new subdiscipline in translation studies, called Corpus-based Translation Studies or CTS. I also demonstrate how CTS tools and techniques can be used for the analysis of general and literary translations and therefore also for Bible translations.
(Acta Theologica, Supplementum 2, 2002: 70-106)
Litavsko-engleska terminološka baza kibernetičke sigurnosti: načela strukturiranja i prikupljanja podataka (Lithuanian-English cybersecurity termbase: principles of data structuring and collection)
The aim of the paper is to present the compilation and structuring principles, scope and development possibilities of the bilingual Lithuanian-English cybersecurity termbase. The paper discusses different approaches to terminology management, the best practices of which have been used to collect cybersecurity terminology and compile the termbase. Data collection has been based mainly on semasiological and corpus-driven approaches involving the creation of deep learning systems trained to extract terminology from the cybersecurity corpora. To achieve systematicity and comprehensiveness of the dataset, the onomasiological and corpus-based approaches have also been incorporated in the data collection process. The termbase design decisions (its macrostructure and microstructure) have been based on onomasiological principles, while term variation has been handled by applying the descriptive approach. The termbase has been developed in the open-source cloud-based terminology management platform Terminologue. To ensure interoperability, the termbase has been exported into the TBX format and deposited in the CLARIN-LT repository. The paper also discusses possibilities of publishing terminological data as linguistic linked open data and linking it with other terminological resources and cybersecurity ontologies. The termbase is expected to be useful for cybersecurity specialists, translators, terminographers, lexicographers and the general public, as well as to contribute to the development of Lithuanian cybersecurity terminology.
On semantic differences: a multivariate corpus-based study of the semantic field of inchoativity in translated and non-translated Dutch
This dissertation places the study of semantic differences in translation compared to non-translation at the centre of its concerns. To date, much research in Corpus-based Translation Studies has focused on lexical and grammatical phenomena in an attempt to reveal presumed general tendencies of translation. On the semantic level, these general tendencies have rarely been investigated. Therefore, the goal of this study is to explore whether universal tendencies of translation also exist on the semantic level, thereby connecting the framework of translation universals to semantics.