162 research outputs found

    TBX en SDL MultiTerm

    Terminological data can be exchanged between terminology management tools through exchange formats such as TermBase eXchange (TBX, ISO 30042). This article examines whether the recently launched SDL MultiTerm 2014 complies with the TBX standard and proposes alternative conversion routines for cases of non-compliance.

    TBX-Basic Translation-oriented Terminology Made Simple

    Translators who want to go beyond two-column glossaries and learn about terminology exchange without seeing any XML will find this article useful. It describes a tabular format that represents the content of TBX-Basic files. Information in this format can be converted to TBX-Basic and, if desired, further processed into another format. Key words: translation, terminology exchange, terminology interchange, XML, TBX, TBX-Basic, table, spreadsheet.
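    As a rough illustration of the table-to-TBX idea in this abstract, the sketch below turns a two-column table into a simplified TBX-style tree. It is not the article's tabular format or converter: the column layout (one column per language code), the function name `table_to_tbx`, and the sample rows are assumptions made for this example. The element names (`martif`, `termEntry`, `langSet`, `tig`, `term`) follow the 2008-era TBX layout, and the mandatory header is omitted, so the output is not a valid TBX-Basic file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree serialises this qualified name as the attribute "xml:lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def table_to_tbx(rows, languages):
    """Build a simplified TBX-style tree from rows mapping language code -> term.

    Hypothetical helper: the header element and all descriptive fields are left out.
    """
    martif = ET.Element("martif", {"type": "TBX-Basic", XML_LANG: languages[0]})
    body = ET.SubElement(ET.SubElement(martif, "text"), "body")
    for index, row in enumerate(rows, start=1):
        # One table row becomes one concept-level entry.
        entry = ET.SubElement(body, "termEntry", {"id": f"c{index}"})
        for lang in languages:
            term_text = (row.get(lang) or "").strip()
            if not term_text:
                continue  # skip empty cells instead of emitting empty terms
            lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})
            tig = ET.SubElement(lang_set, "tig")
            ET.SubElement(tig, "term").text = term_text
    return martif


# Made-up sample table: the header row holds language codes, one concept per row.
SAMPLE = "en,es\ntermbase,base de datos terminológica\nterm entry,\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
root = table_to_tbx(rows, ["en", "es"])
tbx_xml = ET.tostring(root, encoding="unicode")
print(tbx_xml)
```

    Keeping one row per concept rather than one row per term mirrors the concept-oriented structure of a termbase, which is the main difference from a plain two-column glossary.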

    Multilingual Information Framework for Handling textual data in Digital Media

    This document presents MLIF (Multi Lingual Information Framework), a high-level model for describing multilingual data across a wide range of possible applications in the translation/localization process within several multimedia domains (e.g. broadcasting interactive programs within a multilingual community)

    Litavsko-engleska terminološka baza kibernetičke sigurnosti: načela strukturiranja i prikupljanja podataka

    The aim of the paper is to present the compilation and structuring principles, scope and development possibilities of the bilingual Lithuanian-English cybersecurity termbase. The paper discusses different approaches to terminology management, the best practices of which have been used to collect cybersecurity terminology and compile the termbase. Data collection has been mainly based on semasiological and corpus-driven approaches involving the creation of deep learning systems trained to extract terminology from cybersecurity corpora. To achieve systematicity and comprehensiveness of the dataset, the onomasiological and corpus-based approaches have also been incorporated in the data collection process. The termbase design decisions (its macrostructure and microstructure) have been based on onomasiological principles, while term variation has been handled by applying the descriptive approach. The termbase has been developed in the open-source cloud-based terminology management platform Terminologue. To ensure interoperability, the termbase has been exported into the TBX format and deposited in the CLARIN-LT repository. The paper also discusses possibilities of publishing terminological data as linguistic linked open data and linking it with other terminological resources and cybersecurity ontologies. The termbase is expected to be useful for cybersecurity specialists, translators, terminographers, lexicographers and the general public, as well as to contribute to the development of Lithuanian cybersecurity terminology.

    TBX goes TEI -- Implementing a TBX basic extension for the Text Encoding Initiative guidelines

    This paper presents an attempt to customise the TEI (Text Encoding Initiative) guidelines so that TBX (TermBase eXchange) based terminological entries can be incorporated within any kind of TEI document. After presenting the general historical, conceptual and technical contexts, we describe the various design choices we had to make while creating this customisation, which in turn led to various changes in the actual TBX serialisation. Keeping in mind the objective of providing the TEI guidelines with, again, an onomasiological model, we try to identify the best compromise in maintaining both the isomorphism with the existing TBX Basic standard and the characteristics of the TEI framework.

    The conversion of ‘Histoloogiasõnastik’ to TBX

    TBX (TermBase eXchange) is an XML-based standard for representing and exchanging terminological data in various computer environments. The objective of this thesis is to convert ‘Histoloogiasõnastik’ (a dictionary of histology, created by Ülo Hussar) to a valid TBX document. The TBX format makes it easier to transform terminological data into various representation forms, an HTML glossary for instance. Another objective of the thesis was to generate cross-references between entries of ‘Histoloogiasõnastik’, which would make the dictionary more convenient for the end user. A TBX document is a termbase (terminological database). A termbase should be designed according to a particular model that allows it to be converted to different formats and prevents systematic errors during the creation of the database. ‘Histoloogiasõnastik’ reflects this model, which makes it possible to convert it to TBX. The original data are in TeX format, with entries in four languages (Estonian, English, German and Russian) plus Latin terms. The entries are rather simple in structure but contain many variations and exceptions. The conversion followed an evolutionary prototyping methodology: a working application with incomplete functionality was built first and then extended as new needs were discovered. The process was cyclic and consisted of four main stages:
    • parsing the original files of the dictionary, outputting a simple XML representation of the data
    • finding cross-references and forming the TBX structure, outputting the end product of the conversion
    • transforming the TBX document to an HTML document, allowing easy inspection of the end result to detect errors and overlooked exceptions in the original data
    • correcting mistakes of the conversion process and eliminating exceptions in the original data
    The end result of the conversion is an XML document that is in accordance with the TBX specification and satisfies the main principles of termbase design. The converted dictionary will be published in Keeleveeb, a portal that, along with various linguistic resources, also features other technical dictionaries similar to ‘Histoloogiasõnastik’.
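    The abstract does not say how cross-references between entries were found. One plausible approach, shown here purely as an assumption and not as the thesis's method, is to link an entry to every other entry whose term appears in its definition. The function name `find_cross_references`, the entry layout and the sample entries are all made up for this sketch.

```python
import re


def find_cross_references(entries):
    """Map each entry id to the ids of other entries whose term occurs in its definition.

    Hypothetical helper: matching is a case-insensitive whole-word search on the
    exact term, so inflected forms and synonyms are not detected.
    """
    patterns = {
        entry_id: re.compile(r"\b" + re.escape(data["term"]) + r"\b", re.IGNORECASE)
        for entry_id, data in entries.items()
    }
    references = {}
    for entry_id, data in entries.items():
        references[entry_id] = sorted(
            other_id
            for other_id, pattern in patterns.items()
            if other_id != entry_id and pattern.search(data["definition"])
        )
    return references


# Made-up entries standing in for the parsed output of the first stage.
ENTRIES = {
    "e1": {"term": "epithelium", "definition": "Tissue that lines surfaces and cavities."},
    "e2": {"term": "goblet cell", "definition": "Mucus-secreting cell found in epithelium."},
    "e3": {"term": "mucus", "definition": "Viscous secretion of a goblet cell."},
}
cross_refs = find_cross_references(ENTRIES)
print(cross_refs)
```

    In a TBX document, the resulting id lists could then be written out as references from one entry to another. Exact-string matching of this kind would miss inflected forms, which matters for a morphologically rich language such as Estonian.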

    The Croatian National Termbank STRUNA: A New Platform for Terminological Work

    The development of the Croatian Special Field Terminology program (known by its Croatian acronym Struna) began in 2007 as part of an initial coordination project launched at the initiative of the Croatian Standard Language Council, and has since been financed by the Croatian Science Foundation. It is being carried out at the Institute of Croatian Language and Linguistics, which serves as the national coordinator. This paper describes the current design of the e-Struna termbank and explains the adjustments made to the database structure and the terminographic approach, both to support and to reflect the methodological issues concerning interdisciplinary and multidisciplinary work. Based on examples taken from the Croatian anthropological terminology collection, special attention is given to two frequently neglected categories of terminological description: context and note.

    Bridging the gap between SKOS and TBX

    This article provides an in-depth comparison and proposal for mapping between Simple Knowledge Organization System (SKOS) and TermBase eXchange (TBX), two important exchange standards within the knowledge and terminology landscape. The attempt to develop an interface or conversion routine between SKOS and TBX is rooted in a strong demand in the language and knowledge industries for resource leverage and based on the premise that the two formalisms are governed by similar data models, namely the description of concepts (rather than words).

    ISOcat: Remodeling metadata for language resources

    The Max Planck Institute for Psycholinguistics in Nijmegen, The Netherlands, is creating a state-of-the-art web environment for the ISO TC 37 (terminology and other language and content resources) metadata registry. This Data Category Registry (DCR) is called ISOcat and encompasses data categories for a broad range of language resources. Under the governance of the DCR Board, ISOcat provides an open work space for creating data category specifications, defining Data Category Selections (DCSs) (domain-specific groups of data categories), and standardising selected data categories and DCSs. Designers visualise future interactivity among the DCR, reference registries and ontological knowledge space

    Open Terminology Management and Sharing Toolkit for Federation of Terminology Databases

    Consolidated access to current and reliable terms from different subject fields and languages is necessary for content creators and translators. Terminology is also needed in AI applications such as machine translation, speech recognition, information extraction, and other natural language processing tools. In this work, we facilitate standards-based sharing and management of terminology resources by providing an open terminology management solution, the EuroTermBank Toolkit. It allows organisations to manage and search their terms, create term collections, and share them within and outside the organisation by participating in the network of federated databases. The data curated in the federated databases are automatically shared with EuroTermBank, the largest multilingual terminology resource in Europe, allowing translators and language service providers as well as researchers and students to access terminology resources in their most current version. Comment: LREC 202