
    Terminology Extraction for and from Communications in Multi-disciplinary Domains

    Terminology extraction generally refers to methods and systems for identifying term candidates in a uni-disciplinary and monolingual environment, such as engineering, medical, physical and geological sciences, or administration, business and leisure. However, as human enterprises grow more complex, it has become increasingly important for teams in one discipline to collaborate with others who not only come from a non-cognate discipline but also speak a different language. Disaster mitigation and recovery, and conflict resolution, are among the areas that require standardised multilingual terminology for communication. This paper presents a feasibility study conducted to build a terminology (and ontology) in the domain of disaster management, as part of the broader work conducted for the EU project Slándáil (FP7 607691). We have evaluated CiCui (from the Chinese name 词萃, which translates to "words gathered"), a corpus-based text analytics system that combines frequency, collocation and linguistic analyses to extract candidate terms from corpora comprising domain texts from diverse sources. CiCui was assessed against four terminology extraction systems, and the initial results show that it has above-average precision in extracting terms.
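The combination of frequency and collocation evidence mentioned in this abstract can be illustrated with a minimal sketch. This is not CiCui's implementation, which the abstract does not describe. It assumes, purely for illustration, that candidates are adjacent word pairs, that a raw-count threshold stands in for frequency analysis, and that pointwise mutual information (PMI) stands in for collocation analysis. The function name `candidate_terms`, the `min_count` parameter and the sample text are all hypothetical, and no linguistic (part-of-speech) filtering is attempted.

```python
import math
import re
from collections import Counter


def candidate_terms(text, min_count=2):
    """Rank adjacent word pairs by PMI, keeping pairs seen at least min_count times.

    Illustrative only: frequency filter + PMI as a stand-in collocation measure.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # frequency filter: drop rare pairs
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        pmi = math.log2(p_pair / (p_w1 * p_w2))
        scored.append((f"{w1} {w2}", count, pmi))

    # Highest PMI first; ties broken by count, then alphabetically.
    return sorted(scored, key=lambda item: (-item[2], -item[1], item[0]))


# Hypothetical sample text from the disaster-management domain.
sample = (
    "Flood warning systems support disaster management. "
    "A flood warning reaches residents before the flood peaks. "
    "Disaster management teams review each flood warning."
)

for term, count, pmi in candidate_terms(sample):
    print(f"{term}\t{count}\t{pmi:.2f}")
```

On a real corpus the threshold and the association measure would need tuning, and a part-of-speech filter would normally be applied before ranking.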

    Technology and e-resources for legal translators : the LAW10n project

    End User License Agreements are "those agreements as a result of which the licensee, purchaser of the license or user, receives from the licensor the right to use the programs under the terms agreed" (Aparicio 2004:71). Software licenses first appeared in the United States of America. Translated into Spanish by the licensor and made available directly to users of the licensed software, these licensing agreements have now been incorporated into Spanish law. In legal translation, and in particular when translating End User License Agreements, where the specificity of the cultural elements involved can lead to recurrent breakdowns in communication, an interpretative-communicative approach must be used: one in which the translator takes into consideration all the elements that directly impinge upon the decision-making process in translation, i.e., the client, the target audience, the legal or cultural context, the legal requirements enforceable by law, etc. In practice, licensing agreements are translated as part of the localisation process itself, i.e. semi-automatically. As a result, licensing agreements translated into Spanish do not reflect the spirit of the law underlying the source text, nor do they comply with the specific requirements of Spanish law. Although a genre of license agreements exists in Spanish (i.e. in patent law and other copyright law fields), this genre cannot be automatically applied to software licenses because these licenses have special features. In this article we explain why the translation of software licence agreements deserves such a deep analysis and why existing legal e-resources are not enough to solve the translation challenges that this genre presents to translators. An English-Spanish bilingual corpus of translations has been created and analysed to evidence the legal implications of current translations and to demonstrate the need to take into account not only the legal system of the target text but also the translation proposals included in licences where the applicable law is that of the target culture. This article is addressed to translation lecturers and researchers interested in legal e-resources and instrumental translations.

    Tagging, Folksonomy & Co - Renaissance of Manual Indexing?

    This paper gives an overview of current trends in manual indexing on the Web. Along with a general rise of user-generated content, there are more and more tagging systems that allow users to annotate digital resources with tags (keywords) and share their annotations with other users. Tagging is frequently seen in contrast to traditional knowledge organization systems or as something completely new. This paper shows that tagging is better seen as a popular form of manual indexing on the Web. The difference between controlled and free indexing blurs with sufficient feedback mechanisms. A revised typology of tagging systems is presented that includes different user roles and knowledge organization systems with hierarchical relationships and vocabulary control. A detailed bibliography of current research in collaborative tagging is included. Comment: Preprint. 12 pages, 1 figure, 54 references

    A framework of analysis for the evaluation of automatic term extractors

    Following previous research on automatic term extraction, the primary aim of this paper is to propose a more robust and consistent framework of analysis for the comparative evaluation of term extractors. Within the different views of software quality outlined in ISO standards, our proposal focuses on the criterion of external quality and in particular on the characteristics of functionality, usability and efficiency, together with the subcharacteristics of suitability, precision, operability and time behavior. The evaluation phase is completed by comparing four online open-access automatic term extractors: TermoStat, GaleXtract, BioTex and DEXTER. This latter resource forms part of the virtual functional laboratory for natural language processing (FUNK Lab) developed by our research group. Furthermore, the results obtained from the comparative analysis are discussed. Financial support for this research has been provided by the Spanish Ministry of Economy, Competitiveness and Science, grant FFI2014-53788-C3-1-P. Periñán-Pascual, C.; Mairal-Usón, R. (2018). A framework of analysis for the evaluation of automatic term extractors. VIAL. Vigo International Journal of Applied Linguistics. 15:105-125. https://doi.org/10.35869/vial.v0i15.88
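To make the precision subcharacteristic concrete, here is a minimal sketch of how extractors could be compared on that criterion alone. It assumes the usual information-retrieval definition (correct candidates divided by all candidates returned); the abstract does not state how the paper itself computes precision. The extractor names, term lists and gold standard below are hypothetical and are not results from the study.

```python
def precision(extracted, gold):
    """Share of extracted candidates that appear in the gold-standard term list."""
    candidates = {term.lower() for term in extracted}
    if not candidates:
        return 0.0  # nothing extracted: define precision as 0 to avoid dividing by zero
    reference = {term.lower() for term in gold}
    return len(candidates & reference) / len(candidates)


# Hypothetical gold standard and hypothetical extractor outputs.
gold_terms = ["term extraction", "corpus", "collocation", "ontology"]
outputs = {
    "extractor_a": ["term extraction", "corpus", "paper", "ontology"],
    "extractor_b": ["corpus", "result", "study", "collocation", "section"],
}

# Print extractors from highest to lowest precision.
ranking = sorted(outputs, key=lambda name: -precision(outputs[name], gold_terms))
for name in ranking:
    print(f"{name}: precision = {precision(outputs[name], gold_terms):.2f}")
```

Precision covers only one of the subcharacteristics listed above; suitability, operability and time behavior would need separate measurements.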

    Cómo los corpus pueden asistir a los estudiantes de traducción jurídica : la plataforma GENTT TransTools Corpora y Sketch Engine

    This paper analyses the application of corpora to the teaching of legal translation in higher education settings, combining the use of the GENTT TransTools Corpora platform and Sketch Engine. A review of previous teaching experiences with legal textual corpora is presented, followed by a descriptive overview of the GENTT research group's 10 years' experience of using corpora in the classroom with a translation training approach that promotes scaffolded education as well as constructive and cooperative situated learning. These experiences suggest that classroom activities with monolingual, multilingual and translated corpora of legal documents may prove useful to students of legal translation, improving their strategic competence and providing them with text models and patterns to be used as terminological, textual and legal/conceptual references.

    Ontologies across disciplines


    The Web as a Corpus and for Building corpora in the Teaching of Specialised Translation: The Example of Texts in Healthcare

    Abstract: One of the key issues faced by translators and translation students of specialised texts is finding the L2 equivalents of terms in the field in question. A greater challenge, however, is forming the textual environment with the appropriate collocations (adjectives, nouns, verbs) for those terms in the language for special purposes (LSP). The web offers the most convenient and immediate solution by providing access to updated language data that present the terms in original contexts and help overcome the shortcomings of hard-copy lexicographic resources. Taking into account the importance of documentation skills in the training of translators of specialised texts, this paper examines the use of the Web as a Mega Corpus that can be read directly with Google and as a means of constructing corpora automatically with the help of the WebBootCat software. The texts dealt with in this paper are from the healthcare field, which is an important sector of the public service.

    A quest for the right word enhancing reflexivity and technology in terminology training

    INTED2010, the 4th International Technology, Education and Development Conference, was held in Valencia (Spain) on March 8, 9 and 10, 2010. In translator training, the acquisition of indexing and terminological competences (at both the retrieval and management stages) plays a major role in the performance of future translators. A good terminological database, built on accurate research and combined with computer-assisted translation tools (CAT tools), can improve translation speed and quality and reduce revision costs, benefiting all players in the translation industry: language service providers and clients. That process (analysis, selection, retrieval and storage of terminology) takes place mostly in the pre-translation stage, but it underlies the whole translation work and can be a determining factor in the quality and homogeneity of the final product, especially when carried out in a collaborative environment. The development of terminological databases is an essential step in the training of translators, and the efficient search for the right word is a necessary skill in today's globalised translation market. Moreover, because the quest for the right word is run almost entirely over the Internet, data diversity can greatly increase the noise. This search poses several questions, mainly (1) how and where to retrieve information and (2) how to manage it efficiently, especially for students who are experts in neither terminology nor translation. To ease some of these problems, students were assigned a terminology project (a database) and, to accomplish it, both a Webquest and an ePortfolio were proposed as guidance tools. Throughout the process, students were expected to build up their thematic and communicative competence and, in parallel, widen their skills in computer-assisted translation tools as well as standard office automation software. This paper discusses how these two tools helped students guide their research, structure the problem-solving activities, and develop critical thinking and terminological competencies.

    Creación de datos multilingües para diversos enfoques basados en corpus en el ámbito de la traducción y la interpretación

    Accordingly, this research work aims at exploiting and developing new technologies and methods to better ascertain the needs not only of translators and interpreters but also of professionals and ordinary people in their daily tasks, such as corpora and terminology compilation and management. The main topics covered by this work relate to Computational Linguistics (CL), Natural Language Processing (NLP), Machine Translation (MT), Comparable Corpora, Distributional Similarity Measures (DSM), Terminology Extraction Tools (TET) and Terminology Management Tools (TMT). In particular, this work examines three main questions: 1) Is it possible to create a simpler and more user-friendly comparable corpora compilation tool? 2) How can the most suitable TMT and TET for a given translation or interpreting task be identified? 3) How can the internal degree of relatedness in comparable corpora be automatically assessed and measured? This work is composed of thirteen peer-reviewed scientific publications, which are included in Appendix A, while the methodology used and the results obtained in these studies are summarised in the main body of this document. Date of doctoral thesis defence: 22 November 2019. Corpora are playing an increasingly important role in our multilingual society. High-quality parallel corpora are a preferred resource in the language engineering and linguistics communities. Nevertheless, the lack of sufficient and up-to-date parallel corpora, especially for narrow domains and poorly resourced languages, is currently one of the major obstacles to further advancement across various areas such as translation, language learning, and automatic and assisted translation. An alternative is the use of comparable corpora, which are easier and faster to compile. Corpora in general are extremely important for tasks like translation, extraction, inter-linguistic comparisons and discoveries, or even for lexicographical resources. Their objectivity, reusability, multiplicity and applicability of uses, easy handling and quick access to large volumes of data are just some of their advantages over other types of limited resources like thesauri or dictionaries. By way of example, new terms are coined on a daily basis and dictionaries cannot keep up with the rate of emergence of new terms.