
    Improving the translation environment for professional translators

    When computer-aided translation systems are used in a typical professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, from both a human-computer interaction point of view and a purely technological one. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.

    Planning non-existent dictionaries

    In 2013, a conference entitled Planning non-existent dictionaries was held at the University of Lisbon. Scholars and lexicographers were invited to present and submit for discussion their research and practices, focusing on aspects that are traditionally perceived as shortcomings by dictionary makers and dictionary users. This book contains a collection of papers divided into three sections. The first section is devoted to heritage dictionaries, that is, lexicographic projects that aim to register all the documented words in a language, particularly those that can be described as early linguistic evidence. The second section is devoted to dictionaries for special purposes and gathers papers that describe innovative lexicographic projects. The last section provides an overview of contemporary e-lexicography projects.

    A Short History of Polish-Ukrainian Terminography

    Specialised dictionaries fulfil a plethora of linguistic and cognitive functions in specialised communication. In particular, such reference works help introduce, harmonise and standardise national terminologies, thus playing an indispensable role in disseminating high-quality specialised knowledge. An even more important role may be attributed to bilingual and multilingual specialised dictionaries, whose primary goal is to facilitate the flow of scientific and technical information at an international level. This function has come to the fore in today’s multinational and interconnected professional world. In light of the developing and ever stronger cooperation between Poland and Ukraine, an attempt is being made to evaluate bilingual and multilingual terminographic works containing Polish and Ukrainian which have been published in Poland to date. The aim is to assess the positive developments and to identify the gaps in Polish-Ukrainian terminography. It is hoped that the findings presented in this paper will be applied by terminographers in order to compile terminological dictionaries of higher quality, which satisfy the needs of specific users and follow terminographic principles.
    Krótka historia terminografii polsko-ukraińskiej (Polish abstract): Specialised dictionaries perform important linguistic and cognitive functions in specialised communication. In particular, terminographic works make it possible to disseminate, harmonise and standardise national terminologies, becoming an indispensable tool in the transfer of high-quality specialised knowledge. An even more important role can be attributed to bilingual and multilingual specialised dictionaries, whose primary aim is to facilitate the flow of scientific and technical information on a global scale, particularly in today’s unifying, multinational world. In light of the developing and ever closer cooperation between Poland and Ukraine, the author of this article undertakes to assess the bilingual and multilingual terminographic works containing Polish and Ukrainian that have been published in Poland. The aim of the study was to evaluate the positive and negative aspects of Polish-Ukrainian terminographic practice. The author hopes that the conclusions presented in this paper will find practical application in the creation of dictionaries of ever better quality, which meet the needs of specific groups of users and apply the principles of modern terminography.

    Creación de datos multilingües para diversos enfoques basados en corpus en el ámbito de la traducción y la interpretación

    Accordingly, this research work aims at exploiting and developing new technologies and methods to better ascertain not only the needs of translators and interpreters, but also those of professionals and ordinary people in their daily tasks, such as corpora and terminology compilation and management. The main topics covered by this work relate to Computational Linguistics (CL), Natural Language Processing (NLP), Machine Translation (MT), Comparable Corpora, Distributional Similarity Measures (DSM), Terminology Extraction Tools (TET) and Terminology Management Tools (TMT). In particular, this work examines three main questions: 1) Is it possible to create a simpler and more user-friendly comparable corpora compilation tool? 2) How can the most suitable TMT and TET for a given translation or interpreting task be identified? 3) How can the internal degree of relatedness in comparable corpora be automatically assessed and measured? This work is composed of thirteen peer-reviewed scientific publications, which are included in Appendix A, while the methodology used and the results obtained in these studies are summarised in the main body of this document.
    Doctoral thesis defence date: 22 November 2019
    Corpora are playing an increasingly important role in our multilingual society. High-quality parallel corpora are a preferred resource in the language engineering and linguistics communities. Nevertheless, the lack of sufficient and up-to-date parallel corpora, especially for narrow domains and poorly resourced languages, is currently one of the major obstacles to further advancement across areas such as translation, language learning, and automatic and assisted translation. An alternative is the use of comparable corpora, which are easier and faster to compile. Corpora in general are extremely important for tasks like translation, extraction, inter-linguistic comparisons and discoveries, or even for lexicographical resources. Their objectivity, reusability, multiplicity and applicability of uses, easy handling and quick access to large volumes of data are just some of their advantages over other, more limited types of resources such as thesauri or dictionaries. By way of example, new terms are coined on a daily basis, and dictionaries cannot keep up with the rate at which they emerge.

    Biomedical term extraction: overview and a new methodology

    Terminology extraction is an essential task in domain knowledge acquisition, as well as for Information Retrieval (IR). It is also a mandatory first step in building or enriching terminologies and ontologies. As often proposed in the literature, existing terminology extraction methods combine linguistic and statistical aspects and address some of the problems related to term extraction, though not completely, e.g. noise, silence, low frequency, large corpora, and the complexity of the multi-word term extraction process. In contrast, we propose a cutting-edge methodology to extract and rank biomedical terms, covering all the problems mentioned. This methodology offers several measures based on linguistic, statistical, graph-based and web aspects. These measures extract and rank candidate terms with excellent precision: we demonstrate that they outperform previously reported precision results for automatic term extraction, and that they work with different languages (English, French, and Spanish). We also demonstrate how using graphs and the web to assess the significance of a term candidate enables us to outperform previous precision results. We evaluated our methodology on the biomedical GENIA and LabTestsOnline corpora and compared it with previously reported measures.