    Grouping Synonyms by Definitions

    We present a method for grouping the synonyms of a lemma according to its dictionary senses. The senses are defined by a large machine-readable dictionary for French, the TLFi (Trésor de la langue française informatisé), and the synonyms are given by 5 synonym dictionaries (also for French). To evaluate the proposed method, we manually constructed a gold standard in which, for each (word, definition) pair and given the set of synonyms defined for that word by the 5 synonym dictionaries, 4 lexicographers specified the set of synonyms they judged adequate. While inter-annotator agreement on that task ranges from 67% to at best 88%, depending on the annotator pair and on the synonym dictionary considered, the automatic procedure we propose achieves a precision of 67% and a recall of 71%. The proposed method is compared with related work, namely word sense disambiguation, synonym lexicon acquisition and WordNet construction.
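
    The precision and recall figures above can be read as set overlap between the synonyms a method proposes for one (word, definition) pair and the synonyms the lexicographers accepted. The following is a minimal sketch of that reading, not the authors' evaluation code: the function names and the toy French synonym sets are hypothetical, and the abstract does not say how scores were aggregated over pairs or annotators. The last lines combine the two reported figures into a balanced F-measure, which the abstract itself does not report.

```python
def precision_recall(proposed: set, gold: set) -> tuple:
    """Set-based precision and recall for one (word, definition) pair."""
    correct = len(proposed & gold)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, invented for illustration only.
proposed = {"maison", "demeure", "logis", "foyer"}
gold = {"maison", "demeure", "habitation"}
p, r = precision_recall(proposed, gold)
print(f"toy pair: precision={p:.2f}, recall={r:.2f}")

# Figures reported in the abstract: precision 67%, recall 71%.
print(f"F1 from reported figures: {f1_score(0.67, 0.71):.3f}")
```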

    Towards Bilingual Term Extraction in Comparable Patents

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    Proceedings of the COLING 2004 Post Conference Workshop on Multilingual Linguistic Ressources MLR2004

    In an ever expanding information society, most information systems are now facing the "multilingual challenge". Multilingual language resources play an essential role in modern information systems. Such resources need to provide information on many languages in a common framework and should be (re)usable in many applications (for automatic or human use). Many centres have been involved in national and international projects dedicated to building harmonised language resources and creating expertise in the maintenance and further development of standardised linguistic data. These resources include dictionaries, lexicons, thesauri, word-nets, and annotated corpora developed along the lines of best practices and recommendations. However, since the late 1990s, most efforts in scaling up these resources have remained the responsibility of local authorities, usually with very low funding (if any) and few opportunities for academic recognition of this work. Hence, it is not surprising that many resource holders and developers have become reluctant to give free access to the latest versions of their resources, and the actual status of those resources is therefore rather unclear. The goal of this workshop is to study problems involved in the development, management and reuse of lexical resources in a multilingual context. Moreover, this workshop provides a forum for reviewing the present state of language resources. The workshop is meant to bring to the international community qualitative and quantitative information about the most recent developments in the area of linguistic resources and their use in applications. The impressive number of submissions (38) to this workshop, and to other workshops and conferences dedicated to similar topics, shows that dealing with multilingual linguistic resources has become a very active topic in the Natural Language Processing community.
    To cope with the number of submissions, the workshop organising committee decided to accept 16 papers from 10 countries based on the reviewers' recommendations. Six of these papers will be presented in a poster session. The papers constitute a representative selection of current trends in research on Multilingual Language Resources, such as multilingual aligned corpora, bilingual and multilingual lexicons, and multilingual speech resources. The papers also represent a characteristic set of approaches to the development of multilingual language resources, such as automatic extraction of information from corpora, combination and re-use of existing resources, online collaborative development of multilingual lexicons, and use of the Web as a multilingual language resource. The development and management of multilingual language resources is a long-term activity in which collaboration among researchers is essential. We hope that this workshop will gather many researchers involved in such developments and will give them the opportunity to discuss, exchange and compare their approaches and to strengthen their collaborations in the field. The organisation of this workshop would have been impossible without the hard work of the program committee, who managed to provide accurate reviews on time, on a rather tight schedule. We would also like to thank the Coling 2004 organising committee that made this workshop possible. Finally, we hope that this workshop will yield fruitful results for all participants.

    An analysis of The Oxford Guide to practical lexicography (Atkins and Rundell 2008)

    For at least a decade, the lexicographic community at large has been demanding a modern textbook, one that would place corpora at the centre of the lexicographic enterprise. Written by two of the most respected practising lexicographers, this book has finally arrived, and it delivers on very many levels. This review article presents a critical analysis of its features.

    The utilization of parallel corpora for the extension of machine translation lexicons

    There has recently been an increasing awareness of the importance of large collections of texts (corpora) used as resources in machine translation research. The process of creating or extending machine translation lexicons is time-consuming, difficult and costly in terms of human involvement. The contribution that corpora can make towards the reduction in cost, time and complexity has been explored by several research groups. This article describes a system that has been developed to identify word-pairs, utilizing an aligned bilingual (English-Afrikaans) corpus in order to extend a bilingual lexicon with words and their translations that are not present in the lexicon. New translations for existing entries can be added, and the system also applies grammar rules to identify the grammatical category of each word-pair. This system limits the involvement of the human translator and has a positive impact on the time, cost and effort needed to extend a bilingual lexicon.
    Keywords: alignment; bilingual corpora; corpus; extension; lexicon; machine translation; monolingual corpora; parallel corpora
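
    The abstract does not say how the system identifies word-pairs, so the sketch below swaps in a common stand-in technique: scoring co-occurrence across sentence-aligned pairs with the Dice coefficient and keeping only pairs that the existing lexicon lacks. The function name `candidate_pairs`, the threshold, and the three-sentence English-Afrikaans corpus are all assumptions made for illustration; the grammar rules mentioned in the abstract are not modelled.

```python
from collections import Counter
from itertools import product


def candidate_pairs(aligned, lexicon, min_dice=0.9):
    """Propose (source, target, score) word-pairs missing from the lexicon.

    aligned: list of (source_sentence, target_sentence) strings.
    lexicon: dict mapping a source word to a set of known translations.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in aligned:
        # Count each word once per sentence pair.
        src_words = set(src_sent.lower().split())
        tgt_words = set(tgt_sent.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))

    results = []
    for (s, t), both in pair_count.items():
        # Dice coefficient: 2 * joint count / (sum of individual counts).
        dice = 2 * both / (src_count[s] + tgt_count[t])
        if dice >= min_dice and t not in lexicon.get(s, set()):
            results.append((s, t, dice))
    return sorted(results, key=lambda item: (-item[2], item[0], item[1]))


# Hypothetical toy corpus and lexicon, invented for illustration only.
corpus = [
    ("the dog sleeps", "die hond slaap"),
    ("the cat sleeps", "die kat slaap"),
    ("the dog eats", "die hond eet"),
]
known = {"the": {"die"}, "dog": {"hond"}}

for source, target, score in candidate_pairs(corpus, known):
    print(source, target, round(score, 2))
```

    On a corpus this small many unrelated pairs also score highly, which is why the threshold is set strictly here; a real system would need far more data and some human review of the proposed pairs.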

    Baltic Journal of English Language, Literature and Culture, Vol.10

    Office premises are generally in use for about 2,500 of the year's 8,760 hours. A common problem in office premises is the thermal climate: it is either too warm, too cold, or draughty. High temperatures, above about 26°C, contribute to fatigue and reduced concentration and make the air feel less fresh. The large variation in load between daytime and night-time can also result in the premises being over-ventilated at night and under-ventilated during the day. The aim of this degree project was to investigate and compare Ecoclime's comfort ceiling solution with various other heating and cooling systems in office premises, and to examine what advantages, if any, Ecoclime's comfort ceilings offer with regard to comfort, cooling, ventilation and energy use. The simulation program IDA ICE was used to simulate comfort and room temperatures for an office and a conference room in a building located in central Umeå. The simulation results indicate that Ecoclime's comfort ceilings lower the operative temperature and raise comfort, with a smaller share of dissatisfied occupants in the room than with other systems at the same room temperature. To assess the share of dissatisfied occupants in a room, the comfort indices PMV (predicted mean vote) and PPD (predicted percentage dissatisfied) were used. The high passive output also contributes to lower energy use by the ventilation fans if a VAV system with room-temperature control is used. In addition, a sensitivity analysis was carried out on the comfort ceilings, examining how the cooling capacity affects cooling times, temperature and comfort. The sensitivity analysis shows that increasing or decreasing the cooling capacity by 10% affects the results most on a very hot day compared with a normally warm one. The difference in comfort was small, however: only 0.2 percentage points from the base case.
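
    The two comfort indices named in this abstract, PMV and PPD, are linked by a closed-form relation in ISO 7730, so a PPD value can be obtained directly from a PMV value. The sketch below shows only that relation, assuming the PMV value is already available (for example from a simulation tool); the function name `ppd_from_pmv` is hypothetical, and computing PMV itself from air temperature, radiant temperature, air speed, humidity, clothing and activity is not covered here.

```python
import math


def ppd_from_pmv(pmv: float) -> float:
    """Predicted percentage dissatisfied (%) for a given predicted mean vote.

    Uses the ISO 7730 relation:
        PPD = 100 - 95 * exp(-0.03353 * PMV**4 - 0.2179 * PMV**2)
    """
    return 100.0 - 95.0 * math.exp(-0.03353 * pmv**4 - 0.2179 * pmv**2)


# PMV runs from -3 (cold) through 0 (neutral) to +3 (hot).
for pmv in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"PMV {pmv:+.1f} -> PPD {ppd_from_pmv(pmv):.1f}%")

# Share of the year the premises are in use, from the figures in the abstract.
print(f"Occupied share of the year: {2500 / 8760:.1%}")
```

    Even at a neutral PMV of 0 the relation leaves 5% of occupants dissatisfied, which is why a change of 0.2 percentage points in the share of dissatisfied occupants counts as a small difference.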

    Multi-word Items in Dictionaries from a Translator's Perspective

    Translation dictionaries are tools that translators use to move from the source text to the target text, consulting them whenever they encounter puzzling words. This research therefore investigates how useful these dictionaries are for rendering English and Arabic multi-word items, such as idioms, collocations, phrasal/prepositional verbs, and compounds/iḍāfas. These multi-word items are known for their metaphorical meanings and fixed structures, and both characteristics cause confusion for the translator or foreign language learner. The usefulness of the translation dictionaries was determined based on two criteria: first, by evaluating the use of these dictionaries for rendering the aforementioned multi-word items in undergraduate translation and lexicography classes; second, by assessing the lexicographical documentation and treatment of these items in those dictionaries. It was concluded that the percentages of dictionary use were higher in advanced translation classes, which indicates awareness of the importance of dictionaries in these classes. In addition, students in Arabic-English translation classes used dictionaries less than those in English-Arabic classes, because they were working from texts in their native language and because English multi-word items were more difficult to render than the Arabic ones. Moreover, the findings show that Arabic multi-word items were treated better than the English multi-word items in their respective dictionaries, even though the English-Arabic dictionaries document more than the Arabic-English dictionaries.