524 research outputs found

    The European Thesaurus on International Relations and Area Studies - a multilingual resource for indexing, retrieval, and translation

    The multilingual European Thesaurus on International Relations and Area Studies (European Thesaurus) is a specialized subject thesaurus for the field of international affairs. It is intended for use in libraries and documentation centres of academic institutions and international organizations. The European Thesaurus was established in a collaborative project involving a number of leading European research institutes on international politics. It integrates the controlled terminologies of several existing thesauri. The European Thesaurus comprises about 8,200 terms and proper names from the 24 subject areas it covers. Because it is multilingual, the European Thesaurus can be used not only for indexing, retrieval and terminological reference, but also as a translation tool for the languages represented. The establishment of cross-concordances to related thesauri extends its range of application even further: they enable the treatment of semantic heterogeneity within subject gateways. The European Thesaurus is available in a seven-language print version and an eight-language online version. To reflect changes in terminology, it is regularly amended and modified. Further languages are to be included.
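As a rough illustration of how a multilingual thesaurus can double as a translation tool, the sketch below stores one preferred label per language for each concept and looks up a term through its concept. The concept identifiers, the labels and the `translate` helper are all hypothetical and are not taken from the European Thesaurus itself.

```python
from typing import Dict, Optional

# Hypothetical sample data: concept id -> language code -> preferred label.
# These entries are illustrative only, not actual European Thesaurus content.
THESAURUS: Dict[str, Dict[str, str]] = {
    "c1": {
        "en": "arms control",
        "de": "Rüstungskontrolle",
        "fr": "contrôle des armements",
    },
    "c2": {
        "en": "foreign policy",
        "de": "Außenpolitik",
        "fr": "politique étrangère",
    },
}


def translate(term: str, source: str, target: str) -> Optional[str]:
    """Return the target-language label of the concept whose
    source-language label matches `term`, or None if there is no match."""
    needle = term.casefold()
    for labels in THESAURUS.values():
        if labels.get(source, "").casefold() == needle:
            return labels.get(target)
    return None


print(translate("Arms Control", "en", "de"))
```

Because every label hangs off a language-independent concept, the same structure serves indexing and retrieval (term to concept) as well as translation (concept to label in another language).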

    Terminology Retrieval: Towards a Synergy between Thesaurus and Free Text Searching

    Multilingual Information Retrieval usually forces a choice between free-text indexing and indexing by means of a multilingual thesaurus. However, since they share the same objectives, synergy between the two approaches is possible. This paper presents a retrieval framework that makes use of terminological information in free-text indexing. The Automatic Terminology Extraction task, which is used for thesaurus construction, shifts to a search for terminology and becomes an information retrieval task: Terminology Retrieval. Terminology Retrieval, then, allows cross-language information retrieval through the browsing of morpho-syntactic, semantic and translingual variations of the query. Although Terminology Retrieval does not make use of controlled vocabularies, they provide an appropriate framework for its evaluation.

    Tools for Terminology Processing

    Automatic terminology processing appeared 10 years ago when electronic corpora became widely available. Such processing may be statistically or linguistically based and produces terminology resources that can be used in a number of applications: indexing, information retrieval, technology watch, etc. We present the tools that have been developed at the IRIN Institute. They all take texts (or collections of texts) as input and reflect different stages of terminology processing: term acquisition, term recognition and term structuring.

    Could we automatically reproduce semantic relations of an information retrieval thesaurus?

    A well-constructed thesaurus is recognized as a valuable source of semantic information for various applications, especially for Information Retrieval. The main hindrances to using thesaurus-oriented approaches are the high complexity and cost of manual thesaurus creation. This paper addresses the problem of automatic thesaurus construction: we study the quality of automatically extracted semantic relations as compared with the semantic relations of a manually crafted thesaurus. A vector-space model based on syntactic contexts was used to reproduce relations between the terms of a manually constructed thesaurus. We propose a simple algorithm for representing both single-word and multiword terms in the distributional space of syntactic contexts. Furthermore, we propose a method for evaluating the quality of the extracted relations. Our experiments show a significant difference between the automatically and manually constructed relations: while many of the automatically generated relations are relevant, only a small part of them could be found in the original thesaurus.

    The Simple Knowledge Organization System (SKOS): a situation report for the HIVE Project

    HIVE (Helping Interdisciplinary Vocabularies Engineering) is a project funded by the IMLS (Institute of Museum and Library Services) and, indirectly, within Dryad, both projects being collaborations of the Metadata Research Center and the National Evolutionary Synthesis Center (NESCent) in Durham, North Carolina. The development of HIVE aims to address this problem through a proposal for automatic metadata generation that allows the dynamic integration of specific controlled vocabularies. To support vocabulary integration, SKOS (Simple Knowledge Organisation System) was selected, a World Wide Web Consortium (W3C) standard for representing knowledge organization systems or vocabularies, such as thesauri, classification schemes, subject heading systems and taxonomies, within the framework of the Semantic Web. This report presents an exhaustive analysis of the current state of SKOS application. The study includes a detailed review of the scientific literature and web resources on the model, and a selection of key projects, initiatives, tools and research groups, along with any other information that could be relevant to achieving the objectives of the HIVE project. It also analyses the importance of SKOS for achieving semantic interoperability and sets out recommendations for the members of the HIVE project.

    Design of a Controlled Language for Critical Infrastructures Protection

    We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). This project originates from the need to coordinate and categorize communications on CIP at the European level. These communications can take the form of official documents, incident reports, informal communications and plain e-mail. To achieve our goal, we explore the application of traditional library science tools for the construction of controlled languages. Our starting point is an analogous work done during the sixties in the field of nuclear science, known as the Euratom Thesaurus. JRC.G.6-Security technology assessmen

    Accessing Legal Information Across Boundaries: A New Challenge

    In today's multilingual and multicultural environment there is a significant need, in the academic world, in the legal profession, in business settings and in public administration services to citizens, for common understanding and exchange of legal concepts across the various legal systems. At the same time, there is strong pressure to preserve their basic sense and value. Both requirements are quite difficult to meet, and they are complicated by the complexity of legal language and by the variety of modalities used to express law within the various legal systems. Unlike a number of technical and scientific disciplines where a fair correspondence exists between concepts across languages, serious difficulties arise in interpreting law across countries and languages. This is largely due to the system-bound nature of legal terminology. This paper focuses on the ability of cross-language retrieval systems to facilitate access to legal information across different languages and legal orders. To that end, it addresses issues relating to linguistics and translation theory, comparative law, theory of law and natural language processing techniques, and provides some recommendations intended to contribute to cross-language retrieval of law.

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
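The first of the three matrix classes named in this abstract, the term-document matrix, can be sketched in a few lines. This is a minimal illustration under simplifying assumptions (whitespace tokenization, raw counts, no weighting), not the method of the survey or of any project it discusses. The sample documents and all function names are made up for the example.

```python
import math
from collections import Counter
from typing import List, Tuple


def term_document_matrix(docs: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Build a raw-count matrix: one row per term, one column per document."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def document_vector(matrix: List[List[int]], j: int) -> List[int]:
    """Extract column j, i.e. the vector representing document j."""
    return [row[j] for row in matrix]


def cosine(u: List[int], v: List[int]) -> float:
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up toy documents, used only to exercise the functions.
docs = [
    "thesaurus indexing retrieval",
    "thesaurus retrieval translation",
    "nuclear infrastructure protection",
]
vocab, matrix = term_document_matrix(docs)
sim_01 = cosine(document_vector(matrix, 0), document_vector(matrix, 1))
sim_02 = cosine(document_vector(matrix, 0), document_vector(matrix, 2))
print(sim_01, sim_02)
```

Documents that share terms get a positive similarity, while documents with disjoint vocabularies score zero. Comparing rows instead of columns would give term-to-term similarity over the same matrix.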