524 research outputs found

The European Thesaurus on International Relations and Area Studies - a multilingual resource for indexing, retrieval, and translation

Author: Huckstorf Axel
Kluck Michael
Publication venue: Museum National d'Histoire Naturelle, Paris, France
Publication date: 11/05/2011

The multilingual European Thesaurus on International Relations and Area Studies (European Thesaurus) is a special subject thesaurus for the field of international affairs. It is intended for use in libraries and documentation centres of academic institutions and international organizations. The European Thesaurus was established in a collaborative project involving a number of leading European research institutes on international politics. It integrates the controlled terminologies of several existing thesauri. The European Thesaurus comprises about 8,200 terms and proper names from the 24 subject areas it covers. Because it is multilingual, the European Thesaurus can be used not only for indexing, retrieval and terminological reference, but also as a translation tool for the languages represented. The establishment of cross-concordances to related thesauri extends the range of application of the European Thesaurus even further: these cross-concordances enable the treatment of semantic heterogeneity within subject gateways. The European Thesaurus is available both in a seven-language print version and in an eight-language online version. To reflect changes in terminology, the European Thesaurus is regularly amended and modified. Further languages are going to be included.

Terminology Retrieval: Towards a Synergy between Thesaurus and Free Text Searching

Author: Anselmo Peñas
Felisa Verdejo
Julio Gonzalo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2002

Multilingual Information Retrieval usually forces a choice between free-text indexing and indexing by means of a multilingual thesaurus. However, since both approaches share the same objectives, synergy between them is possible. This paper presents a retrieval framework that makes use of terminological information in free-text indexing. The Automatic Terminology Extraction task, which is used for thesaurus construction, shifts to a search for terminology and becomes an information retrieval task: Terminology Retrieval. Terminology Retrieval, then, allows cross-language information retrieval through the browsing of morpho-syntactic, semantic and translingual variations of the query. Although Terminology Retrieval does not make use of them, controlled vocabularies become an appropriate framework for evaluating Terminology Retrieval.

CiteSeerX

Tools for Terminology Processing

Author: Daille Béatrice
Enguehard Chantal
Morin Emmanuel
Publication venue: Tata McGraw-Hill
Publication date: 01/06/2002

Automatic terminology processing appeared 10 years ago when electronic corpora became widely available. Such processing may be statistically or linguistically based and produces terminology resources that can be used in a number of applications: indexing, information retrieval, technology watch, etc. We present the tools that have been developed in the IRIN Institute. They all take texts (or collections of texts) as input and reflect different stages of terminology processing: term acquisition, term recognition and term structuring.

Could we automatically reproduce semantic relations of an information retrieval thesaurus?

Author: Panchenko A.
Publication venue: Publishing and Printing Center of Voronezh State University (Издательско-полиграфический центр Воронежского государственного университета)
Publication date: 01/01/2010

A well-constructed thesaurus is recognized as a valuable source of semantic information for various applications, especially for Information Retrieval. The main hindrances to using thesaurus-oriented approaches are the high complexity and cost of manual thesaurus creation. This paper addresses the problem of automatic thesaurus construction: we study the quality of automatically extracted semantic relations as compared with the semantic relations of a manually crafted thesaurus. A vector-space model based on syntactic contexts was used to reproduce relations between the terms of a manually constructed thesaurus. We propose a simple algorithm for representing both single-word and multiword terms in the distributional space of syntactic contexts. Furthermore, we propose a method for evaluating the quality of the extracted relations. Our experiments show a significant difference between the automatically and manually constructed relations: while many of the automatically generated relations are relevant, only a small fraction of them could be found in the original thesaurus.

The Simple Knowledge Organization System (SKOS): a situation report for the HIVE Project

Author: Bueno-de-la-Fuente Gema
Publication date: 01/01/2008

HIVE (Helping Interdisciplinary Vocabularies Engineering) is a project funded by the IMLS (Institute of Museums and Library Services) and, indirectly, within Dryad; both projects are collaborations of the Metadata Research Center and the National Evolutionary Synthesis Center (NESCent) in Durham, North Carolina. HIVE aims to address this problem through an approach to automatic metadata generation that allows the dynamic integration of specific controlled vocabularies. To support vocabulary integration, the project selected SKOS (Simple Knowledge Organisation System), a World Wide Web Consortium (W3C) standard for representing knowledge organization systems or vocabularies, such as thesauri, classification schemes, subject heading systems and taxonomies, within the framework of the Semantic Web. This report provides an exhaustive analysis of the current state of SKOS application. The study includes a detailed review of the scientific literature and web resources on the model, together with a selection of key projects, initiatives, tools, research groups and any other information that could be relevant to achieving the objectives of the HIVE project. It also analyses the importance of SKOS for achieving semantic interoperability and sets out a series of recommendations for the members of the HIVE project.

Universidad Carlos III de Madrid e-Archivo

Design of a Controlled Language for Critical Infrastructures Protection

Author: Cantarella Simona
Ferigato Carlo
Owusu Evans Boateng
Publication venue: European Language Resources Association
Publication date: 28/03/2012

We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). This project originates from the need to coordinate and categorize communications on CIP at the European level. These communications can take the form of official documents, reports on incidents, informal communications and plain e-mail. We explore the application of traditional library science tools for the construction of controlled languages in order to achieve our goal. Our starting point is an analogous work done in the 1960s in the field of nuclear science, known as the Euratom Thesaurus.

JRC Publications Repository

Accessing Legal Information Across Boundaries: A New Challenge

Author: Peruginelli Ginevra
Publication venue: Scholarship@Cornell Law: A Digital Repository
Publication date: 01/08/2010

In today's multilingual and multicultural environment there is a significant need, in the academic world, in the legal profession, in business settings and in the context of public administration services to citizens, for a common understanding and exchange of legal concepts across the various legal systems. At the same time, there is strong pressure to preserve their basic sense and value. Both requirements are difficult to meet, and they are complicated by the complexity of legal language and by the variety of ways in which law is expressed within the various legal systems. Unlike a number of technical and scientific disciplines where a fair correspondence exists between concepts across languages, serious difficulties arise in interpreting law across countries and languages. This is largely due to the system-bound nature of legal terminology. This paper focuses on the ability of cross-language retrieval systems to facilitate access to legal information across different languages and legal orders. It addresses issues relating to linguistics and translation theory, comparative law, theory of law, and natural language processing techniques, and provides some recommendations with the aim of contributing to cross-language retrieval of law.

From Frequency to Meaning: Vector Space Models of Semantics

Author: Pantel Patrick
Turney Peter D.
Publication venue: AI Access Foundation
Publication date: 01/01/2010

Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.

arXiv.org e-Print Archive

CiteSeerX