Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
A Word Sense-Oriented User Interface for Interactive Multilingual Text Retrieval
In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. In order to enable users to access multilingual information, different problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. Therefore, we first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization and finally describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigation of the different word senses in one source and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail
Initial specification of the evaluation tasks "Use cases to bridge validation and benchmarking" PROMISE Deliverable 2.1
Evaluation of multimedia and multilingual information access systems needs to be performed from a usage-oriented perspective. This document outlines use cases from the three use case domains of the PROMISE project and gives some initial pointers to how their respective characteristics can be extrapolated to determine and guide evaluation activities, both with respect to benchmarking and to validation of the usage hypotheses. The use cases will be developed further during the course of the evaluation activities and workshops projected to occur in coming CLEF conferences
PLuTO: MT for online patent translation
PLuTO – Patent Language Translation Online – is a partially EU-funded commercialization project which specializes in the automatic retrieval and translation of patent documents. At the core of the PLuTO framework is a machine translation (MT) engine through which web-based translation services are offered. The fully integrated PLuTO architecture includes a translation engine coupling MT with translation memories (TM), and a patent search and retrieval engine. In this paper, we first describe the motivating factors behind the provision of such a service. Following this, we give an overview of the PLuTO framework as a whole, with particular emphasis on the MT components, and provide a real world use case scenario in which PLuTO MT services are exploited
Multiple Retrieval Models and Regression Models for Prior Art Search
This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach presents three main characteristics: 1. The usage of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the present track (English, French, German) producing ten different sets of ranked results. 2. The merging of the different results based on multiple regression models using an additional validation set created from the patent collection. 3. The exploitation of patent metadata and of the citation structures for creating restricted initial working sets of patents and for producing a final re-ranking regression model. As we exploit specific metadata of the patent documents and the citation relations only at the creation of initial working sets and during the final post ranking step, our architecture remains generic and easy to extend
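The abstract above does not give the details of the regression-based merging, so the following is only a minimal sketch of the general idea, not the PATATRAS implementation. It assumes each retrieval model returns a score per document, fits a least-squares linear model on a small labelled validation set, and ranks documents by the predicted relevance. The function names, the toy scores, the zero default for missing documents, and the tiny ridge term (added purely for numerical stability) are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def fit_linear_weights(
    features: Sequence[Sequence[float]],
    labels: Sequence[float],
    ridge: float = 1e-6,
) -> List[float]:
    """Least-squares fit with intercept; returns [bias, w_1, ..., w_k].

    Solves the normal equations (X^T X + ridge * I) w = X^T y by Gaussian
    elimination. The small ridge term is an assumption that keeps the
    system solvable when features are collinear.
    """
    rows = [[1.0, *f] for f in features]  # prepend the intercept column
    n = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(n)]
    for i in range(n):
        a[i][i] += ridge
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - rest) / a[i][i]
    return w


def merge_runs(
    runs: Dict[str, Dict[str, float]],
    model_order: Sequence[str],
    weights: Sequence[float],
) -> List[Tuple[str, float]]:
    """Combine per-model scores into one ranking using the fitted weights.

    A document missing from a run gets score 0.0 for that model (assumption).
    """
    docs = set()
    for run in runs.values():
        docs.update(run)
    merged = []
    for doc in docs:
        feats = [runs[m].get(doc, 0.0) for m in model_order]
        score = weights[0] + sum(w * f for w, f in zip(weights[1:], feats))
        merged.append((doc, score))
    # Highest predicted relevance first; ties broken by document id.
    return sorted(merged, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy validation set: [score_model_a, score_model_b] -> relevance label.
    val_x = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.9], [0.1, 0.3]]
    val_y = [1.0, 1.0, 0.0, 0.0]
    w = fit_linear_weights(val_x, val_y)
    runs = {"model_a": {"d1": 0.9, "d2": 0.1}, "model_b": {"d1": 0.2, "d2": 0.8}}
    print(merge_runs(runs, ["model_a", "model_b"], w))
```

In this toy setting the first model's scores track the relevance labels, so the fit gives it most of the weight. A real system would use many more validation queries and, as the abstract notes, additional metadata and citation features in a final re-ranking step.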
Designing multilingual information access to Tate Online
The Tate is Britain's premier national art gallery and includes content from internationally-renowned artists such as Constable and Turner. Like most cultural heritage institutions, the Tate provides online access to a large amount of digitized material. Given the international importance of content provided by the Tate Gallery, multilingual access would seem an ideal way in which to increase accessibility to the collections, and thereby increase traffic to the website. In this short paper we propose using the Tate as a case study for cross-language research and evaluation, determining the gallery’s requirements and the multilingual needs of their end-users
Accessing Legal Information Across Boundaries: A New Challenge
In today's multilingual and multicultural environment there is a significant need, in the academic world, in the legal profession, in business settings as well as in the context of public administration services to citizens, for a common understanding and exchange of legal concepts across the various legal systems. At the same time, there is strong pressure for the preservation of their basic sense and value. Both requirements are quite difficult to meet, and they are complicated by the complexity of legal language and by the variety of modalities used to express law within the various legal systems. Unlike a number of technical and scientific disciplines where a fair correspondence exists between concepts across languages, serious difficulties arise in interpreting law across countries and languages. This is largely due to the system-bound nature of legal terminology. This paper focuses on cross-language retrieval systems' ability to facilitate access to legal information across different languages and legal orders. As such, issues are addressed relating to linguistics and translation theory, comparative law, theory of law, as well as natural language processing techniques, while some recommendations are provided with the aim of contributing to cross-language retrieval of law
CLEF 2005: Ad Hoc track overview
We describe the objectives and organization of the CLEF 2005 ad hoc track and discuss the main characteristics of the tasks offered to test monolingual, bilingual and multilingual textual document retrieval. The performance achieved for each task is presented and a preliminary analysis of results is given. The paper focuses in particular on the multilingual tasks which reused the test collection created in CLEF 2003 in an attempt to see if an improvement in system performance over time could be measured, and also to examine the multilingual results merging problem