Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
A Hybrid Web Recommendation System based on the Improved Association Rule Mining Algorithm
Motivated by the growing interest in web recommendation systems that deliver
customized data to their users, we started working on such a system.
Recommendation systems are generally divided into two major categories:
collaborative recommendation systems and content-based recommendation systems.
Collaborative recommendation systems seek out users who share the tastes of a
given user and recommend websites according to those shared tastes, whereas
content-based recommendation systems recommend websites similar to those the
user has liked. Recent research has proposed an efficient technique based on
an association rule mining algorithm to solve the problem of web page
recommendation. Its major problem is that all web pages are given equal
importance, although the importance of a page changes according to how
frequently it is visited and how much time the user spends on it. In addition,
newly added web pages and pages not yet visited by users are not included in
the recommendation set. To overcome this problem, we have used the web usage
log in adaptive association rule based web mining, where the association rules
were applied to personalization. This algorithm was based purely on the Apriori
data mining algorithm for generating the association rules. However, this
method also suffers from some unavoidable drawbacks. In this paper we present
and investigate a new approach based on a weighted association rule mining
algorithm and text mining. This improved algorithm adds semantic knowledge to
the results, is more efficient, and hence gives better quality and performance
than existing approaches.
Comment: 9 pages, 7 figures, 2 tables
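The idea of weighting pages by visit frequency and time spent can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting formula, so the one below (an equal mix of normalized visit count and normalized dwell time) is an assumption, as are the function names `page_weights` and `weighted_support` and the session format.

```python
def page_weights(sessions):
    """Weight each page by visit frequency and time spent.

    sessions: non-empty list of dicts mapping page -> seconds spent.
    The 50/50 mix of the two normalized signals is an assumed formula.
    """
    visits, seconds = {}, {}
    for session in sessions:
        for page, spent in session.items():
            visits[page] = visits.get(page, 0) + 1
            seconds[page] = seconds.get(page, 0.0) + spent
    max_visits = max(visits.values())
    max_seconds = max(seconds.values())
    return {
        page: 0.5 * visits[page] / max_visits + 0.5 * seconds[page] / max_seconds
        for page in visits
    }


def weighted_support(itemset, sessions, weights):
    """Plain support of the itemset, scaled by the mean weight of its pages."""
    containing = sum(1 for s in sessions if all(p in s for p in itemset))
    mean_weight = sum(weights[p] for p in itemset) / len(itemset)
    return mean_weight * containing / len(sessions)


# Toy usage log: three sessions with seconds spent per page.
sessions = [{"a": 10, "b": 30}, {"a": 20}, {"b": 30, "c": 5}]
weights = page_weights(sessions)
support_ab = weighted_support(("a", "b"), sessions, weights)
```

With equal weights this reduces to ordinary Apriori-style support, so the weighting only changes which itemsets pass the minimum-support threshold.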
Transitive probabilistic CLIR models.
Transitive translation could be a useful technique to enlarge the number of supported language pairs for a cross-language information retrieval (CLIR) system in a cost-effective manner. The paper describes several setups for transitive translation based on probabilistic translation models. The transitive CLIR models were evaluated on the CLEF test collection and yielded a retrieval effectiveness of up to 83% of monolingual performance, which is significantly better than a baseline using the synonym operator.
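One common way to build a transitive translation model is to compose two probabilistic dictionaries through a pivot language, summing over pivot words: P(t|s) = Σ_p P(t|p) · P(p|s). The sketch below illustrates that composition only; it is not necessarily any of the setups the paper evaluates, and the function name `compose` and the toy probabilities are assumptions.

```python
def compose(source_to_pivot, pivot_to_target):
    """Compose two translation models through a pivot language.

    Each model maps a word to a dict of {translation: probability}.
    Returns P(target | source) = sum over pivots of
    P(target | pivot) * P(pivot | source).
    """
    composed = {}
    for source, pivots in source_to_pivot.items():
        dist = {}
        for pivot, p_pivot in pivots.items():
            for target, p_target in pivot_to_target.get(pivot, {}).items():
                dist[target] = dist.get(target, 0.0) + p_pivot * p_target
        composed[source] = dist
    return composed


# Toy probabilities, invented for illustration.
source_to_pivot = {"huis": {"house": 0.8, "home": 0.2}}
pivot_to_target = {
    "house": {"maison": 0.9, "domicile": 0.1},
    "home": {"maison": 0.5, "foyer": 0.5},
}
model = compose(source_to_pivot, pivot_to_target)
```

Pivot words missing from the second dictionary simply contribute nothing, so a composed distribution can sum to less than one when dictionary coverage is incomplete.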
PRIME: A System for Multi-lingual Patent Retrieval
Given the growing number of patents filed in multiple countries, users are
interested in retrieving patents across languages. We propose a multi-lingual
patent retrieval system, which translates a user query into the target
language, searches a multilingual database for patents relevant to the query,
and improves the browsing efficiency by way of machine translation and
clustering. Our system also extracts new translations from patent families
consisting of comparable patents, to enhance the translation dictionary.
Applying Machine Translation to Two-Stage Cross-Language Information Retrieval
Cross-language information retrieval (CLIR), where queries and documents are
in different languages, needs a translation of queries and/or documents, so as
to standardize both of them into a common representation. For this purpose, the
use of machine translation is an effective approach. However, computational
cost is prohibitive in translating large-scale document collections. To resolve
this problem, we propose a two-stage CLIR method. First, we translate a given
query into the document language, and retrieve a limited number of foreign
documents. Second, we machine translate only those documents into the user
language, and re-rank them based on the translation result. We also show the
effectiveness of our method by way of experiments using Japanese queries and
English technical documents.
Comment: 13 pages, 1 Postscript figure
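The two-stage procedure described above can be sketched as follows. This is a toy illustration under stated assumptions: the term-overlap `score`, the word-by-word dictionary lookups standing in for machine translation, and the names `two_stage_clir`, `translate_query` and `translate_doc` are all hypothetical and are not taken from the paper.

```python
def score(query_terms, doc_terms):
    """Toy relevance score: number of query terms that occur in the document."""
    doc = set(doc_terms)
    return sum(1 for term in query_terms if term in doc)


def two_stage_clir(query, docs, translate_query, translate_doc, k=2):
    """docs maps document id -> list of terms in the document language."""
    # Stage 1: translate the query and retrieve a limited number of documents.
    translated_query = translate_query(query)
    first_pass = sorted(
        docs, key=lambda d: score(translated_query, docs[d]), reverse=True
    )[:k]
    # Stage 2: translate only those documents into the user language and
    # re-rank them against the original query.
    translated_docs = {d: translate_doc(docs[d]) for d in first_pass}
    return sorted(
        first_pass, key=lambda d: score(query, translated_docs[d]), reverse=True
    )


# Toy vocabularies standing in for real translation systems.
query_dict = {"inu": "dog", "koya": "house"}
doc_dict = {"dog": "inu", "house": "ie", "kennel": "koya", "paint": "penki"}

docs = {
    "d1": ["dog", "house", "paint"],
    "d2": ["dog", "kennel"],
    "d3": ["cat", "food"],
}
ranking = two_stage_clir(
    ["inu", "koya"],
    docs,
    translate_query=lambda terms: [query_dict.get(t, t) for t in terms],
    translate_doc=lambda terms: [doc_dict.get(t, t) for t in terms],
)
```

In this toy run the first stage prefers `d1`, while the second stage promotes `d2` because its translation matches the original query more closely. Only `k` documents are ever translated, which is the cost saving the method aims for.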
A geo-temporal information extraction service for processing descriptive metadata in digital libraries
In the context of digital map libraries, resources are usually described according to metadata records that define the relevant subject, location, time-span, format and keywords. With regard to locations and time-spans, metadata records are often incomplete, or they provide information in a way that is not machine-understandable (e.g. textual descriptions). This paper presents techniques for extracting geotemporal information from text, using relatively simple text mining methods that leverage a Web gazetteer service. The idea is to go from human-made geotemporal referencing (i.e. using place and period names in textual expressions) to geo-spatial coordinates and time-spans. A prototype system implementing the proposed methods is described in detail. Experimental results demonstrate the efficiency and accuracy of the proposed approaches.