MAG: A Multilingual, Knowledge-base Agnostic and Deterministic Entity Linking Approach
Entity linking has recently been the subject of a significant body of
research. Currently, the best performing approaches rely on trained
mono-lingual models. Porting these approaches to other languages is
consequently a difficult endeavor as it requires corresponding training data
and retraining of the models. We address this drawback by presenting a novel
multilingual, knowledge-base agnostic and deterministic approach to entity
linking, dubbed MAG. MAG is based on a combination of context-based retrieval
on structured knowledge bases and graph algorithms. We evaluate MAG on 23 data
sets and in 7 languages. Our results show that the best approach trained on
English datasets (PBOH) achieves a micro F-measure that is up to 4 times worse
on datasets in other languages. MAG, on the other hand, achieves
state-of-the-art performance on English datasets and reaches a micro F-measure
that is up to 0.6 higher than that of PBOH on non-English languages. Comment: Accepted in K-CAP 2017: Knowledge Capture Conference
Multiple Retrieval Models and Regression Models for Prior Art Search
This paper presents the system called PATATRAS (PATent and Article Tracking,
Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach
presents three main characteristics: 1. The usage of multiple retrieval models
(KL, Okapi) and term index definitions (lemma, phrase, concept) for the three
languages considered in the present track (English, French, German) producing
ten different sets of ranked results. 2. The merging of the different results
based on multiple regression models using an additional validation set created
from the patent collection. 3. The exploitation of patent metadata and of the
citation structures for creating restricted initial working sets of patents and
for producing a final re-ranking regression model. As we exploit specific
metadata of the patent documents and the citation relations only at the
creation of initial working sets and during the final post ranking step, our
architecture remains generic and easy to extend.
Achieving Secure and Efficient Cloud Search Services: Cross-Lingual Multi-Keyword Rank Search over Encrypted Cloud Data
The multi-user multi-keyword ranked search scheme in arbitrary language is a
novel multi-keyword rank searchable encryption (MRSE) framework based on the
Paillier Cryptosystem with Threshold Decryption (PCTD). Compared to previous
MRSE schemes built on the k-nearest neighbor searchable encryption
(KNN-SE) algorithm, it can mitigate some drawbacks and achieve better
performance in terms of functionality and efficiency. Additionally, it does not
require a predefined keyword set and supports keywords in arbitrary languages.
However, because the new MRSE scheme relies on exact keyword matching,
multilingual search is confined to each language separately and queries cannot
be matched across languages. In this paper, we propose a cross-lingual multi-keyword rank
search (CLRSE) scheme that eliminates the language barrier and achieves
semantic extension using the Open Multilingual Wordnet. Our CLRSE scheme
also realizes intelligent and personalized search through flexible keyword and
language preference settings. We evaluate the performance of our scheme in
terms of security, functionality, precision and efficiency via extensive
experiments.
Extending, trimming and fusing WordNet for technical documents
This paper describes a tool for the automatic
extension and trimming of a multilingual
WordNet database for cross-lingual retrieval
and multilingual ontology building in
intranets and domain-specific document
collections. Hierarchies, built from
automatically extracted terms and combined
with the WordNet relations, are trimmed
with a disambiguation method based on the
document salience of the words in the
glosses. The disambiguation is tested in a
cross-lingual retrieval task, showing
considerable improvement (7%-11%). The
condensed hierarchies can be used as
browse-interfaces to the documents
complementary to retrieval.
Archaeology in the Digital Age: From Paper to Databases
Research units in archaeology often manage large and precious archives
containing various documents, including reports on fieldwork, scholarly studies
and reference books. These archives are of course invaluable, recording decades
of work, but are generally hard to consult and access. In this context,
digitizing full text documents is not enough: information must be formalized,
structured and easy to access thanks to friendly user interfaces. Comment: Digital Humanities 2015, Jun 2015, Sydney, Australia. 2015,
Proceedings of the conference "Digital Humanities 2015"