Dictionaries merger for text expansion in question answering

Author: Jacquemin Bernard
Publication date: 01/01/2004

This paper presents an original way to add new data to a reference dictionary from several other lexical resources without losing any consistency. This operation is carried out in order to obtain lexical information classified by the sense of the entry. This classification makes it possible to enrich utterances (in QA, the queries) according to their meaning and to reduce noise. An analysis of the problems encountered shows the value of this method and highlights the points that still have to be tackled. Comment: 4 p

arXiv.org e-Print Archive

CiteSeerX

Crossref

Multilingual Schema Matching for Wikipedia Infoboxes

Author: Freire Juliana
Moreira Viviane
Nguyen Hoa
Nguyen Huong
Nguyen Thanh
Publication date: 01/01/2011

Recent research has taken advantage of Wikipedia's multilingualism as a resource for cross-language information retrieval and machine translation, as well as proposed techniques for enriching its cross-language structure. The availability of documents in multiple languages also opens up new opportunities for querying structured Wikipedia content, and in particular, to enable answers that straddle different languages. As a step towards supporting such queries, in this paper, we propose a method for identifying mappings between attributes from infoboxes that come from pages in different languages. Our approach finds mappings in a completely automated fashion. Because it does not require training data, it is scalable: not only can it be used to find mappings between many language pairs, but it is also effective for languages that are under-represented and lack sufficient training samples. Another important benefit of our approach is that it does not depend on syntactic similarity between attribute names, and thus, it can be applied to language pairs that have distinct morphologies. We have performed an extensive experimental evaluation using a corpus consisting of pages in Portuguese, Vietnamese, and English. The results show that not only does our approach obtain high precision and recall, but it also outperforms state-of-the-art techniques. We also present a case study which demonstrates that the multilingual mappings we derive lead to substantial improvements in answer quality and coverage for structured queries over Wikipedia content. Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

Web 2.0, language resources and standards to automatically build a multilingual named entity lexicon

Author: Ferrández Sergio
Monachini Monica
Muñoz Rafael
Toral Antonio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/06/2011

This paper proposes to advance the current state of the art in automatic Language Resource (LR) building by taking into consideration three elements: (i) the knowledge available in existing LRs, (ii) the vast amount of information available from the collaborative paradigm that has emerged from the Web 2.0 and (iii) the use of standards to improve interoperability. We present a case study in which a set of LRs for different languages (WordNet for English and Spanish and Parole-Simple-Clips for Italian) are extended with Named Entities (NE) by exploiting Wikipedia and the aforementioned LRs. The practical result is a multilingual NE lexicon connected to these LRs and to two ontologies: SUMO and SIMPLE. Furthermore, the paper addresses interoperability, an important problem that currently affects the Computational Linguistics area, by making use of the ISO LMF standard to encode this lexicon. The different steps of the procedure (mapping, disambiguation, extraction, NE identification and postprocessing) are comprehensively explained and evaluated. The resulting resource contains 974,567, 137,583 and 125,806 NEs for English, Spanish and Italian respectively. Finally, in order to check the usefulness of the constructed resource, we apply it to a state-of-the-art Question Answering system and evaluate its impact; the NE lexicon improves the system’s accuracy by 28.1%. Compared to previous approaches to building NE repositories, the current proposal represents a step forward in terms of automation, language independence, amount of NEs acquired and richness of the information represented.

DCU Online Research Access Service

Semantic Question Answering System over Linked Data using Relational Patterns

Author: Akimaliev Marlen
Dogdu Erdogan
Hakimov Sherzod
Tunc Hakan
Publication date: 01/01/2013

Hakimov S, Tunc H, Akimaliev M, Dogdu E. Semantic Question Answering System over Linked Data using Relational Patterns. In: EDBT/ICDT LWDM 2013. 2013. Question answering is the task of answering questions in natural language. The Linked Data project and the Semantic Web community have made it possible for us to query structured knowledge bases like DBpedia and YAGO. However, only expert users with knowledge of RDF and ontology definitions can build correct SPARQL queries for querying knowledge bases formally. In this paper, we present a method for mapping natural language questions to ontology-based structured queries to retrieve direct answers from open knowledge bases (linked data). Our tool is based on translating natural language questions into RDF triple patterns using the dependency tree of the question text. In addition, our method uses relational patterns extracted from the Web. We tested our tool using questions from QALD-2, the Question Answering over Linked Data challenge track, and found promising preliminary results.

Publications at Bielefeld University