Search CORE

4 research outputs found

Combining Named Entities with WordNet and Using Query-Oriented Spreading Activation for Semantic Text Search

Authors: Cao Tru H., Le Tuan M. V., Ngo Vuong M.
Publication date: 20/07/2018

Purely keyword-based text search is not satisfactory because named entities and WordNet words are also important elements that define the content of a document or query in which they occur. Named entities have ontological features, namely their aliases, classes, and identifiers. Words in WordNet also have ontological features, namely their synonyms, hypernyms, hyponyms, and senses. Those features of concepts may be hidden from their textual appearance. In addition, there are related concepts that do not appear in a query but can bring out its meaning if they are added. We propose an ontology-based generalized Vector Space Model for semantic text search. It exploits ontological features of named entities and WordNet words, and develops a query-oriented spreading activation algorithm to expand queries. It also combines the advantages of different ontologies for semantic annotation and searching. Experiments on a benchmark dataset show that, in terms of the MAP measure, our model is 42.5% better than the purely keyword-based model, and 32.3% and 15.9% better, respectively, than the models using only WordNet or only named entities.
Keywords: semantic search, spreading activation, ontology, named entity, WordNet.
Comment: 6 pages, Accepted by RIVF. arXiv admin note: substantial text overlap with arXiv:1807.05579; text overlap with arXiv:1807.0557

arXiv.org e-Print Archive

Semantic Search by Latent Ontological Features

Authors: Cao Tru H., Ngo Vuong M.
Publication date: 15/07/2018

Both named entities and keywords are important in defining the content of a text in which they occur. In particular, people often use named entities when searching for information. However, named entities have ontological features, namely their aliases, classes, and identifiers, which are hidden from their textual appearance. We propose ontology-based extensions of the traditional Vector Space Model that explore different combinations of those latent ontological features with keywords for text retrieval. Our experiments on benchmark datasets show that the proposed models achieve better search quality than the purely keyword-based model, and demonstrate their advantages for both text retrieval and the representation of documents and queries.
Comment: 17 pages, Accepted by New Generation Computing (2012

arXiv.org e-Print Archive

A Generalized Vector Space Model for Ontology-Based Information Retrieval

Authors: Cao Tru H., Ngo Vuong M.
Publication date: 20/07/2018

Named entities (NE) are objects referred to by names, such as people, organizations, and locations. Named entities and keywords are both important to the meaning of a document. We propose a generalized vector space model that combines named entities and keywords. The model takes into account different ontological features of named entities, namely aliases, classes, and identifiers. Moreover, we use entity classes to represent the latent information of interrogative words in Wh-queries, which is ignored in traditional keyword-based searching. We have implemented and tested the proposed model on a TREC dataset, as presented and discussed in the paper.
Keywords: information retrieval, vector space model, ontology, named entity, keyword.
Comment: 5 pages, in Vietnamese. Accepted by Vietnamese Journal on Information Technologies and Communication

arXiv.org e-Print Archive

Semantic Search using Spreading Activation based on Ontology

Author: Vuong Ngo Minh
Publication date: 09/05/2019

Text document retrieval systems currently face many challenges in exploring the semantics of queries and documents. Each query implies information that does not appear in the query itself, yet the user also expects documents related to that information. A disadvantage of previous spreading activation algorithms is that many irrelevant concepts may be added to the query. This paper proposes a novel algorithm that activates and adds to the query only those named entities that are related to the original entities in the query and to the explicit relations in the query.
Comment: 21 pages, in Vietnamese

arXiv.org e-Print Archive
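
The last abstract above describes spreading activation that adds to a query only entities linked to the original query entities through relations stated in the query. The idea can be illustrated with a minimal Python sketch. This is not the author's algorithm: the triple store `TRIPLES`, the relation names, and the function `expand_query` are all hypothetical, chosen only to show how restricting activation to the query's own relations keeps unrelated concepts out of the expansion.

```python
# Hypothetical toy fact base of (subject, relation, object) triples.
# A real system would draw these from an ontology or knowledge base.
TRIPLES = [
    ("Hanoi", "capital_of", "Vietnam"),
    ("Hanoi", "located_on", "Red River"),
    ("Vietnam", "member_of", "ASEAN"),
]


def expand_query(entities, relations, triples):
    """Return the query entities plus any entity linked to one of them
    by a relation that is explicitly named in the query (one hop only)."""
    expanded = list(entities)
    for subj, rel, obj in triples:
        if rel not in relations:
            # Relation is not mentioned in the query: do not activate.
            continue
        if subj in entities and obj not in expanded:
            expanded.append(obj)
        elif obj in entities and subj not in expanded:
            expanded.append(subj)
    return expanded


# Query about "Vietnam" that mentions only the "capital_of" relation.
print(expand_query(["Vietnam"], {"capital_of"}, TRIPLES))
```

In this toy setup, "ASEAN" is linked to "Vietnam" but is not added, because `member_of` is not a relation in the query; an unconstrained spreading step would have activated it.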