45 research outputs found
Hybrid Search: Effectively Combining Keywords and Semantic Searches
This paper describes hybrid search, a search method supporting both document and knowledge retrieval via the flexible combination of ontology-based search and keyword-based matching. Hybrid search smoothly copes with lack of semantic coverage of document content, which is one of the main limitations of current semantic search methods. In this paper we define hybrid search formally, discuss its compatibility with the current semantic trends and present a reference implementation: K-Search. We then show how the method outperforms both keyword-based search and pure semantic search in terms of precision and recall in a set of experiments performed on a collection of about 18,000 technical documents. Experiments carried out with professional users show that users understand the paradigm and consider it very powerful and reliable. K-Search has been ported to two applications released at Rolls-Royce plc for searching technical documentation about jet engines.
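As a rough illustration of combining the two kinds of matching, the following minimal sketch keeps only documents that satisfy both a keyword condition and an ontology-based condition. It is not K-Search: the function names, the toy documents, the concept labels, and the choice to intersect the two result sets are all assumptions made for this example.

```python
def keyword_match(docs, terms):
    """Return ids of documents whose text contains every keyword (case-insensitive)."""
    wanted = [t.lower() for t in terms]
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(t in text.lower().split() for t in wanted)
    }


def semantic_match(annotations, concept):
    """Return ids of documents annotated with the given ontology concept."""
    return {doc_id for doc_id, concepts in annotations.items() if concept in concepts}


def hybrid_search(docs, annotations, terms, concept):
    """Keep documents that satisfy both the keyword and the semantic condition."""
    return keyword_match(docs, terms) & semantic_match(annotations, concept)


# Hypothetical toy data: free text plus (possibly incomplete) concept annotations.
docs = {
    1: "blade crack found during inspection",
    2: "crack in casing",
    3: "blade replaced",
}
annotations = {1: {"Blade"}, 2: {"Casing"}, 3: {"Blade"}}

result = hybrid_search(docs, annotations, ["crack"], "Blade")
```

Here the keyword condition covers the part of the query ("crack") for which the toy annotations carry no concept, which is the situation the abstract calls lack of semantic coverage.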
Using TREC for cross-comparison between classic IR and ontology-based search models at a Web scale
The construction of standard datasets and benchmarks to evaluate ontology-based search approaches and to compare them against baseline IR models is a major open problem in the semantic technologies community. In this paper we propose a novel evaluation benchmark for ontology-based IR models based on an adaptation of the well-known Cranfield paradigm (Cleverdon, 1967) traditionally used by the IR community. The proposed benchmark comprises: 1) a text document collection, 2) a set of queries and their corresponding document relevance judgments and 3) a set of ontologies and Knowledge Bases covering the query topics. The document collection and the set of queries and judgments are taken from one of the most widely used datasets in the IR community, the TREC Web track. As a use case example we apply the proposed benchmark to compare a real ontology-based search model (Fernandez, et al., 2008) against the best IR systems of the TREC 9 and TREC 2001 competitions. A deep analysis of the strengths and weaknesses of this benchmark and a discussion of how it can be used to evaluate other ontology-based search systems is also included at the end of the paper.
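In a Cranfield-style evaluation, a system's ranked output for each query is scored against that query's relevance judgments. The sketch below computes set-based precision and recall for a single query; the function name and the document ids are hypothetical and are not taken from the TREC data or from the paper.

```python
def precision_recall(retrieved, relevant):
    """Score one query: retrieved is a list of doc ids, relevant is the judged-relevant set."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgments for one query: d1, d3 and d5 are relevant.
precision, recall = precision_recall(["d1", "d2", "d3", "d4"], {"d1", "d3", "d5"})
```

With these toy inputs, two of the four retrieved documents are relevant and two of the three relevant documents are retrieved.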
A negation query engine for complex query transformations
Natural language interfaces to ontologies allow users to query the system using natural language queries. These systems take a natural language query as input and transform it into an equivalent formal query to retrieve the desired information from ontologies. Existing natural language interfaces to ontologies offer only limited support for negation queries. This paper proposes a negation query handling engine which can handle more complex natural language queries than existing systems. The proposed engine interprets the intent of the user query using an algorithm governed by a set of techniques and transformation rules. The proposed engine was evaluated using the Mooney dataset and the AquaLog dataset, and it showed encouraging results.
Semantic Clustering of Search Engine Results
This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantic similarities among documents and applies a spreading activation technique in order to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together rather than clustering documents based on similar terms. A prototype is implemented and several experiments are conducted to test the proposed solution. The results of the experiments confirmed that the proposed solution achieves remarkable results in terms of precision.
An approach for semantic information retrieval in heterogeneous semi-structured documents
This paper presents SHIRI-Querying, an approach for semantic information retrieval in semi-structured documents. We propose a solution to compensate for the incompleteness and imprecision of annotations at query time. This solution relies on two types of elementary reformulations that exploit the notion of aggregation and the structure of the documents. We present the DREQ algorithm, which combines these elementary transformations to build ordered reformulations of the user query. A study of our approach on two real corpora shows that the reformulations considerably increase recall and that precision is higher for the first answers returned.
Vectorised Spreading Activation algorithm for centrality measurement
Spreading Activation is a family of graph-based algorithms widely used in areas such as information retrieval, epidemic models, and recommender systems. In this paper we introduce a novel Spreading Activation (SA) method that we call Vectorised Spreading Activation (VSA). VSA algorithms, like “traditional” SA algorithms, iteratively propagate the activation from the initially activated set of nodes to the other nodes in a network through outward links. The level of a node’s activation could be used as a centrality measurement in accordance with the dynamic model-based view of centrality, which focuses on the outcomes for nodes in a network where something is flowing from node to node across the edges. Representing the activation by vectors allows the use of information about the various dimensions of the flow and the dynamics of the flow. In this capacity, VSA algorithms can model a multitude of complex multidimensional network flows. We present the results of numerical simulations on small synthetic social networks and multidimensional network models of folksonomies which show that the results of VSA propagation are more sensitive to the positions of the initial seed and to the community structure of the network than the results produced by traditional SA algorithms. We tentatively conclude that VSA methods could be instrumental in developing scalable and computationally efficient algorithms that combine the computation of centrality indexes with the detection of community structures in networks. Based on our preliminary results and on improvements made over previous studies, we foresee advances in the state of the art of this family of algorithms and in their application to centrality measurement.
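The iterative propagation described above can be sketched for the scalar ("traditional") case as follows. This is not the VSA method of the paper: activations here are single numbers rather than vectors, and the decay factor, the equal split among successors, the fixed iteration count, and all names are assumptions chosen for illustration.

```python
from collections import defaultdict


def spreading_activation(graph, seeds, decay=0.5, iterations=3):
    """Propagate activation from seed nodes along outward links.

    graph: dict mapping each node to a list of its successors.
    seeds: dict mapping initially activated nodes to their activation.
    Returns the accumulated activation per node.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(iterations):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            successors = graph.get(node, [])
            if not successors:
                continue
            # Each successor receives an equal, decayed share of the activation.
            share = decay * value / len(successors)
            for succ in successors:
                next_frontier[succ] += share
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)


# Hypothetical three-node network with a single seed node "a".
graph = {"a": ["b", "c"], "b": ["c"], "c": []}
levels = spreading_activation(graph, {"a": 1.0}, decay=0.5, iterations=2)
```

In this toy network node "c" ends with a higher accumulated activation than "b" because it receives flow both directly from "a" and indirectly through "b", which is the sense in which the activation level can serve as a centrality score.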