30,922 research outputs found

    Effective semantic-based keyword search over relational databases for knowledge discovery

    Keyword-based search has been popularized by Internet web search engines such as Google, the most commonly used search engine for locating information on the web. Traditional database management systems, on the other hand, offer powerful query languages such as SQL but do not provide keyword-based search similar to that of web search engines. The amount of text data in relational databases is massive and growing fast. This increases the need for non-technical users to be able to search for such information using simple keyword search, just as they would search for text documents on the web. Keyword search over relational databases (KSRDBs) enables ordinary users to query relational databases by simply submitting keywords, without having to know any SQL or the underlying structure of the data. In this research work, our primary focus is to enhance the effectiveness of keyword search over relational databases using semantic web technologies. We have also addressed some of the issues with the effectiveness of current keyword search over relational databases. In particular, we address the following: We have improved the existing state-of-the-art ranking functions (gaining a significantly higher precision/recall curve) by incorporating the query keywords' proximity and the query keywords' quadgrams of the text attributes with long strings into the scoring function. We have adopted a novel approach to making keyword search recommendations based on the text attributes in which the search terms were found, without relying on the user's past search criteria. A proof of concept (POC) prototype system called TupleRecommender has been implemented based on this approach. We have designed and implemented a POC prototype system called database semantic search explorer (DBSemSXplorer), which can answer traditional keyword searches over relational databases more effectively and with a better presentation of search results. This system is based on semantic web technologies and is equipped with the faceted search and inference capability of the Semantic Web to ease the task of knowledge discovery for the end user.
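    The idea of folding keyword proximity and quadgram matching into a scoring function can be illustrated with a minimal sketch. It is not the ranking function from the abstract above: it assumes that "quadgrams" means character 4-grams, that proximity is measured as the shortest token window covering all keywords, and that the two signals are mixed with a hypothetical weight `alpha`. All function names are illustrative.

```python
def char_quadgrams(text):
    """Return the set of character 4-grams of a lowercased string."""
    s = text.lower()
    return {s[i:i + 4] for i in range(len(s) - 3)}


def quadgram_overlap(keyword, attribute_text):
    """Fraction of the keyword's 4-grams that also occur in the attribute text."""
    grams = char_quadgrams(keyword)
    if not grams:
        return 0.0
    return len(grams & char_quadgrams(attribute_text)) / len(grams)


def min_window(tokens, keywords):
    """Length in tokens of the shortest window covering every keyword, or None."""
    needed = set(keywords)
    best = None
    for start, tok in enumerate(tokens):
        if tok not in needed:
            continue
        seen = set()
        for end in range(start, len(tokens)):
            if tokens[end] in needed:
                seen.add(tokens[end])
            if seen == needed:
                width = end - start + 1
                best = width if best is None else min(best, width)
                break
    return best


def score(attribute_text, keywords, alpha=0.5):
    """Mix a proximity signal and a quadgram signal (alpha is an assumed weight)."""
    keywords = sorted({k.lower() for k in keywords})
    if not keywords:
        return 0.0
    tokens = attribute_text.lower().split()
    window = min_window(tokens, keywords)
    # Proximity is 1.0 when the keywords are adjacent and shrinks as they spread out.
    proximity = 0.0 if window is None else len(keywords) / window
    overlap = sum(quadgram_overlap(k, attribute_text) for k in keywords) / len(keywords)
    return alpha * proximity + (1 - alpha) * overlap
```

    In this sketch, a long text attribute in which the keywords appear next to each other scores higher than one in which they are far apart, and the quadgram term still gives partial credit to a misspelled keyword such as "databse".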

    Effective keyword query structuring using NER for XML retrieval

    Purpose: A more effective way of searching XML databases is to use structured queries. However, using query languages to express queries proves difficult for most users, since it requires learning a query language and knowing the underlying data schema. On the other hand, the success of web search engines has made many users familiar with keyword search, and they therefore prefer a keyword search query interface for searching XML data. The purpose of this paper is to propose and evaluate XKQSS, a query structuring method that moves the task of generating structured queries from the user to the search engine while retaining the simple keyword search query interface. Design/methodology/approach: Existing query structuring approaches require users to provide structural hints in their input keyword queries even though their interface is keyword-based. Other problems with existing systems include their inability to take keyword query ambiguities into consideration during query structuring and the question of how to select the generated structured query that best represents a given keyword query. To address these problems, this study allows users to submit a schema-independent keyword query, uses Named Entity Recognition (NER) to categorize query keywords in order to resolve query ambiguities, and computes semantic information for a node from its data content. Algorithms are proposed that find user search intentions and convert them into a set of ranked structured queries. Findings: Experiments with the SIGMOD and IMDB datasets were conducted to evaluate the effectiveness of the method. The experimental results show that XKQSS is about 20% more effective in terms of return node identification than XReal, a state-of-the-art system for XML retrieval. Originality/value: Existing systems do not take keyword query ambiguities into account. XKQSS includes two NER-based guidelines that help to resolve these ambiguities before converting the submitted query. It also includes a ranking function that computes a score for each generated query using both semantic information and data statistics, as opposed to the data-statistics-only approach used by existing systems.

    Enabling Keyword Search on Linked Data Repositories: An Ontology-Based Approach

    The Web is experiencing a continuous change that is leading to the realization of the Semantic Web. Initiatives such as Linked Data have made a huge amount of structured information publicly available, encouraging the rest of the Internet community to tag their resources with it. Unfortunately, the amount of interlinked domains and information is so big that handling it efficiently has become really difficult for final users. Thus, we have to provide them with tools to search for the needed resources in an easy way. In this paper, we propose an approach to provide users with different domain views on a general data repository, enabling them to perform both keyword and refinement searches. Our system exploits the knowledge stored in ontologies to 1) perform efficient keyword searches over a specified domain, and 2) refine the user's domain searches. In this way, we enable the definition of different semantic views on Linked Data datasets without having to change the original semantics. We present a prototype of our approach that focuses on the case of DBpedia, which provides a semantic way to access Wikipedia.

    Towards a learning analytics approach for supporting discovery and reuse of OER: an approach based on Social Networks Analysis and Linked Open Data

    The OER movement poses challenges inherent to discovering and reusing digital educational materials from highly heterogeneous and distributed digital repositories. Search engines on today's Web of documents are based on keyword queries. They do not provide a sufficiently comprehensive solution for answering queries that permit personalization of open educational materials. To find OER on the Web today, users must first be well informed about which OER repositories potentially contain the data they want and what data model describes these datasets, before using this information to create structured queries. Learning analytics requires not only retrieving useful information and knowledge about educational resources, learning processes and relations among learning agents, but also transforming the gathered data into actionable and interoperable information. Linked Data is considered one of the most effective alternatives for creating global shared information spaces. It has become an interesting approach for discovering and enriching open educational resources data, as well as for achieving semantic interoperability and reuse between multiple OER repositories. In this work, an approach based on Semantic Web technologies, the Linked Data guidelines, and Social Network Analysis methods is proposed as a fundamental way of describing, analyzing and visualizing knowledge sharing in OER initiatives.

    A Hybrid Approach to Finding Relevant Social Media Content for Complex Domain Specific Information Needs

    While contemporary semantic search systems offer to improve classical keyword-based search, they are not always adequate for complex domain specific information needs. The domain of prescription drug abuse, for example, requires knowledge of both ontological concepts and 'intelligible constructs' not typically modeled in ontologies. These intelligible constructs convey essential information that includes notions of intensity, frequency, interval, dosage and sentiments, which could be important to the holistic needs of the information seeker. We present a hybrid approach to domain specific information retrieval (or knowledge-aware search) that integrates ontology-driven query interpretation with synonym-based query expansion and domain specific rules, to facilitate search in social media. Our framework is based on a context-free grammar (CFG) that defines the query language of constructs interpretable by the search system. The grammar provides two levels of semantic interpretation: 1) a top-level CFG that facilitates retrieval of diverse textual patterns, which belong to broad templates and 2) a low-level CFG that enables interpretation of certain specific expressions that belong to such patterns. These low-level expressions occur as concepts from four different categories of data: 1) ontological concepts, 2) concepts in lexicons (such as emotions and sentiments), 3) concepts in lexicons with only partial ontology representation, called lexico-ontology concepts (such as side effects and routes of administration (ROA)), and 4) domain specific expressions (such as date, time, interval, frequency and dosage) derived solely through rules. Our approach is embodied in a novel Semantic Web platform called PREDOSE developed for prescription drug abuse epidemiology. Keywords: Knowledge-Aware Search, Ontology, Semantic Search, Background Knowledge, Context-Free Grammar. Comment: Accepted for publication: Journal of Web Semantics, Elsevier.

    Schema-aware keyword search on linked data

    Keyword search is a popular technique for querying the ever growing repositories of RDF graph data on the Web. This is because users do not need to master complex query languages (e.g., SQL, SPARQL) or know the underlying structure of the data on the Web to compose their queries. Keyword search is simple and flexible. However, it is at the same time ambiguous, since a keyword query can be interpreted in different ways. This feature of keyword search poses at least two challenges: (a) identifying relevant results among a multitude of candidate results, and (b) dealing with the performance scalability issue of the query evaluation algorithms. In the literature, multiple schema-unaware approaches are proposed to cope with the above challenges. Some of them identify as relevant results only those candidate results which maintain the keyword instances in close proximity. Other approaches filter out irrelevant results using their structural characteristics, or rank and top-k process the retrieved results based on statistical information about the data. In any case, these approaches cannot disambiguate the query to identify the intent of the user, and they cannot scale satisfactorily when the size of the data and the number of query keywords grow. In recent years, different approaches have tried to exploit the schema (structural summary) of the RDF (Resource Description Framework) data graph to address the problems above. In this context, an original hierarchical clustering technique is introduced in this dissertation. This approach clusters the results based on a semantic interpretation of the keyword instances and takes advantage of relevance feedback from the user. The clustering hierarchy uses pattern graphs, which are structured queries, and clusters together result graphs with the same structure. Pattern graphs represent possible interpretations of the keyword query. By navigating through the hierarchy, the user can select the pattern graph which is relevant to the user's intent. Nevertheless, structural summaries are approximate representations of the data and, therefore, might return empty answers or miss results which are relevant to the user's intent. To address this issue, a novel approach is presented which combines the use of the structural summary and the user feedback with a relaxation technique for pattern graphs to extract additional results potentially of interest to the user. Query caching and multi-query optimization techniques are leveraged for the efficient evaluation of relaxed pattern graphs. Although the approaches which consider the structural summary of the data graph are promising, they require interaction with the user. It is claimed in this dissertation that without additional information from the user, it is not possible to produce results of high quality from keyword search on RDF data with the existing techniques. In this regard, an original keyword query language on RDF data is introduced which allows the user to convey an intention flexibly and effortlessly by specifying cohesive keyword groups. A cohesive group of keywords in a query indicates that its keywords should form a cohesive unit in the query results. It is experimentally demonstrated that cohesive keyword queries improve the result quality effectively and prune the search space of the pattern graphs efficiently compared to traditional keyword queries. Most importantly, these benefits are achieved while retaining the simplicity and the convenience of traditional keyword search. The last issue addressed in this dissertation is the diversification problem for keyword search on RDF data. The goal of diversification is to trade off relevance and diversity in the result set of a keyword query in order to minimize the dissatisfaction of the average user. Novel metrics are developed for assessing relevance and diversity, along with techniques for the generation of a relevant and diversified set of query interpretations for a keyword query on an RDF data graph. Experimental results show the effectiveness of the metrics and the efficiency of the approach.
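    The relevance/diversity trade-off described at the end of this abstract can be made concrete with a small sketch. The dissertation's own metrics are not given here, so this example swaps in a standard technique, greedy maximal marginal relevance (MMR), with Jaccard similarity as a stand-in for redundancy between query interpretations. The feature sets, relevance scores, and the weight `lam` are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def diversify(candidates, relevance, k, lam=0.7):
    """Greedily pick up to k candidates, balancing relevance against redundancy.

    candidates: dict mapping an interpretation name to its set of features
    relevance:  dict mapping the same names to a relevance score
    lam:        assumed weight; 1.0 means pure relevance, lower values favor diversity
    """
    selected = []
    remaining = set(candidates)
    while remaining and len(selected) < k:
        def mmr(name):
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max(
                (jaccard(candidates[name], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[name] - (1 - lam) * redundancy

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected
```

    With `lam=1.0` the selection reduces to a plain top-k by relevance. Lowering `lam` lets a less relevant but structurally different interpretation displace a near-duplicate of one already chosen.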

    Semantic keyword search for expert witness discovery

    In the last few years, there has been an increase in the amount of information stored in semantically enriched knowledge bases, represented in RDF format. These improve the accuracy of search results when the queries are semantically formal. However, framing such queries is impractical for inexperienced users because it requires specialist knowledge of ontology and syntax. In this paper, we explore an approach that automates the process of converting a conventional keyword search into a semantically formal query in order to find an expert in a semantically enriched knowledge base. A case study on expert witness discovery for the resolution of a legal dispute is chosen as the domain of interest, and a system named SKengine is implemented to illustrate the approach. As well as providing an easy user interface, SKengine is shown in our experiment to retrieve expert witness information with higher precision and higher recall than another system with the same interface that is implemented using a vector model approach.
