1,103 research outputs found

    Visual exploration and retrieval of XML document collections with the generic system X2

    This article reports on the XML retrieval system X2, which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2, which distinguishes it from other visual query systems for XML, is that it supports various levels of detail in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed.

    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications such as search engines, has familiarized casual users with using keyword queries to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.

    MA Algorithm to Generate Semantic Web Related Clustered Hierarchy for Keyword Search

    Keyword search in XML documents is based on the notion of lowest common ancestors in the labeled-tree model of XML documents. It has recently gained a lot of research interest in the database community. In this paper we propose the Modified Active (MA) algorithm, which is an improvement over the Active clustering algorithm. The algorithm takes into consideration the entity aspect of the nodes to find the level of the node pertaining to a keyword input by the user. A portion of the Bibliography database is used to experimentally evaluate the Modified Active algorithm. Evaluation results show that the MA algorithm generates clusters faster than the Active algorithm, which increases the efficiency of the system.
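
    The lowest-common-ancestor notion mentioned in this abstract can be illustrated with a minimal sketch. It is not the MA or Active algorithm. It assumes one common reading of the semantics, in which an answer is a node whose subtree contains every query keyword while none of its child subtrees does. The sample document, the function name `slca`, and the whitespace tokenization of element text are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the paper's Bibliography database.
DOC = """<bib>
  <paper><title>keyword search</title><author>smith</author></paper>
  <paper><title>xml indexing</title><author>smith</author></paper>
</bib>"""


def slca(root, keywords):
    """Return elements whose subtree covers all keywords and whose
    child subtrees do not (assumed 'smallest LCA' reading)."""
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(node):
        # Keywords matched directly in this element's own text
        # (tail text is ignored in this sketch).
        covered = wanted & set((node.text or "").lower().split())
        child_complete = False
        for child in node:
            child_covered = visit(child)
            child_complete = child_complete or child_covered == wanted
            covered |= child_covered
        # Report the node only if no child already covers the whole query.
        if covered == wanted and not child_complete:
            results.append(node)
        return covered

    visit(root)
    return results


root = ET.fromstring(DOC)
for hit in slca(root, ["keyword", "smith"]):
    print(hit.tag, "->", hit.find("title").text)
```

    With this sample data, the query for "keyword" and "smith" returns the first paper element, whereas a query for "xml" and "keyword" is only covered at the bib root, which shows how loosely related matches push the answer up the tree.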

    Semantics and result disambiguation for keyword search on tree data

    Keyword search is a popular technique for searching tree-structured data (e.g., XML, JSON) on the web because it frees the user from learning a complex query language and the structure of the data sources. However, the convenience of keyword search comes with drawbacks. The imprecision of keyword queries usually results in a very large number of results, of which only very few are relevant to the query. Multiple previous approaches have tried to address this problem. Some of them exploit structural and semantic properties of the tree data in order to filter out irrelevant results, while others use a scoring function to rank the candidate results. These are not easy tasks, though, and in both cases relevant results might be missed and users might spend a significant amount of time searching for their intended result in a plethora of candidates. Another drawback of keyword search on tree data, also due to the incapacity of keyword queries to precisely express the user intent, is that the query answer may contain different types of meaningful results even though the user is interested in only some of them. Both problems of keyword search on tree data are addressed in this dissertation. First, an original approach for answering keyword queries is proposed. This approach extracts structural patterns of the query matches and reasons with them in order to return meaningful results ranked with respect to their relevance to the query. The proposed semantics performs comparisons between patterns of results by using different types of homomorphisms between the patterns. These comparisons are used to organize the patterns into a graph of patterns, which is leveraged to determine ranking and filtering semantics. The experimental results show that the approach produces query results of higher quality compared to the previous ones. To address the second problem, an original approach for clustering the keyword search results on tree data is introduced. The clustered output allows the user to focus on a subset of the results, and to save time and effort while looking for the relevant results. The approach performs clustering at different levels of granularity to group similar results together effectively. The similarity of the results and result clusters is decided using relations on structural patterns of the results, defined based on homomorphisms between path patterns. An originality of the clustering approach is that the clusters are ranked at different levels of granularity to quickly guide the user to the relevant result patterns. An efficient stack-based algorithm is presented for generating result patterns and constructing the clustering hierarchy. The extensive experimentation with multiple real datasets shows that the algorithm is fast and scalable. It also shows that the clustering methodology allows the users to effectively retrieve their intended results, and outperforms a recent state-of-the-art clustering approach. In order to tackle the second problem from a different angle, diversifying the results of keyword search is addressed. Diversification aims to provide the users with a ranked list of results which balances the relevance and redundancy of the results. Measures for quantifying the relevance and dissimilarity of result patterns are presented, and a heuristic for generating a diverse set of results using these metrics is introduced.
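
    The trade-off between relevance and redundancy described at the end of this abstract can be sketched as a greedy selection. The abstract does not specify the dissertation's measures or heuristic, so this sketch swaps in a generic greedy scheme in the style of maximal marginal relevance. The relevance scores, the use of Jaccard distance over sets of path strings as a dissimilarity measure, the `trade_off` parameter, and all names are assumptions made for illustration.

```python
def jaccard_distance(a, b):
    """Dissimilarity of two path sets: 1 minus the Jaccard coefficient."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diversify(candidates, k, trade_off=0.5):
    """Greedily pick up to k results, weighing relevance against novelty.

    candidates: list of (name, relevance, frozenset_of_paths) tuples.
    trade_off:  1.0 ranks by relevance only, 0.0 by novelty only.
    """
    remaining = list(candidates)
    selected = []

    def score(candidate):
        _, relevance, paths = candidate
        # Novelty is the distance to the closest already selected result;
        # it defaults to 1.0 while nothing has been selected yet.
        novelty = min(
            (jaccard_distance(paths, chosen[2]) for chosen in selected),
            default=1.0,
        )
        return trade_off * relevance + (1.0 - trade_off) * novelty

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [name for name, _, _ in selected]


# Hypothetical result patterns with made-up relevance scores.
CANDIDATES = [
    ("r1", 0.90, frozenset({"bib/paper/title", "bib/paper/author"})),
    ("r2", 0.85, frozenset({"bib/paper/title", "bib/paper/author"})),
    ("r3", 0.60, frozenset({"bib/book/title", "bib/book/editor"})),
]

print(diversify(CANDIDATES, 2))                 # ['r1', 'r3']
print(diversify(CANDIDATES, 2, trade_off=1.0))  # ['r1', 'r2']
```

    With equal weighting, the less relevant but structurally different r3 displaces r2, which duplicates the pattern of r1. With `trade_off=1.0` the selection falls back to a plain relevance ranking.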

    MultimediaN E-Culture demonstrator

    The main objective of the MultimediaN E-Culture project is to demonstrate how novel semantic-web and presentation technologies can be deployed to provide better indexing and search support within large virtual collections of cultural-heritage resources. The architecture is fully based on open web standards, in particular XML, SVG, RDF/OWL and SPARQL. One basic hypothesis underlying this work is that the use of explicit background knowledge in the form of ontologies, vocabularies and thesauri is particularly useful for information retrieval in knowledge-rich domains.