343 research outputs found

    Web Queries: From a Web of Data to a Semantic Web?


    Analyzing Fuzzy Logic Computations with Fuzzy XPath

    We have recently designed a fuzzy dialect of the popular XPath language for the flexible manipulation of XML documents, implemented in a fuzzy logic language using the FLOPER tool developed in our research group. In this paper we focus on the ability of Fuzzy XPath to explore derivation trees generated by FLOPER once they are exported in XML format. It thereby serves as a debugging and analysis tool for discovering the set of fuzzy computed answers for a given goal, performing depth-first or breadth-first traversals of the associated derivation tree, finding branches that are not fully evaluated, and so on, thus reinforcing the bilateral synergies between Fuzzy XPath and FLOPER.

    Intuitionistic fuzzy XML query matching and rewriting

    With the emergence of XML as a standard for data representation, particularly on the web, the need for intelligent query languages that can operate on structurally heterogeneous XML documents has recently grown considerably. Traditional Information Retrieval and Database approaches have limitations when dealing with such scenarios, so fuzzy (flexible) approaches have become predominant. In this thesis, we propose a new approach to approximate XML query matching and rewriting that aims at soft matching of XML queries against XML data sources following different schemas. Unlike traditional querying approaches, which require exact matching, the proposed approach uses Intuitionistic Fuzzy Trees to achieve approximate (soft) query matching. With this approach, not only the exact answer to a query but also approximate answers are retrieved. Furthermore, partial results can be obtained from multiple data sources and merged to produce a single answer to a query. The proposed approach introduces a new tree similarity measure that considers the minimum and maximum degrees of similarity/inclusion of trees, based on arc matching. New techniques for soft node and arc matching are presented for matching queries against data sources with highly varied structures. A prototype was developed to test the proposed ideas. It demonstrated the ability to achieve approximate matching of pattern queries against a number of XML schemas and to rewrite the original query so that it obtains results from the underlying data sources. This was achieved through several novel algorithms, which were tested and proved efficient, with low CPU and memory cost even for a large number of data sources.
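
The idea of bounding tree inclusion from below and above by arc matching can be illustrated with a small sketch. This is not the thesis's measure, whose definition the abstract does not give. It assumes, purely for illustration, that the lower bound counts query arcs matched by parent-child arcs in the data tree and the upper bound counts query arcs matched by any ancestor-descendant pair. The names `arcs` and `inclusion_bounds` and the `(label, children)` tree encoding are hypothetical.

```python
# A tree is encoded as (label, [child trees]); this encoding is an assumption.


def arcs(tree, direct_only=True):
    """Return the set of (upper_label, lower_label) arcs in `tree`.

    With direct_only=True only parent-child arcs are returned; otherwise
    every ancestor-descendant pair is included as well.
    """
    label, children = tree
    found = set()
    for child in children:
        found.add((label, child[0]))
        child_arcs = arcs(child, direct_only)
        found |= child_arcs
        if not direct_only:
            # Every descendant of the child is also a descendant of this node.
            found |= {(label, lower) for (_, lower) in child_arcs}
    return found


def inclusion_bounds(query, data):
    """Return (lower, upper) degrees to which `query` is included in `data`.

    lower: fraction of query arcs found as parent-child arcs in `data`.
    upper: fraction of query arcs found as ancestor-descendant pairs in `data`.
    Arcs are compared by label only, so repeated labels are not distinguished.
    """
    query_arcs = arcs(query)
    if not query_arcs:
        # A single-node query has no arcs; treating it as fully included
        # is an arbitrary convention of this sketch.
        return (1.0, 1.0)
    strict = arcs(data)
    loose = arcs(data, direct_only=False)
    return (len(query_arcs & strict) / len(query_arcs),
            len(query_arcs & loose) / len(query_arcs))


query = ("book", [("title", []), ("name", [])])
data = ("book", [("title", []), ("author", [("name", [])])])
print(inclusion_bounds(query, data))  # (0.5, 1.0)
```

Here the query asks for `name` directly under `book`, while the data source nests it under `author`, so only one of two arcs matches strictly but both match once ancestor-descendant pairs are allowed.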

    The Structural Multiple and Information Satisfied Mixture of XML

    Ranking and returning the most relevant results for a query is perhaps the most common form of XML query processing. To address this problem, we first propose an elegant query relaxation framework that supports approximate queries over XML data. The answers under this framework are not forced to conform strictly to the specified query format, but may be based on properties that can be inferred from the original query. However, current proposals do not take sufficient account of structure, nor do they have the power to combine structure and content neatly to answer relaxed queries. In our solution we divide nodes into two groups: categorical attribute nodes and statistical attribute nodes. We then use a comprehensive set of experiments to demonstrate the effectiveness of our proposed approach in terms of accuracy and recall. In practical applications, it is often impossible to query XML data because the hierarchical structure of XML documents can be heterogeneous, so any misunderstanding of the document structure increases the risk of formulating unsatisfiable queries. This is especially difficult given that such queries lead to empty answers even though there are no translation errors. In addition, we propose an evidence-based acyclic graph that generates and organizes structure relaxations, and we develop an efficient assessment coefficient to evaluate the similarity relationship on structures. We therefore develop a new top-k search approach that can intelligently generate promising answers in an order related to the ranking.

    A survey on tree matching and XML retrieval

    With the increasing number of available XML documents, numerous approaches for retrieval have been proposed in the literature. They usually use the tree representation of documents and queries to process them, whether implicitly or explicitly. Although retrieving XML documents can be considered a tree matching problem between the query tree and the document trees, only a few approaches take advantage of the algorithms and methods proposed in graph theory. In this paper, we study the theoretical approaches proposed in the literature for tree matching and examine how these approaches have been adapted to XML querying and retrieval, from both an exact and an approximate matching perspective. This study allows us to highlight theoretical aspects of graph theory that have not yet been explored in XML retrieval.

    Grade And Exact In Order Of Textual Substance

    Ranking and returning the most relevant results for a query is probably the most popular form of XML query processing. To address this problem, we first propose an elegant framework of query relaxations to support difficult XML queries. The answers under this framework are not required to satisfy the precisely defined query syntax; they can instead be based on properties that can be deduced from the initial query. Existing proposals, however, lack the power to combine structure and content elegantly to answer relaxed queries. In our solution, we classify nodes into two groups, categorical nodes and statistical nodes, and use pattern-based approaches to assess the similarity relationship for each group. We then use a comprehensive set of experiments to demonstrate the effectiveness of our proposed approach in terms of accuracy and recall. Querying XML data often becomes difficult in practical applications because the hierarchical structure of XML documents can be heterogeneous, so any slight misunderstanding of the document structure increases the risk of formulating unsatisfiable queries. This is especially difficult given that such queries produce empty answers even though there are no translation errors. In addition, we design an evidence-based directed acyclic graph to generate and organize structure relaxations, and we develop an efficient evaluation parameter to assess the similarity relationship on structures. Finally, we design a new top-k approach that can intelligently generate the most promising answers in an order correlated with the ranking measure.

    Expression and Efficient Processing of Fuzzy Queries in a Graph Database Context

    Graph databases have attracted considerable interest in recent years thanks to their wide range of potential applications (e.g. social networks, biomedical networks, data stemming from the web). As has already been proposed for relational databases, defining a language that allows flexible querying of graph databases may greatly improve the usability of the data. This paper focuses on the notion of a fuzzy graph database and describes a fuzzy query language that makes it possible to handle such databases, which may be fuzzy or not, in a flexible way. This language, called FUDGE, can be used to express preference queries on fuzzy graph databases. The preferences concern i) the content of the vertices of the graph and ii) the structure of the graph. The FUDGE language is implemented in a system called SUGAR, which we present in this article. We also discuss implementation issues of the FUDGE language in SUGAR.

    Type-Ahead Search in XML data based on Improved Forward Index Structure: ATASK

    Keyword-based search is widely used in real-time applications to retrieve the required information quickly from very large datasets. Many keyword search systems and methods have already been proposed, but over time they have become inefficient in various ways. None of the previous methods supports type-ahead search over XML data, and it is not trivial to extend existing techniques to support fuzzy type-ahead search in XML data. Previous methods were not designed specifically for XML data, and because XML data consists of parent and child nodes, its format is difficult for existing methods to read. Existing methods also work directly on a single document. To overcome these limitations, an efficient XML-based type-ahead search method is needed. We recently studied one such method, called TASX (pronounced “task”), a fuzzy type-ahead search method for XML data. It searches the XML data as the user types a keyword, and it finds results even if the keyword is misspelled. Experimentally, this method performs efficiently compared with existing methods, but there is still room for improvement. Here we present an extended approach to XML-based type-ahead search, ATASX (pronounced “a task”). In this method we propose using an improved forward-index structure with the aim of improving search efficiency, reducing search time and providing good result quality.
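
The behaviour described above, matching a partially typed and possibly misspelled keyword against XML content, can be sketched in a few lines. This is not the TASX or ATASX algorithm and it does not use a forward index: it is a minimal brute-force illustration that assumes a word matches when the typed text is within a small edit distance of some prefix of that word. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def prefix_edit_distance(query: str, word: str) -> int:
    """Smallest edit distance between `query` and any prefix of `word`."""
    # prev[j] is the edit distance between the query typed so far and word[:j].
    prev = list(range(len(word) + 1))
    for i, qc in enumerate(query, start=1):
        cur = [i]
        for j, wc in enumerate(word, start=1):
            cost = 0 if qc == wc else 1
            cur.append(min(prev[j] + 1,          # drop the query character
                           cur[j - 1] + 1,       # skip a character of the word
                           prev[j - 1] + cost))  # match or substitute
        prev = cur
    # The query may stop at any prefix of the word, so take the best column.
    return min(prev)


def build_index(xml_text: str) -> dict[str, set[str]]:
    """Map each lowercase word to the tags of elements whose text contains it."""
    index: dict[str, set[str]] = {}
    for elem in ET.fromstring(xml_text).iter():
        for word in (elem.text or "").lower().split():
            index.setdefault(word, set()).add(elem.tag)
    return index


def type_ahead(index: dict[str, set[str]], partial: str, max_errors: int = 1) -> list[str]:
    """Return indexed words that `partial` fuzzily prefixes, in sorted order."""
    partial = partial.lower()
    return sorted(w for w in index
                  if prefix_edit_distance(partial, w) <= max_errors)


doc = ("<lib><book><title>Fuzzy Database Systems</title>"
       "<author>Data Smith</author></book></lib>")
index = build_index(doc)
print(type_ahead(index, "dat", max_errors=0))  # ['data', 'database']
print(type_ahead(index, "databse"))            # ['database']
```

Scanning every indexed word on each keystroke is only workable for small documents; avoiding that cost is the kind of problem an index structure is meant to address.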

    Qualitative Effects of Knowledge Rules in Probabilistic Data Integration

    One of the problems in data integration is data overlap: the fact that different data sources have data on the same real-world entities. Much development time in data integration projects is devoted to entity resolution. Advanced similarity measurement techniques are often used to remove semantic duplicates from the integration result or to solve other semantic conflicts, but it proves impossible to get rid of all semantic problems in data integration. An often-used rule of thumb states that about 90% of the development effort is devoted to solving the remaining 10% of hard cases. In an attempt to significantly decrease human effort at data integration time, we have proposed an approach that stores any remaining semantic uncertainty and conflicts in a probabilistic database, so that the integrated data can already be used meaningfully. The main development effort in our approach is devoted to defining and tuning knowledge rules and thresholds. Rules and thresholds directly affect the size and quality of the integration result. We measure integration quality indirectly by measuring the quality of answers to queries on the integrated data set in an information retrieval-like way. The main contribution of this report is an experimental investigation of the effects and sensitivity of rule definition and threshold tuning on the integration quality. The investigation shows that our approach indeed reduces development effort, rather than merely shifting it to rule definition and threshold tuning: setting rough safe thresholds and defining only a few rules suffices to produce a ‘good enough’ integration that can be meaningfully used.

    XPath-based information extraction
