85 research outputs found

    Resolving XML Semantic Ambiguity

    Abstract: XML semantic-aware processing has become a motivating and important challenge in Web data management, data processing, and information retrieval. Although XML data is semi-structured, it remains prone to lexical ambiguity, and thus requires dedicated semantic analysis and sense disambiguation processes to assign well-defined meaning to XML elements and attributes. This becomes crucial in an array of applications, ranging from semantic-aware query rewriting, semantic document clustering and classification, and schema matching to blog analysis and event detection in social networks and tweets. Most existing approaches in this context: i) ignore the problem of identifying ambiguous XML nodes, ii) only partially consider their structural relations/context, iii) use syntactic information in processing XML data regardless of the semantics involved, and iv) are static in adopting fixed disambiguation constraints, thus limiting user involvement. In this paper, we provide a new XML Semantic Disambiguation Framework, titled XSDF, designed to address each of the above limitations. It takes as input an XML document and a general-purpose semantic network, and produces as output a semantically augmented XML tree made of unambiguous semantic concepts. Experiments demonstrate the effectiveness of our approach in comparison with alternative methods.
    General Terms: Algorithms, Measurement, Performance, Design, Experimentation.
    Keywords: XML semantic-aware processing, ambiguity degree, sphere neighborhood, XML context vector, semantic network, semantic disambiguation.

    SemIndex: Semantic-Aware Inverted Index

    This paper focuses on the important problem of semantic-aware search in textual (structured, semi-structured, NoSQL) databases. This problem has emerged as a required extension of the standard containment keyword-based query to meet user needs in textual databases and IR applications. We provide here a new approach, called SemIndex, that extends the standard inverted index by constructing a tightly coupled inverted index graph that combines two main resources: a general-purpose semantic network, and a standard inverted index on a collection of textual data. We also provide an extended query model and related processing algorithms with the help of SemIndex. To investigate its effectiveness, we set up experiments to test the performance of SemIndex. Preliminary results have demonstrated the effectiveness, scalability and optimality of our approach. This study is partly funded by: Bourgogne Region program, CNRS, STIC AmSud project Geo-Climate XMine, and LAU grant SOERC-1314T012.
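
    The idea of coupling an inverted index with a semantic network can be illustrated with a minimal sketch. This is not the SemIndex data structure or its query model, which the abstract does not detail; it only assumes a plain term-to-documents index and a toy set of term links standing in for a semantic network. The documents, the links, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy collection (not taken from the paper).
DOCS = {
    1: "the car stopped near the bank",
    2: "an automobile was parked outside",
    3: "the river bank flooded",
}

# Hypothetical stand-in for a general-purpose semantic network:
# undirected links between related terms.
SEMANTIC_LINKS = [("car", "automobile"), ("bank", "shore")]


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def build_semantic_graph(links):
    """Store each undirected link in both directions."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def semantic_search(term, index, graph):
    """Return ids of documents containing the term or a directly linked term."""
    terms = {term} | graph.get(term, set())
    results = set()
    for t in terms:
        results |= index.get(t, set())
    return results


index = build_inverted_index(DOCS)
graph = build_semantic_graph(SEMANTIC_LINKS)
print(sorted(semantic_search("car", index, graph)))
```

    A plain containment query for "car" would reach only document 1 here; following the link to "automobile" also reaches document 2. Expanding by one hop at query time is a simplification chosen for brevity.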

    A survey on tree matching and XML retrieval

    With the increasing number of available XML documents, numerous approaches for retrieval have been proposed in the literature. They usually use the tree representation of documents and queries to process them, whether implicitly or explicitly. Although retrieving XML documents can be considered a tree matching problem between the query tree and the document trees, only a few approaches take advantage of the algorithms and methods proposed in graph theory. In this paper, we study the theoretical approaches proposed in the literature for tree matching and examine how these approaches have been adapted to XML querying and retrieval, from both an exact and an approximate matching perspective. This study allows us to highlight theoretical aspects of graph theory that have not yet been explored in XML retrieval.

    Almost Linear Semantic XML Keyword Search


    XA2C: a framework for manipulating XML data

    Purpose - XML has spread beyond the computer science fields and reached other areas such as e-commerce, identification, information storage, instant messaging and others. Data communicated over these domains are now mainly based on XML. Thus, allowing non-expert programmers to manipulate and control their XML data is essential. The purpose of this paper is to present the XA2C framework, intended for both non-expert and expert programmers, which provides them with means to write/draw their XML data manipulation operations. Design/methodology/approach - In the literature, this issue has been dealt with from two perspectives: first, XML alteration/adaptation techniques, which require a certain level of expertise to implement and are not yet unified; and second, Mashups, which are not yet formally defined and are not specific to XML data, and XML-oriented visual languages, which are based mainly on structural transformations and data extraction and do not allow XML textual data manipulations. The paper discusses existing approaches and presents the XA2C framework. Findings - The framework is defined based on the dataflow paradigm (visual diagram compositions) while taking advantage of both Mashups and XML-oriented visual languages by defining a well-founded modular architecture and an XML-oriented visual functional composition language based on colored Petri nets allowing functional compositions. The framework takes advantage of existing XML alteration/adaptation techniques by defining them as XML-oriented manipulation functions. A prototype called XA2C is developed and presented here for testing and validating the authors' approach. Originality/value - This paper presents a detailed description of an XML-oriented manipulation framework implementing the XML-oriented composition definition language.

    XSDF: A System for XML Semantic Disambiguation

    This paper briefly describes and evaluates XSDF, a new XML Semantic Disambiguation Framework that takes as input an XML document and a general-purpose semantic network, and produces as output a semantically augmented XML tree made of unambiguous semantic concepts. Experiments demonstrate the effectiveness of XSDF in comparison with alternative methods.