24 research outputs found

    Structure and content semantic similarity detection of eXtensible markup language documents using keys

    XML (eXtensible Mark-up Language) has become the fundamental standard for efficient data management and exchange. Because XML is so widely used for describing and exchanging data on the web, XML-based comparison is a central issue in database management and information retrieval. Although many heterogeneous XML sources have similar content, they may be described using different tag names and structures. This work proposes a series of algorithms for detecting structural and content changes among XML data. The first is an algorithm called XDoI (XML Data Integration Based on Content and Structure Similarity Using Keys) that clusters XML documents into subtrees using leaf-node parents as clustering points. This algorithm matches subtrees using the key concept and compares unmatched subtrees for similarities in both content and structure. The experimental results show that this approach finds much more accurate matches with or without the presence of keys in the subtrees. A second algorithm proposed here is called XDI-CSSK (a system for detecting XML similarity in content and structure using a relational database); it eliminates unnecessary clustering points using instance statistics and a taxonomic analyzer. Because fewer subtrees need to be compared, the overall execution time is reduced dramatically. Semantic similarity plays a crucial role in precise computational similarity measures. A third algorithm, called XML-SIM (structure and content semantic similarity detection using keys), builds on this previous work to detect XML semantic similarity based on structure and content. This algorithm is an improvement over XDI-CSSK and XDoI in that it determines content similarity based on semantic structural similarity. In an experimental evaluation, it outperformed previous approaches in terms of both execution time and false positive rates.
Information changes periodically; therefore, it is important to be able to detect changes among different versions of an XML document and use that information to identify semantic similarities. Finally, this work introduces an approach to detect XML similarity and thus to join XML document versions using a change detection mechanism. In this approach, subtree keys still play an important role in avoiding unnecessary subtree comparisons across multiple versions of the same document. Real data sets from bibliographic domains demonstrate the effectiveness of all these algorithms (Abstract, pages iv-v).
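
The key-based subtree matching described in this abstract can be illustrated with a minimal sketch. This is not the XDoI implementation: the function names, the choice of Jaccard overlap for unmatched subtrees, and the idea of passing the key as a single tag name (`key_tag`) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def leaf_parent_subtrees(root):
    """Return every element that has at least one leaf child (a child with no children)."""
    return [el for el in root.iter() if any(len(child) == 0 for child in el)]


def leaf_values(subtree):
    """Map each direct leaf child's tag to its stripped text."""
    return {c.tag: (c.text or "").strip() for c in subtree if len(c) == 0}


def match_by_key(doc_a, doc_b, key_tag):
    """Pair subtrees of doc_a and doc_b whose key leaf (hypothetical key_tag) has equal text."""
    index_b = {}
    for st in leaf_parent_subtrees(doc_b):
        key = leaf_values(st).get(key_tag)
        if key:
            index_b[key] = st
    matched, unmatched = [], []
    for st in leaf_parent_subtrees(doc_a):
        key = leaf_values(st).get(key_tag)
        if key and key in index_b:
            matched.append((st, index_b[key]))
        else:
            unmatched.append(st)
    return matched, unmatched


def content_similarity(a, b):
    """Jaccard overlap of leaf text values; an assumed stand-in for a content measure."""
    sa, sb = set(leaf_values(a).values()), set(leaf_values(b).values())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0
```

Subtrees left in `unmatched` would then be compared pairwise with a measure such as `content_similarity`, which ignores tag names and so tolerates sources that label the same data differently.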

    Tree mining application to matching of heterogeneous knowledge

    Matching of heterogeneous knowledge sources is of increasing importance in areas such as scientific knowledge management, e-commerce, enterprise application integration, and many emerging Semantic Web applications. Because these fields aim to share and reuse knowledge, knowledge from different organizations within the same domain often has to be matched. We propose a knowledge matching method based on our previously developed tree mining algorithms for extracting frequently occurring subtrees from a tree-structured database such as XML. Using the method, the common structure among the different representations can be extracted automatically. Our focus is on knowledge matching at the structural level, and we use a set of example XML schema documents from the same domain to evaluate the method. We discuss some important issues that arise when applying tree mining algorithms to the detection of common document structures. The experiments demonstrate the usefulness of the approach.

    A first exploration of an inductive analysis approach for detecting learning design patterns

    Please cite as: Francis Brouns, Rob Koper, Jocelyn Manderveld, Jan van Bruggen, Peter Sloep, Peter van Rosmalen, Colin Tattersall and Hubert Vogten (2005). A first exploration of an inductive analysis approach for detecting learning design patterns. Journal of Interactive Media in Education (Advances in Learning Design. Special Issue, eds. Colin Tattersall, Rob Koper), 2005/03. ISSN:1365-893X [http://jime.open.ac.uk/2005/03] One way to develop effective online courses is the use of learning design patterns, since patterns capture successful solutions. Pedagogical patterns are commonly created by human cognitive processing in "writer's workshops". We explore two ideas: first, whether IMS Learning Design is suitable for detecting patterns in existing courses, and second, whether the use of inductive analyses is a suitable approach. We expect patterns to occur in the method section of a learning design, because here the process of teaching and learning is defined. We provide some suggestions for inductive techniques that could be applied to existing learning designs in order to detect patterns and discuss how the patterns could be used to create new learning designs. None of the suggested approaches has been validated yet; they are intended as input for the ongoing discussion on patterns.

    Learning Design Patterns: Exploring an inductive analysis approach

    Preprint of article submitted to the joint Unfold/Prolearn Workshop, September 2005; to be published in a Special Issue on Learning Design of the IEEE journal Educational Technology & Society. Learning design patterns assist the development of effective courses, because patterns capture successful solutions. Pedagogical patterns are commonly created by human cognitive processing in "writer's workshops". Inductive techniques could be used to detect or determine patterns in existing data, or learning designs. This assumes that the learning designs are available in a machine-interpretable format. The IMS Learning Design specification enables the formal coding of learning designs. We explain that we expect patterns to occur in the method section of a learning design, and in particular in acts. We explore several inductive techniques that could be applied to existing learning designs in order to detect and determine patterns, and discuss how these could be applied to create new learning designs.

    Approximate Matching of Hierarchical Data


    A first exploration of an inductive analysis approach for detecting learning design patterns

    Commentary on: Chapter 1: An Introduction to Learning Design. (Koper, 2005) Abstract: One way to develop effective online courses is the use of learning design patterns, since patterns capture successful solutions. Pedagogical patterns are commonly created by human cognitive processing in "writer's workshops". We explore two ideas: first, whether IMS Learning Design is suitable for detecting patterns in existing courses, and second, whether the use of inductive analyses is a suitable approach. We expect patterns to occur in the method section of a learning design, because here the process of teaching and learning is defined. We provide some suggestions for inductive techniques that could be applied to existing learning designs in order to detect patterns and discuss how the patterns could be used to create new learning designs. None of the suggested approaches has been validated yet; they are intended as input for the ongoing discussion on patterns. Editors: Colin Tattersall and Rob Koper

    Web service searching

    With the growing number of Web services, it is no longer adequate to locate a Web service by searching for its name or browsing a UDDI directory. An efficient Web service discovery mechanism is necessary for locating and selecting the required Web services. The search mechanism should be based on the Web service description rather than on keywords. In this work, we introduce a Web service searching prototype that can locate Web services by comparing all available information encoded in the Web service description, such as operation names, input and output types, the structure of the underlying XML schema, and the semantics of element names. Our approach combines information-retrieval techniques, a weighted bipartite graph matching algorithm, and a tree-matching algorithm. Given a query, represented as a set of keywords, a Web service description, or an operation description, an information retrieval technique is used to rank the candidate Web services based on their text-based similarity to the query. The ranked result can be further refined by computing structural similarity. (Abstract shortened by UMI.) Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .J34. Source: Masters Abstracts International, Volume: 44-03, page: 1403. Thesis (M.Sc.)--University of Windsor (Canada), 2005.
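
The first stage described in this abstract, ranking candidate services by text similarity to a keyword query, might look roughly like the sketch below. The abstract does not name the specific information-retrieval technique, so plain term-frequency cosine similarity is assumed here; the function names and the sample service descriptions used to exercise it are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_services(query, descriptions):
    """Return service names ordered by similarity to the query, dropping zero scores.

    `descriptions` maps a (hypothetical) service name to its free-text description.
    """
    q = Counter(tokens(query))
    scored = [(cosine(q, Counter(tokens(text))), name)
              for name, text in descriptions.items()]
    scored.sort(key=lambda pair: -pair[0])
    return [name for score, name in scored if score > 0]
```

A structural refinement step, such as the tree matching over the underlying XML schema that the abstract mentions, would then reorder only the candidates that survive this text-based filter.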