XML Schema Clustering with Semantic and Hierarchical Similarity Measures
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information from them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual similarity of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.
A Progressive Clustering Algorithm to Group the XML Data by Structural and Semantic Similarity
Since XML became popular for data representation and exchange over the Web, the distribution of XML documents has rapidly increased. It has become a challenge for researchers to turn these documents into a more useful information utility. In this paper, we introduce a novel clustering algorithm, PCXSS, that places heterogeneous XML documents into groups according to their similar structural and semantic representations. We develop a global criterion function, CPSim, that progressively measures the similarity between an XML document and existing clusters, avoiding the need to compute the similarity between two individual documents. The experimental analysis shows the method to be fast and accurate.
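The progressive idea in this abstract, comparing each incoming document against existing clusters rather than against every other document, can be illustrated with a minimal sketch. This is not PCXSS or CPSim: the abstract does not define either, so the sketch assumes a simple stand-in in which a document is represented by its set of root-to-node tag paths and document-to-cluster similarity is the Jaccard index. The names `tag_paths`, `jaccard`, `progressive_cluster`, and the `threshold` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths found in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def jaccard(a, b):
    """Jaccard index of two sets (assumed stand-in, not the paper's CPSim)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def progressive_cluster(docs, threshold=0.5):
    """Assign each document to the most similar existing cluster, or open a new one.

    Each cluster keeps the union of its members' tag paths, so a new document
    is compared once per cluster instead of once per previously seen document.
    """
    clusters = []  # each entry: {"paths": set of tag paths, "members": list of indices}
    for index, doc in enumerate(docs):
        paths = tag_paths(doc)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(paths, cluster["paths"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["paths"] |= paths
            best["members"].append(index)
        else:
            clusters.append({"paths": set(paths), "members": [index]})
    return [cluster["members"] for cluster in clusters]
```

With two book-like documents and one order-like document, the sketch places the first two in one cluster and the third in its own. A semantic component, such as matching synonymous tag names, would have to be added on top of this purely structural comparison.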
Hierarchical mutual information for the comparison of hierarchical community structures in complex networks
The quest for a quantitative characterization of community and modular
structure of complex networks produced a variety of methods and algorithms to
classify different networks. However, it is not clear if such methods provide
consistent, robust and meaningful results when considering hierarchies as a
whole. Part of the problem is the lack of a similarity measure for the
comparison of hierarchical community structures. In this work we give a
contribution by introducing the hierarchical mutual information, which is
a generalization of the traditional mutual information and allows the comparison of
hierarchical partitions and hierarchical community structures. The
normalized version of the hierarchical mutual information should behave
analogously to the traditional normalized mutual information. Here, the correct
behavior of the hierarchical mutual information is corroborated on an extensive
battery of numerical experiments. The experiments are performed on artificial
hierarchies, and on the hierarchical community structure of artificial and
empirical networks. Furthermore, the experiments illustrate some of the
practical applications of the hierarchical mutual information, namely the
comparison of different community detection methods and the study of the
consistency, robustness and temporal evolution of the hierarchical modular
structure of networks. Comment: 14 pages and 12 figures
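For orientation, the traditional (flat) normalized mutual information that this abstract generalizes can be sketched as follows. This is not the hierarchical measure itself, which the abstract does not define. The sketch assumes two partitions given as label lists over the same items and assumes normalization by the arithmetic mean of the two entropies, one of several conventions in use; the function names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a partition given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two partitions of the same items."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (n_xy / n) * log(n_xy * n / (count_a[x] * count_b[y]))
        for (x, y), n_xy in joint.items()
    )


def normalized_mutual_information(a, b):
    """NMI = 2 I(A;B) / (H(A) + H(B)); this normalization is an assumed convention."""
    if len(a) != len(b):
        raise ValueError("partitions must label the same items")
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0:
        return 1.0  # both partitions are a single block, hence identical
    return 2 * mutual_information(a, b) / (h_a + h_b)
```

Two partitions that differ only by a relabelling of the blocks score 1, and two independent partitions score 0. The abstract states that the normalized hierarchical version should behave analogously.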
Measuring the similarity of PML documents with RFID-based sensors
The Electronic Product Code (EPC) Network is an important part of the
Internet of Things. The Physical Mark-Up Language (PML) is used to represent and
describe data related to objects in the EPC Network. The PML documents that each
component exchanges in an EPC Network system are XML documents based on the PML
Core schema. To manage the huge number of PML documents for tags captured
by radio frequency identification (RFID) readers, it is necessary to develop
high-performance technology for tasks such as filtering and integrating these tag
data. In this paper, we propose an approach for measuring the similarity of
PML documents based on a Bayesian Network of several sensors. With respect to the
features of PML, when measuring the similarity we first remove
redundant data other than the EPC information. On the basis of this, the Bayesian
Network model derived from the structure of the PML documents being compared is
constructed. Comment: International Journal of Ad Hoc and Ubiquitous Computing