
    GREASE: A Generative Model for Relevance Search over Knowledge Graphs

    Relevance search aims to find top-ranked entities in a knowledge graph (KG) that are relevant to a query entity. Relevance is ambiguous, particularly over a schema-rich KG like DBpedia, which supports a wide range of relevance semantics based on numerous types of relations and attributes. As users may lack the expertise to formalize the desired semantics, supervised methods have emerged to learn the hidden user-defined relevance from user-provided examples. Along this line, in this paper we propose a novel generative model over KGs for relevance search, named GREASE. The model applies to meta-path based relevance, where a meta-path characterizes a particular type of semantics relating the query entity to answer entities. It is also extended to support properties that constrain answer entities. Extensive experiments on two large-scale KGs demonstrate that GREASE has advanced the state of the art in effectiveness, expressiveness, and efficiency. Comment: 9 pages, accepted to WSDM 202

    Top-k Similarity Join in Heterogeneous Information Networks


    A framework for dynamic heterogeneous information networks change discovery based on knowledge engineering and data mining methods

    Information networks are collections of data structures used to model interactions in social and living phenomena. They can be homogeneous or heterogeneous, and static or dynamic, depending on the type and nature of the relations between the network entities. Static homogeneous and heterogeneous networks have been widely studied in data mining, but recently there has been renewed interest in the analysis of dynamic heterogeneous information networks (DHIN), because rich temporal, structural and semantic information is hidden in this kind of network. The heterogeneity and dynamicity of real-time networks offer many prospects as well as many challenges for data mining. Substantial research has been undertaken on the exploration of entities and the identification of their links in heterogeneous networks. However, work on the formal construction and change mining of heterogeneous information networks is still in its infancy, owing to their complex structure and rich semantics. Researchers have previously used cluster-based methods and frequent pattern-mining techniques for change discovery in dynamic heterogeneous networks. These methods work only on small datasets, provide only structural change discovery, and do not support fast, parallel processing of big data. A further problem is that cluster-based approaches provide the structural changes, while pattern mining provides the semantic characteristics of changes in a dynamic network.
Another interesting but challenging problem that past studies have not considered is extracting knowledge from these semantically richer networks based on user-specific constraints. This study aims to develop a new change mining system, ChaMining, to investigate dynamic heterogeneous network data, using knowledge engineering with semantic web technologies and data mining to overcome the problems of previous techniques. This system and approach are important in academia as well as in real-life applications, where they support decision-making based on temporal network data patterns. This research has designed a novel framework, “ChaMining”, to (i) find relational patterns in dynamic networks locally and globally by employing domain ontologies; (ii) extract knowledge from these semantically richer networks based on user-specific (meta-path) constraints; (iii) cluster the relational data patterns based on structural properties of nodes in the dynamic network; and (iv) develop a hybrid approach using knowledge engineering, temporal rule mining and clustering to detect changes in dynamic heterogeneous networks. The evidence presented in this research shows that the proposed framework and methods work very efficiently on the benchmark big dynamic heterogeneous datasets. The empirical results can contribute to a better understanding of the rich semantics of DHIN and how to mine them using the proposed hybrid approach. The proposed framework has been evaluated against six previous dynamic change detection algorithms or frameworks, and it performs very well in detecting microscopic as well as macroscopic human-understandable changes. The number of change patterns extracted with this approach was higher than with the previous approaches, which helps to reduce information loss.