Search CORE

20 research outputs found

Learning Representations of Bi-level Knowledge Graphs for Reasoning beyond Link Prediction

Author: Chung Chanyoung
Whang Joyce Jiyoung
Publication venue
Publication date: 21/03/2023
Field of study

Knowledge graphs represent known facts using triplets. While existing knowledge graph embedding methods only consider the connections between entities, we propose considering the relationships between triplets. For example, let us consider two triplets

T_1

and

T_2

where

T_1

is (Academy_Awards, Nominates, Avatar) and

T_2

is (Avatar, Wins, Academy_Awards). Given these two base-level triplets, we see that

T_1

is a prerequisite for

T_2

. In this paper, we define a higher-level triplet to represent a relationship between triplets, e.g.,

\langle T_1

, PrerequisiteFor,

T_2\rangle

where PrerequisiteFor is a higher-level relation. We define a bi-level knowledge graph that consists of the base-level and the higher-level triplets. We also propose a data augmentation strategy based on the random walks on the bi-level knowledge graph to augment plausible triplets. Our model called BiVE learns embeddings by taking into account the structures of the base-level and the higher-level triplets, with additional consideration of the augmented triplets. We propose two new tasks: triplet prediction and conditional link prediction. Given a triplet

T_1

and a higher-level relation, the triplet prediction predicts a triplet that is likely to be connected to

T_1

by the higher-level relation, e.g.,

\langle T_1

, PrerequisiteFor, ?

\rangle

. The conditional link prediction predicts a missing entity in a triplet conditioned on another triplet, e.g.,

\langle T_1

, PrerequisiteFor, (Avatar, Wins, ?)

\rangle

. Experimental results show that BiVE significantly outperforms all other methods in the two new tasks and the typical base-level link prediction in real-world bi-level knowledge graphs.Comment: 14 pages, 3 figures, 15 tables. 37th AAAI Conference on Artificial Intelligence (AAAI 2023

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Recommended from our members

Overlapping community detection in massive social networks

Author: Whang Joyce Jiyoung
Publication venue
Publication date: 11/02/2016
Field of study

Massive social networks have become increasingly popular in recent years. Community detection is one of the most important techniques for the analysis of such complex networks. A community is a set of cohesive vertices that has more connections inside the set than outside. In many social and information networks, these communities naturally overlap. For instance, in a social network, each vertex in a graph corresponds to an individual who usually participates in multiple communities. In this thesis, we propose scalable overlapping community detection algorithms that effectively identify high quality overlapping communities in various real-world networks. We first develop an efficient overlapping community detection algorithm using a seed set expansion approach. The key idea of this algorithm is to find good seeds and then greedily expand these seeds using a personalized PageRank clustering scheme. Experimental results show that our algorithm significantly outperforms other state-of-the-art overlapping community detection methods in terms of run time, cohesiveness of communities, and ground-truth accuracy. To develop more principled methods, we formulate the overlapping community detection problem as a non-exhaustive, overlapping graph clustering problem where clusters are allowed to overlap with each other, and some nodes are allowed to be outside of any cluster. To tackle this non-exhaustive, overlapping clustering problem, we propose a simple and intuitive objective function that captures the issues of overlap and non-exhaustiveness in a unified manner. To optimize the objective, we develop not only fast iterative algorithms but also more sophisticated algorithms using a low-rank semidefinite programming technique. Our experimental results show that the new objective and the algorithms are effective in finding ground-truth clusterings that have varied overlap and non-exhaustiveness. We extend our non-exhaustive, overlapping clustering techniques to co-clustering where the goal is to simultaneously identify a clustering of the rows as well as the columns of a data matrix. As an example application, consider recommender systems where users have ratings on items. This can be represented by a bipartite graph where users and items are denoted by two different types of nodes, and the ratings are denoted by weighted edges between the users and the items. In this case, co-clustering would be a simultaneous clustering of users and items. We propose a new co-clustering objective function and an efficient co-clustering algorithm that is able to identify overlapping clusters as well as outliers on both types of the nodes in the bipartite graph. We show that our co-clustering algorithm is able to effectively capture the underlying co-clustering structure of the data, which results in boosting the performance of a standard one-dimensional clustering. Finally, we study the design of parallel data-driven algorithms, which enables us to further increase the scalability of our overlapping community detection algorithms. Using PageRank as a model problem, we look at three algorithm design axes: work activation, data access pattern, and scheduling. We investigate the impact of different algorithm design choices. Using these design axes, we design and test a variety of PageRank implementations finding that data-driven, push-based algorithms are able to achieve a significantly superior scalability than standard PageRank implementations. The design choices affect both single-threaded performance as well as parallel scalability. The lessons learned from this study not only guide efficient implementations of many graph mining algorithms but also provide a framework for designing new scalable algorithms, especially for large-scale community detection.Computer Science

Texas ScholarWorks

InGram: Inductive Knowledge Graph Embedding via Relation Graphs

Author: Chung Chanyoung
Lee Jaejun
Whang Joyce Jiyoung
Publication venue
Publication date: 01/06/2023
Field of study

Inductive knowledge graph completion has been considered as the task of predicting missing triplets between new entities that are not observed during training. While most inductive knowledge graph completion methods assume that all entities can be new, they do not allow new relations to appear at inference time. This restriction prohibits the existing methods from appropriately handling real-world knowledge graphs where new entities accompany new relations. In this paper, we propose an INductive knowledge GRAph eMbedding method, InGram, that can generate embeddings of new relations as well as new entities at inference time. Given a knowledge graph, we define a relation graph as a weighted graph consisting of relations and the affinity weights between them. Based on the relation graph and the original knowledge graph, InGram learns how to aggregate neighboring embeddings to generate relation and entity embeddings using an attention mechanism. Experimental results show that InGram outperforms 14 different state-of-the-art methods on varied inductive learning scenarios.Comment: 14 pages, 4 figures, 6 tables, 40th International Conference on Machine Learning (ICML 2023

arXiv.org e-Print Archive

Representation Learning on Hyper-Relational and Numeric Knowledge Graphs with Transformers

Author: Chung Chanyoung
Lee Jaejun
Whang Joyce Jiyoung
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/06/2023
Field of study

A hyper-relational knowledge graph has been recently studied where a triplet is associated with a set of qualifiers; a qualifier is composed of a relation and an entity, providing auxiliary information for a triplet. While existing hyper-relational knowledge graph embedding methods assume that the entities are discrete objects, some information should be represented using numeric values, e.g., (J.R.R., was born in, 1892). Also, a triplet (J.R.R., educated at, Oxford Univ.) can be associated with a qualifier such as (start time, 1911). In this paper, we propose a unified framework named HyNT that learns representations of a hyper-relational knowledge graph containing numeric literals in either triplets or qualifiers. We define a context transformer and a prediction transformer to learn the representations based not only on the correlations between a triplet and its qualifiers but also on the numeric information. By learning compact representations of triplets and qualifiers and feeding them into the transformers, we reduce the computation cost of using transformers. Using HyNT, we can predict missing numeric values in addition to missing entities or relations in a hyper-relational knowledge graph. Experimental results show that HyNT significantly outperforms state-of-the-art methods on real-world datasets.Comment: 11 pages, 5 figures, 12 tables. 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2023

arXiv.org e-Print Archive

Localized Ranking in Social and Information Networks

Author: Joyce Jiyoung WHANG
Yunseob SHIN
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'
Publication date: 01/01/2018
Field of study

Crossref

HiddenCPG: Large-Scale Vulnerable Clone Detection Using Subgraph Isomorphism of Code Property Graphs

Author: Son Sooel
Whang Joyce Jiyoung
Wi Seongil
Woo Sijae
Publication venue: Association for Computing Machinery
Publication date: 25/04/2022
Field of study

ScholarWorks@UNIST

Scalable and Memory-Efficient Clustering of Large-Scale Social Networks

Author: Inderjit S. Dhillon
Joyce Jiyoung Whang
Xin Sui
Publication venue
Publication date
Field of study

Abstract—Clustering of social networks is an important task for their analysis; however, most existing algorithms do not scale to the massive size of today’s social networks. A popular class of graph clustering algorithms for large-scale networks, such as PMetis, KMetis and Graclus, is based on a multilevel framework. Generally, these multilevel algorithms work reasonably well on networks with a few million vertices. However, when the network size increases to the scale of 10 million vertices or greater, the performance of these algorithms rapidly degrades. Furthermore, an inherent property of social networks, the power law degree distribution, makes these algorithms infeasible to apply to large-scale social networks. In this paper, we propose a scalable and memory-efficient clustering algorithm for large-scale social networks. We name our algorithm GEM, by mixing two key concepts of the algorithm, Graph Extraction and weighted kernel k-Means. GEM efficiently extracts a good skeleton graph from the original graph, and propagates the clustering result of the extracted graph to the rest of the network. Experimental results show that GEM produces clusters of quality comparable to or better than existing state-of-the-art graph clustering algorithms, while it is much faster and consumes much less memory. Furthermore, the parallel implementation of GEM, called PGEM, not only produces higher quality of clusters but also achieves much better scalability than most current parallel graph clustering algorithms. Keywords-clustering; social networks; graph clustering; scalable computing; graph partitioning; kernel k-means; I

CiteSeerX

Crossref

Overlapping Community Detection Using Seed Set Expansion

Author: David F. Gleich
Inderjit S. Dhillon
Joyce Jiyoung Whang
Publication venue
Publication date: 01/01/2013
Field of study

Community detection is an important task in network analysis. A community (also referred to as a cluster) is a set of cohesive vertices that have more connections inside the set than outside. In many social and information networks, these communities naturally overlap. For instance, in a social network, each vertex in a graph corresponds to an individual who usually participates in multiple communities. One of the most successful techniques for finding overlapping communities is based on local optimization and expansion of a community metric around a seed set of vertices. In this paper, we propose an efficient overlapping community detection algorithm using a seed set expansion approach. In particular, we develop new seeding strategies for a personalized PageRank scheme that optimizes the conductance community score. The key idea of our algorithm is to find good seeds, and then expand these seed sets using the personalized PageRank clustering procedure. Experimental results show that this seed set expansion approach outperforms other state-of-the-art overlapping community detection methods. We also show that our new seeding strategies are better than previous strategies, and are thus effective in finding good overlapping clusters in a graph

CiteSeerX

Crossref