Structural Deep Embedding for Hyper-Networks
Network embedding has recently attracted much attention in data mining.
Existing network embedding methods mainly focus on networks with pairwise
relationships. In the real world, however, the relationships among data points
could go beyond pairwise, i.e., three or more objects are involved in each
relationship represented by a hyperedge, thus forming hyper-networks. These
hyper-networks pose great challenges to existing network embedding methods when
the hyperedges are indecomposable, that is to say, any subset of nodes in a
hyperedge cannot form another hyperedge. These indecomposable hyperedges are
especially common in heterogeneous networks. In this paper, we propose a novel
Deep Hyper-Network Embedding (DHNE) model to embed hyper-networks with
indecomposable hyperedges. More specifically, we theoretically prove that any
linear similarity metric in embedding space commonly used in existing methods
cannot maintain the indecomposability property in hyper-networks, and thus
propose a new deep model to realize a non-linear tuplewise similarity function
while preserving both local and global proximities in the formed embedding
space. We conduct extensive experiments on four different types of
hyper-networks, including a GPS network, an online social network, a drug
network and a semantic network. The empirical results demonstrate that our
method can significantly and consistently outperform the state-of-the-art
algorithms. Comment: Accepted by AAAI 1
Helmholtz Portfolio Theme Large-Scale Data Management and Analysis (LSDMA)
The Helmholtz Association funded the "Large-Scale Data Management and Analysis" portfolio theme from 2012 to 2016. Four Helmholtz centres, six universities and another research institution in Germany joined to enable data-intensive science by optimising data life cycles in selected scientific communities. In our Data Life Cycle Labs, data experts performed joint R&D together with scientific communities. The Data Services Integration Team focused on generic solutions applied by several communities.
Advances in knowledge discovery and data mining Part II
19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II
COMMUNITY DETECTION IN GRAPHS
Thesis (Ph.D.) - Indiana University, Luddy School of Informatics, Computing, and Engineering/University Graduate School, 2020. Community detection has always been one of the fundamental research topics in graph mining. As an unsupervised or semi-supervised approach, community detection aims to explore high-order closeness among nodes by leveraging the topological structure of the graph. By grouping similar nodes or edges into the same community while separating dissimilar ones into different communities, graph structure can be revealed at a coarser resolution. This can benefit numerous applications such as shopping recommendation and advertising in e-commerce, protein-protein interaction prediction in bioinformatics, and literature recommendation or scholar collaboration in citation
analysis. However, identifying communities is an ill-defined problem. Due to the No Free Lunch theorem [1], there is neither a gold standard that represents a perfect community partition nor a universal method able to detect satisfactory communities for all tasks across various types of graphs. To give a global view of this research topic, I summarize state-of-the-art community detection methods by categorizing them based on graph types, research tasks and methodology frameworks. As academic exploration of community detection has grown rapidly in recent years, I focus particularly on state-of-the-art works published in the last decade, which may leave out some classic models published decades ago. Meanwhile, three subtle community detection tasks are proposed and assessed in this dissertation as well. First, unlike general models, which consider only graph structure, personalized community detection uses user need as auxiliary information to guide community detection. The result is fine-grained communities for the nodes that better match user needs and coarser-resolution communities for the remaining, less relevant nodes. Second, graphs often suffer from sparse connectivity. Applying conventional models directly to such graphs may severely degrade the quality of the generated communities. To tackle this problem, cross-graph techniques are used to propagate external graph information in support of community detection on the target graph. Third, graph community structure supports a natural language processing (NLP) task that depicts intrinsic node characteristics by generating node summaries via a text generative model. The contribution of this dissertation is threefold. First, a substantial body of research is reviewed and summarized under a well-defined taxonomy. Existing works on methods, evaluation and applications are all addressed in the literature review.
Second, three novel community detection tasks are demonstrated, and the associated models are proposed and evaluated against state-of-the-art baselines on various datasets. Third, the limitations of current works are pointed out and promising future research directions are discussed.