
    Cohesive subgraph identification in large graphs

    Graph data is ubiquitous in real-world applications, as the relationships among entities can be naturally captured by the graph model. Finding cohesive subgraphs is a fundamental problem in graph mining with diverse applications. Given the important role of cohesive subgraphs, this thesis focuses on cohesive subgraph identification in large graphs. Firstly, we study the size-bounded community search problem, which aims to find a subgraph with the largest min-degree among all connected subgraphs that contain the query vertex q and have at least l and at most h vertices, where q, l, and h are specified by the query. As the problem is NP-hard, we propose a branch-reduce-and-bound algorithm, SC-BRB, by developing nontrivial reducing, upper-bounding, and branching techniques. Secondly, we formulate the notion of a similar-biclique in bipartite graphs, a special kind of biclique in which all vertices from a designated side are similar to each other, and aim to enumerate all maximal similar-bicliques. We propose a backtracking algorithm, MSBE, that directly enumerates maximal similar-bicliques, and strengthen it with vertex reduction and optimization techniques. In addition, we design a novel index structure to speed up a time-critical operation of MSBE as well as vertex reduction, and we develop efficient index construction algorithms. Thirdly, we consider balanced cliques in signed graphs (a clique is balanced if its vertex set can be partitioned into CL and CR such that all negative edges are between CL and CR) and study the problem of maximum balanced clique computation. We propose techniques to transform the maximum balanced clique problem over G into a series of maximum dichromatic clique problems over small subgraphs of G. The transformation not only removes edge signs but also sparsifies the edge set
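The size-bounded community search problem stated in this abstract can be made concrete with a small brute-force sketch. This is not SC-BRB: it simply enumerates every vertex subset of an allowed size, which is only feasible for tiny graphs. The function names and the adjacency-dictionary representation are assumptions made for illustration.

```python
from itertools import combinations

def min_degree(adj, vertices):
    """Minimum degree inside the subgraph induced by `vertices`."""
    vs = set(vertices)
    return min(len(adj[v] & vs) for v in vs)

def is_connected(adj, vertices):
    """Check connectivity of the induced subgraph with a depth-first search."""
    vs = set(vertices)
    start = next(iter(vs))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & vs:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vs

def brute_force_size_bounded_search(adj, q, l, h):
    """Return (vertex set, min-degree) maximizing min-degree over connected
    subgraphs that contain q and have between l and h vertices.
    Exponential time: a problem illustration, not the thesis's algorithm."""
    others = sorted(v for v in adj if v != q)
    best, best_deg = None, -1
    for size in range(max(l, 1), h + 1):
        for combo in combinations(others, size - 1):
            cand = {q, *combo}
            if not is_connected(adj, cand):
                continue
            d = min_degree(adj, cand)
            if d > best_deg:  # ties keep the first (smallest) candidate found
                best, best_deg = cand, d
    return best, best_deg

# Hypothetical toy graph: a triangle 0-1-2 with a tail 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
print(brute_force_size_bounded_search(adj, q=0, l=2, h=4))
```

On the toy graph the triangle wins, because adding the tail vertex 3 would lower the minimum degree back to 1.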

    Maximal Multipolarized Cliques Search in Signed Networks

    This is the author accepted manuscript. The final version is available from ACM via the DOI in this record. Increasing group polarization on social media seriously harms the health of public discourse and information dissemination. Detecting polarized structures in signed networks is therefore a well-motivated way to study group polarization on social media. However, most studies restrict the number of polarized structures to two and neglect the real-world scenario in which signed networks contain multiple polarized structures, which makes the two-group restriction an unreasonable assumption. To overcome the limitations of existing work, we present a novel cohesive subgraph model based on structural clusterable theory, named maximal multipolarized clique (MMC), which can be partitioned into k polarized subcliques such that the edges within subcliques are positive and the edges between subcliques are negative. We formulate the problem of Maximal Multipolarized Cliques Search (MMCS) in signed networks and prove that it is NP-hard. To address this problem, we first devise powerful pruning rules that significantly reduce the signed network, and then develop an efficient algorithm to search for all maximal multipolarized cliques in the reduced signed network. Experimental results on real-world signed networks demonstrate the efficiency and effectiveness of our algorithm. Fundamental Research Funds for the Central Universitie
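The multipolarized clique condition described above (positive edges inside subcliques, negative edges between them) can be checked directly on a candidate vertex set. The following is a minimal sketch of such a check, not the paper's search algorithm; the function name and the `frozenset`-keyed sign dictionary are assumptions made for illustration.

```python
from itertools import combinations

def multipolarized_groups(vertices, sign):
    """Return the partition into polarized subcliques, or None.

    `sign` maps frozenset({u, v}) to +1 or -1. The result is None when the
    vertices are not a clique (some pair has no edge) or when the signs are
    inconsistent with any partition of the required form.
    """
    groups = []
    for v in vertices:
        # Join the first group whose representative has a positive edge to v.
        for g in groups:
            if sign.get(frozenset((v, g[0]))) == 1:
                g.append(v)
                break
        else:
            groups.append([v])
    group_of = {v: i for i, g in enumerate(groups) for v in g}
    # Verify every pair: positive if and only if both ends share a group.
    for u, v in combinations(vertices, 2):
        s = sign.get(frozenset((u, v)))
        if s is None:
            return None
        if (s == 1) != (group_of[u] == group_of[v]):
            return None
    return groups

# Hypothetical example with three subcliques: {a, b}, {c, d}, {e}.
verts = ["a", "b", "c", "d", "e"]
sign = {frozenset(p): -1 for p in combinations(verts, 2)}
sign[frozenset(("a", "b"))] = 1
sign[frozenset(("c", "d"))] = 1
print(multipolarized_groups(verts, sign))
```

A caller interested in a specific k can compare `len(groups)` against it; the partition, when it exists, is unique because the positive edges fully determine which vertices belong together.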

    Quantum Algorithm for Maximum Biclique Problem

    Identifying a biclique with the maximum number of edges has considerable implications for numerous fields of application, such as detecting anomalies in E-commerce transactions, discerning protein-protein interactions in biology, and refining the efficacy of social network recommendation algorithms. However, the inherent NP-hardness of this problem significantly complicates the matter. The prohibitive time complexity of existing algorithms is the primary bottleneck constraining the application scenarios. To address this challenge, we present an unprecedented exploration of a quantum computing approach. Efficient quantum algorithms, a crucial future direction for handling NP-hard problems, are under intensive investigation, and their potential has already been demonstrated in practical areas such as cybersecurity. However, little work has been done on quantum algorithms for graph databases, owing to the challenges of representing complex graph topologies on a quantum computer. In this study, we examine how to encode a bipartite graph on a quantum computer. Given a bipartite graph with n vertices, we propose a ground-breaking algorithm qMBS with time complexity O^*(2^(n/2)), a quadratic speed-up in terms of complexity compared to the state-of-the-art. Furthermore, we detail two variants tailored for the maximum vertex biclique problem and the maximum balanced biclique problem. To corroborate the practical performance and efficacy of our proposed algorithms, we have conducted proof-of-principle experiments on IBM quantum simulators, whose results provide a substantial validation of our approach to the extent possible to date

    On detecting maximal quasi antagonistic communities in signed graphs

    National Research Foundation (NRF) Singapor

    Efficient Enumeration of the Optimal Solutions to the Correlation Clustering problem

    According to structural balance theory, a signed graph is considered structurally balanced when it can be partitioned into a number of modules such that positive edges are located inside the modules and negative edges between them. In practice, however, real-world networks are rarely structurally balanced. In this case, one may want to measure the magnitude of their imbalance and to identify the set of edges causing it. The correlation clustering (CC) problem consists precisely in looking for the signed graph partition with the least imbalance. Recently, it has been shown that the CC problem can have numerous and diverse optimal solutions. Yet this solution space is difficult to explore, as the CC problem is NP-hard and exact approaches do not scale well even when looking for a single optimal solution. To alleviate this issue, we propose an efficient enumeration method that retrieves the complete space of optimal solutions of the CC problem. It combines an exhaustive enumeration strategy with neighborhoods of varying sizes to achieve computational effectiveness. Results obtained for middle-sized networks confirm the usefulness of our method
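To illustrate what the imbalance of a partition means, and why several optimal partitions can coexist, here is a brute-force sketch for very small graphs. It assumes unweighted edges (each misplaced edge costs 1) and enumerates every set partition, so it is not the enumeration method proposed in the paper. All names are hypothetical.

```python
def imbalance(edges, labels):
    """Count misplaced edges: positive edges between modules plus
    negative edges inside a module. `edges` holds (u, v, sign) triples."""
    return sum(1 for u, v, s in edges if (s > 0) != (labels[u] == labels[v]))

def partitions(n):
    """Yield every set partition of n vertices as a tuple of module labels
    (restricted growth strings, so each partition appears exactly once)."""
    def rec(prefix, max_label):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for lab in range(max_label + 2):
            yield from rec(prefix + [lab], max(max_label, lab))
    yield from rec([], -1)

def optimal_partitions(n, edges):
    """Return (minimum imbalance, list of all partitions achieving it)."""
    scored = [(imbalance(edges, p), p) for p in partitions(n)]
    best = min(score for score, _ in scored)
    return best, [p for score, p in scored if score == best]

# Hypothetical unbalanced triangle: two positive edges and one negative edge.
edges = [(0, 1, +1), (1, 2, +1), (0, 2, -1)]
print(optimal_partitions(3, edges))
```

Even this three-vertex example has three distinct optimal partitions with imbalance 1, which is the kind of multiplicity the abstract refers to.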

    Balanced Butterfly Counting in Bipartite-Network

    Bipartite graphs offer a powerful framework for modeling complex relationships between two distinct types of vertices, incorporating probabilistic, temporal, and rating-based information. While the research community has extensively explored various types of bipartite relationships, there has been a notable gap in studying signed bipartite graphs, which capture liking/disliking interactions in real-world networks such as customer-rating-product and senator-vote-bill. Balanced butterflies, which are 2 x 2 bicliques, provide crucial insights into antagonistic groups, balance theory, and fraud detection by leveraging the signed information. However, such applications require counting balanced butterflies, which remains unexplored. In this paper, we propose a new problem: counting balanced butterflies in a signed bipartite graph. To address this problem, we adopt state-of-the-art algorithms for butterfly counting, establishing a smart baseline that reduces the time complexity for solving our specific problem. We further introduce a novel bucket approach specifically designed to count balanced butterflies efficiently, and we propose a parallelized version of the bucketing approach to enhance performance. Extensive experimental studies on nine real-world datasets demonstrate that our proposed bucket-based algorithm is up to 120x faster than the baseline, and that the parallel implementation of the bucket-based algorithm is up to 45x faster than single-core execution. Moreover, a real-world case study showcases the practical application and relevance of counting balanced butterflies
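The counting problem can be illustrated with a simple pairwise sketch. It assumes the usual balance-theory convention that a butterfly is balanced when it contains an even number of negative edges; the abstract does not spell out the definition, so treat this as an assumption. The code is neither the paper's baseline nor its bucket algorithm, and the data layout is hypothetical.

```python
from itertools import combinations
from math import comb

def count_balanced_butterflies(signed_adj):
    """Count balanced butterflies in a signed bipartite graph.

    `signed_adj` maps each left vertex to a dict {right vertex: sign},
    with sign in {+1, -1}. For a pair of left vertices, each common
    neighbor w forms a wedge whose two signs either agree or disagree.
    A butterfly combines two wedges, and its number of negative edges is
    even exactly when both wedges agree or both disagree.
    """
    total = 0
    for u1, u2 in combinations(signed_adj, 2):
        common = signed_adj[u1].keys() & signed_adj[u2].keys()
        agree = sum(1 for w in common if signed_adj[u1][w] == signed_adj[u2][w])
        disagree = len(common) - agree
        total += comb(agree, 2) + comb(disagree, 2)
    return total

# Hypothetical example: two users rating three products.
ratings = {
    "a": {"x": +1, "y": +1, "z": -1},
    "b": {"x": +1, "y": -1, "z": +1},
}
print(count_balanced_butterflies(ratings))
```

In the example, only the butterfly on products y and z has an even number of negative edges, so the count is 1 out of three butterflies in total.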