
    Two-phase heuristics for the k-club problem

    Given an undirected graph G and an integer k, a k-club is a subset of nodes that induces a subgraph with diameter at most k. The k-club problem consists of identifying a maximum-cardinality k-club in G. It is an NP-hard problem. Checking whether a given k-club is maximal, or whether it is a subset of a larger k-club, is also NP-hard, due to the non-hereditary nature of the k-club structure. This non-hereditary nature is adverse for heuristic strategies that rely on single-element add and delete operations. In this work we propose two-phase algorithms which combine simple construction schemes with exact optimization of restricted integer models to generate near-optimal solutions for the k-club problem. Numerical experiments on sets of uniform random graphs with edge densities known to be very challenging, as well as on test instances available in the literature, indicate that the new algorithms are quite effective, both in terms of solution quality and running times.
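    To make the definition concrete, the following is a minimal sketch of a feasibility check for a candidate k-club, assuming a simple adjacency-list representation; the function names are illustrative and not taken from the paper. Note that distances are measured inside the induced subgraph, which is exactly what makes the structure non-hereditary.

```python
from collections import deque

def is_k_club(adj, S, k):
    """Return True if S induces a subgraph of diameter at most k.

    adj: dict mapping each vertex to a set of its neighbors in G.
    S:   candidate vertex subset.
    k:   diameter bound.
    """
    S = set(S)
    for source in S:
        # BFS restricted to the subgraph induced by S.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in S and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Every other vertex of S must be within k hops inside G[S].
        if any(w not in dist or dist[w] > k for w in S):
            return False
    return True
```

    Removing a vertex from a k-club can destroy shortest paths that pass through it, so the remaining set may fail this check; this is why single-element add and delete moves are problematic for heuristics, as the abstract notes.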

    Distance-generalized Core Decomposition

    The k-core of a graph is defined as the maximal subgraph in which every vertex is connected to at least k other vertices within that subgraph. In this work we introduce a distance-based generalization of the notion of k-core, which we refer to as the (k,h)-core, i.e., the maximal subgraph in which every vertex has at least k other vertices at distance at most h within that subgraph. We study the properties of the (k,h)-core, showing that it preserves many of the nice features of the classic core decomposition (e.g., its connection with the notion of distance-generalized chromatic number) and that it remains useful for speeding up or approximating distance-generalized notions of dense structures, such as the h-club. Computing the distance-generalized core decomposition over large networks is intrinsically complex. However, by exploiting clever upper and lower bounds we can partition the computation into a set of totally independent subcomputations, opening the door to top-down exploration and to multithreading, and thus achieving an efficient algorithm.
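    As a point of reference, the following is a minimal sketch of a naive peeling procedure for the (k,h)-core under the definition above: repeatedly remove any vertex that has fewer than k other vertices within distance h in the remaining subgraph. It is deliberately simple (one BFS per vertex per pass) and does not use the bounds or the parallel decomposition that make the paper's algorithm efficient; the function names are illustrative.

```python
from collections import deque

def ball_size(adj, alive, source, h):
    """Number of vertices other than `source` within distance h of it,
    measured inside the subgraph induced by the vertex set `alive`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue
        for v in adj[u]:
            if v in alive and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return len(dist) - 1

def kh_core(adj, k, h):
    """Naive peeling: drop vertices with fewer than k others within distance h
    and repeat until every remaining vertex satisfies the condition."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if ball_size(adj, alive, v, h) < k:
                alive.remove(v)
                changed = True
    return alive
```

    Since any h-club with more than k vertices itself satisfies the (k,h) condition, it must be contained in the (k,h)-core, so the core can be used to shrink the search space before looking for large h-clubs; this is the kind of speed-up the abstract alludes to.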

    Polyhedral Combinatorics, Complexity & Algorithms for k-Clubs in Graphs

    A k-club is a distance-based graph-theoretic generalization of the clique, originally introduced to model cohesive subgroups in social network analysis. k-clubs represent low-diameter clusters in graphs and are suitable for various graph-based data mining applications. Unlike cliques, the k-club model is nonhereditary, meaning that not every subset of a k-club is necessarily a k-club. This imposes significant challenges in developing theory and algorithms for the optimization problems associated with k-clubs.

    We settle an open problem by establishing the intractability of testing inclusion-wise maximality of k-clubs for fixed k ≥ 2. This result is in contrast to the polynomial-time verifiability of maximal cliques, and is a direct consequence of the nonhereditary nature of k-clubs. A class of graphs for which this problem is polynomial-time solvable is also identified. We propose a distance-coloring-based upper-bounding scheme and a bounded-enumeration-based lower-bounding routine, and employ them in a combinatorial branch-and-bound algorithm for finding a maximum k-club. Computational results on graphs with up to 200 vertices are also provided.

    The 2-club polytope of a graph is studied and a new family of facet-inducing inequalities for this polytope is discovered. This family of facets strictly contains all known nontrivial facets of the 2-club polytope as special cases, and identifies previously unknown facets of this polytope. The separation problem for these newly discovered facets is proved to be NP-complete, and it is shown that the 2-club polytope of trees can be completely described by the collection of these facets along with the nonnegativity constraints.

    We also study the maximum 2-club problem under uncertainty. Given a random graph subject to probabilistic edge failures, we are interested in finding a large "risk-averse" 2-club. Here, risk aversion is achieved by modeling the loss in the 2-club property due to edge failures as a random loss, which is a function of the decision variables and the uncertain parameters. Conditional Value-at-Risk (CVaR) is used as a quantitative measure of risk that is constrained in the model. A Benders decomposition scheme is utilized to develop a new decomposition algorithm for solving the CVaR-constrained maximum 2-club problem. A preliminary experiment is also conducted to compare the computational performance of the developed algorithm with our extension of an existing algorithm from the literature.
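    The distance-coloring bound mentioned above admits a simple generic instance: every k-club of G is a clique in the power graph G^k (its members are pairwise within distance k in G), so the number of colors used by any proper coloring of G^k is an upper bound on the maximum k-club size. The sketch below uses a greedy coloring for this purpose; it is one plausible realization of that reasoning, not the specific bounding scheme developed in the dissertation, and the function names are illustrative.

```python
import itertools
from collections import deque

def power_graph(adj, k):
    """Adjacency of G^k: u and v are adjacent iff 0 < dist_G(u, v) <= k."""
    power = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        power[source] = set(dist) - {source}
    return power

def distance_coloring_bound(adj, k):
    """Greedily color G^k; the number of colors used bounds the maximum k-club
    size, because a k-club is a clique in G^k and a clique needs distinct colors."""
    power = power_graph(adj, k)
    order = sorted(power, key=lambda v: len(power[v]), reverse=True)
    color = {}
    for v in order:
        used = {color[u] for u in power[v] if u in color}
        color[v] = next(c for c in itertools.count() if c not in used)
    return 1 + max(color.values(), default=-1)
```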

    Finding Second-Order Clubs

    Modeling data entities and their pairwise relationships as a graph is a popular technique for visualizing and mining information from datasets in a variety of fields such as social networks, biological networks, web graphs, and document networks. A powerful technique in this setting involves the detection of clusters. A clique, a subset of pairwise adjacent vertices, is often viewed as an idealized representation of a cluster. However, in the presence of errors in the data on which the graph is based, the clique requirement may be too restrictive, resulting in small clusters or clusters that miss key members. Consequently, graph-theoretic clique generalizations based on the principle of relaxing elementary structural properties of a clique have been proposed in diverse fields to describe clusters of interest. For example, an s-club is a distance-based clique relaxation originally introduced in social network analysis to model cohesive social subgroups. In this dissertation, we consider low-diameter clusters that require another property such as robustness, heredity, or connectedness (parameterized by r) to hold, in addition to the diameter bound. Specifically, we study s-clubs with side constraints that make them less “fragile”, i.e., less susceptible to an increase in diameter if vertices (and edges) are deleted. The overall goal of this dissertation is to develop effective exact algorithms, with an emphasis on s = 2, 3, 4 and low values of r, to solve the maximum r-robust s-club and maximum r-hereditary s-club problems on moderately large instances (around 10,000 vertices and less than 5% density). We analyze the complexity of the associated feasibility testing and optimization problems. Cut-like formulations are proposed for the maximum r-robust s-club problem with r ≥ 2 and s ∈ {2, 3, 4}. We explore preprocessing techniques and develop a graph decomposition approach for solving such problems. The computational benefits of each of the algorithmic ideas are empirically evaluated in our computational studies. Our approach permits us to solve problems optimally on very large and sparse real-life networks.
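    For concreteness, the sketch below checks the hereditary side constraint by brute force, assuming the common reading that an r-hereditary s-club remains an s-club after deleting any r or fewer of its vertices (that reading, and the function names, are assumptions for illustration; the dissertation's exact definitions and algorithms should be consulted). The check is exponential in r, which is consistent with the emphasis on low values of r.

```python
from collections import deque
from itertools import combinations

def induces_diameter_at_most(adj, S, s):
    """True iff the subgraph induced by S is connected with diameter <= s."""
    S = set(S)
    for source in S:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in S and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if any(w not in dist or dist[w] > s for w in S):
            return False
    return True

def is_r_hereditary_s_club(adj, S, s, r):
    """Brute force: S must remain an s-club after deleting any <= r vertices."""
    S = set(S)
    return all(
        induces_diameter_at_most(adj, S - set(deleted), s)
        for size in range(r + 1)
        for deleted in combinations(S, size)
    )
```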

    Network interdiction approaches for diminishing misinformation spread in social networks

    Network interdiction has applications in many domains, including telecommunications, epidemic control, and social network analysis. In this dissertation, we use network interdiction to devise strategies against the dissemination of misinformation in online social networks. These platforms enable quick communication between users, which, in a network containing malicious accounts, can result in the fast spread of rumors and harmful content. We study this topic using two different approaches. The first approach focuses on interdicting cohesive subgroups of malicious accounts. We use s-clubs, which are subsets of vertices that induce subgraphs of diameter at most s, to model the cohesive social subgroups. We consider a defender that can disrupt vertices of the adversarial network to minimize its threat, which leads to a maximum s-club interdiction problem. Using a new notion of H-heredity in s-clubs, we provide a mixed-integer linear programming formulation for this problem that uses far fewer constraints than the formulation based on standard techniques. We further relate H-heredity to latency-s connected dominating sets and design a decomposition branch-and-cut algorithm for the problem. The second approach studied in this dissertation is to delay the spread of misinformation in the network by interdicting first passage times. The first passage times are defined as the first time each user is exposed to a post shared by another user in the network, and they are computed using a discrete-time Markov chain model. Vertices are interdicted to modify the transition probabilities and increase the propagation times between users who share misinformation and harmful content and vulnerable users. We show that the problem is NP-hard and provide a mixed-integer linear programming formulation for it. Computational experiments on benchmark instances are conducted for both interdiction approaches, based on cohesive subgroups and first passage times, in order to assess the computational capabilities of the methods we introduce.
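    As background for the second approach, the sketch below computes expected first passage (hitting) times to a target user in a discrete-time Markov chain by solving the standard linear system (I − Q)h = 1, where Q is the transition matrix restricted to the non-target states. This is only the textbook computation that such a model builds on; the dissertation's loss definition, interdiction variables, and formulation are not reproduced here.

```python
import numpy as np

def expected_first_passage_times(P, target):
    """Expected number of steps to first reach `target` from every state.

    P: (n x n) row-stochastic transition matrix of a discrete-time Markov chain.
    Assumes `target` is reachable from all states; otherwise the linear
    system below is singular.
    """
    n = P.shape[0]
    others = [i for i in range(n) if i != target]
    Q = P[np.ix_(others, others)]          # transitions among non-target states
    h = np.zeros(n)
    h[others] = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    return h
```

    Interdicting a vertex modifies its row of P, which raises these hitting times; the optimization problem described in the abstract chooses which vertices to interdict so as to increase propagation times from users who share misinformation to vulnerable users.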

    Experimental Evaluation of Approximation and Heuristic Algorithms for Maximum Distance-Bounded Subgraph Problems

    In this paper, we consider two distance-based relaxed variants of the maximum clique problem (Max Clique), named Max d-Clique and Max d-Club for positive integers d. Max 1-Clique and Max 1-Club cannot be efficiently approximated within a factor of n^(1−ε) for any real ε > 0 unless P = NP, since they are identical to Max Clique (Håstad in Acta Math 182(1):105–142, 1999; Zuckerman in Theory Comput 3:103–128, 2007). In addition, it is NP-hard to approximate Max d-Clique and Max d-Club to within a factor of n^(1/2−ε) for any fixed integer d ≥ 2 and any real ε > 0 (Asahiro et al. in Approximating maximum diameter-bounded subgraphs. In: Proc of LATIN 2010, Springer, pp 615–626, 2010; Asahiro et al. in Optimal approximation algorithms for maximum distance-bounded subgraph problems. In: Proc of COCOA, Springer, pp 586–600, 2015). As for the approximability of Max d-Clique and Max d-Club, a polynomial-time algorithm, called ReFindStar_d, that achieves an optimal approximation ratio of O(n^(1/2)) for Max d-Clique and Max d-Club was designed for any integer d ≥ 2 in Asahiro et al. (2015; Algorithmica 80(6):1834–1856, 2018). Moreover, a simpler algorithm, called ByFindStar_d, was proposed, and it was shown in Asahiro et al. (2010, 2018) that although the approximation ratio of ByFindStar_d is much worse for any odd d ≥ 3, its time complexity is better than that of ReFindStar_d. In this paper, we implement those approximation algorithms and evaluate their quality empirically for random graphs. The experimental results show that (1) ReFindStar_d can find larger d-clubs (d-cliques) than ByFindStar_d for odd d, (2) the d-clubs (d-cliques) output by ByFindStar_d have the same size as those output by ReFindStar_d for even d, and (3) ByFindStar_d can find d-clubs (d-cliques) of the same size much faster than ReFindStar_d. Furthermore, we propose and implement two new heuristics, Hclub_d for Max d-Club and Hclique_d for Max d-Clique. We then present an experimental evaluation of the solution sizes of ReFindStar_d, Hclub_d, Hclique_d, and previously known heuristic algorithms for random graphs and Erdős collaboration graphs.
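    The structural fact behind star-based algorithms for even d can be illustrated compactly: any ball of radius d/2 around a vertex induces a d-club, because shortest paths to the center stay inside the ball, so any two members reach each other through the center in at most d hops. The sketch below returns the largest such ball; it is a simplified illustration of that idea, not the actual ReFindStar_d or ByFindStar_d implementation, and no approximation guarantee is claimed for it here.

```python
from collections import deque

def ball(adj, center, radius):
    """Vertices within `radius` hops of `center` in G (including the center)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def largest_ball_d_club(adj, d):
    """For even d, every radius-d/2 ball induces a d-club; return the largest."""
    assert d % 2 == 0 and d >= 2
    return max((ball(adj, v, d // 2) for v in adj), key=len)
```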

    Decomposition algorithms for detecting low-diameter clusters in graphs

    Detecting low-diameter clusters in graphs is an effective graph-based data mining technique, which has been used to find cohesive subgraphs in a variety of graph models of data. Low pairwise distances within a cluster can facilitate fast communication or good reachability between vertices in the cluster. A k-club is a subset of vertices which induces a subgraph of diameter at most k. For low values of the parameter k, this model offers a graph-theoretic relaxation of the clique model that formalizes the notion of a low-diameter cluster. The maximum k-club problem is to find a k-club of maximum cardinality in a given graph. The goals of this study are focused on developing decomposition and cutting plane methods for the maximum k-club problem for arbitrary k.

    Two compact integer programming formulations for the maximum k-club problem were presented by other researchers. These formulations are among the most effective integer programming approaches presently available for solving the maximum k-club problem for any given value of k. Using model decomposition techniques, we demonstrate how the fundamental optimization problem of finding a maximum-size k-club can be solved optimally on large-scale benchmark instances. Our approach circumvents the use of complicated formulations in favor of a simple relaxation based on necessary conditions, combined with canonical hypercube cuts introduced by Balas and Jeroslow. Next, we demonstrate that by using a delayed constraint generation approach in a branch-and-cut algorithm, we can significantly speed up the performance of an integer programming solver over the direct solution of either formulation.

    Then, we study the problem of detecting large risk-averse 2-clubs in graphs subject to probabilistic edge failures. To achieve risk aversion, we first model the loss in the 2-club property due to probabilistic edge failures as a function of the decision (chosen 2-club cluster) and the randomness (graph structure). Then, we utilize the conditional value-at-risk of the loss for a given decision as a quantitative measure of risk, which is bounded in the stochastic optimization model. A sequential cutting plane method that solves a series of mixed-integer linear programs is developed for solving this problem.
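    The decomposition idea described above can be pictured with a toy loop: optimize over a simple necessary condition (members pairwise within distance k in G), check whether the chosen set really induces diameter at most k, and if not, add a canonical hypercube cut that excludes exactly that 0-1 point before re-solving. In the sketch below the brute-force enumeration merely stands in for the integer programming solver used in the study, and the function names are illustrative.

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, S, source):
    """BFS distances from `source` inside the subgraph induced by S."""
    S, dist, queue = set(S), {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in S and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def max_k_club_with_lazy_cuts(adj, k):
    """Toy delayed constraint generation loop for the maximum k-club problem."""
    vertices = list(adj)
    dist_G = {v: bfs_dist(adj, vertices, v) for v in adj}   # distances in G
    cuts = set()                                            # excluded 0-1 points
    while True:
        # "Relaxation": largest set, not yet cut off, whose members are
        # pairwise within distance k in G (a necessary condition only).
        best = None
        for size in range(len(vertices), 0, -1):
            for S in map(frozenset, combinations(vertices, size)):
                if S in cuts:
                    continue
                if all(dist_G[u].get(v, k + 1) <= k for u, v in combinations(S, 2)):
                    best = S
                    break
            if best is not None:
                break
        # Feasibility check: does the chosen set induce diameter <= k?
        if all(bfs_dist(adj, best, u).get(v, k + 1) <= k
               for u, v in combinations(best, 2)):
            return set(best)
        cuts.add(best)   # canonical hypercube cut: forbid exactly this set
```

    In a branch-and-cut setting the same exclusion is expressed as the Balas–Jeroslow canonical cut, sum over i in S of (1 − x_i) plus sum over i not in S of x_i ≥ 1, added lazily whenever an integer-feasible relaxation solution S fails the diameter check.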