
    Upper bounds and heuristics for the 2-club problem

    Given an undirected graph G = (V, E), a k-club is a subset of V that induces a subgraph of diameter at most k. The k-club problem is that of finding a maximum-cardinality k-club in G. In this paper we present valid inequalities for the 2-club polytope and derive conditions under which they define facets. These inequalities are the basis of a strengthened formulation for the 2-club problem and of a cutting plane algorithm. The LP relaxation of the strengthened formulation is used to compute upper bounds on the problem's optimum and to guide the generation of near-optimal solutions. Numerical experiments indicate that this approach is quite effective in terms of solution quality and speed, especially for low-density graphs.
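
    As a point of reference for the definition above, the following minimal Python sketch (using networkx; it is not part of the paper and does not implement its cutting plane algorithm) checks whether a candidate vertex set S is a k-club. The key point is that distances are measured inside the induced subgraph, which is what distinguishes a k-club from the weaker k-clique relaxation.

        # Illustrative sketch only: test the k-club property for a vertex set.
        import networkx as nx

        def is_k_club(G, S, k):
            """True if S induces a subgraph of G with diameter at most k."""
            H = G.subgraph(S)
            if len(S) <= 1:
                return True
            if not nx.is_connected(H):
                return False          # disconnected => infinite diameter
            return nx.diameter(H) <= k

        G = nx.karate_club_graph()
        print(is_k_club(G, [0, 1, 2, 3, 7], 2))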

    Characterizing Structurally Cohesive Clusters in Networks: Theory and Algorithms

    This dissertation aims at developing generalized network models and solution approaches for studying cluster detection problems that typically arise in networks. More specifically, we consider graph-theoretic relaxations of clique as models for characterizing structurally cohesive and robust subgroups, develop strong upper bounds for the maximum clique problem, and present a new relaxation that is useful in clustering applications. We consider the clique relaxation models of k-block and k-robust 2-club for describing cohesive clusters that are reliable and robust to disruptions, and introduce a new relaxation, called the s-stable cluster, for modeling stable clusters. First, we identify the structural properties associated with these models and investigate the computational complexity of the corresponding problems. Next, we develop mathematical programming techniques for the optimization problems introduced and apply them to obtain effective solution approaches. We present integer programming formulations for the optimization problems of interest and provide a detailed study of the associated polytopes. In particular, we develop valid inequalities and identify different classes of facets for the polytopes. The exact solution approaches developed for solving the problems include simple branch-and-bound, branch-and-cut, and combinatorial branch-and-bound algorithms. In addition, we introduce several preprocessing techniques and heuristics to enhance their performance. The presented algorithms are tested computationally on a number of graph instances, including social networks and random graphs, to study the capability of the proposed solution methods. As a fitting conclusion to this work, we propose new techniques to obtain easily computable and strong upper bounds for the maximum clique problem. We investigate the k-core and its stronger variant, the k-core/2-club, in this light, and present minimization problems whose optima yield upper bounds on the corresponding maximization problems. Simple linear programming relaxations are developed and strengthened by valid inequalities, and are then compared with some standard relaxations from the literature. We present a detailed study of our computational results on a number of benchmark instances to test the effectiveness of our technique for obtaining good upper bounds.
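
    The k-core-based clique bound mentioned at the end of this abstract rests on a classical observation: a clique of size w is contained in the (w-1)-core, so the maximum core number (the degeneracy) plus one upper-bounds the clique number. The Python sketch below (networkx) illustrates only this baseline idea, not the strengthened LP-based bounds the dissertation develops.

        # Classical k-core upper bound on the clique number (degeneracy + 1);
        # this is only the baseline idea, not the dissertation's bounds.
        import networkx as nx

        def core_clique_upper_bound(G):
            # any clique of size w lies in the (w-1)-core of G
            return max(nx.core_number(G).values()) + 1

        G = nx.gnp_random_graph(100, 0.1, seed=1)
        print("clique number <=", core_clique_upper_bound(G))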

    Construction of near-optimal vertex clique covering for real-world networks

    We propose a method based on combining a constructive and a bounding heuristic to solve the vertex clique covering problem (CCP), where the aim is to partition the vertices of a graph into the smallest number of classes, each of which induces a clique. Solving CCP is motivated by the analysis of social and other real-world networks, by applications in graph mining, and by the fact that CCP is one of the classical NP-hard problems. Combining the constructive and the bounding heuristic helped us not only to find high-quality clique coverings but also to determine that, in the domain of real-world networks, many of the obtained solutions are optimal, while the rest are near-optimal. In addition, the method has polynomial time complexity and shows much promise for practical use. Experimental results are presented for a fairly representative benchmark of real-world data. Our test graphs include extracts of web-based social networks, including some very large ones, several well-known graphs from network science, as well as co-appearance networks of characters in literary works from the DIMACS graph coloring benchmark. We also present results for synthetic pseudorandom graphs structured according to the Erdős-Rényi model and Leighton's model.
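
    As a baseline illustration of the problem (not the authors' heuristic), note that a clique cover of G is exactly a proper coloring of the complement graph, so a greedy coloring of the complement yields a feasible, though not necessarily minimum, clique cover. A minimal Python sketch with networkx:

        # Baseline sketch, not the paper's method: clique covers of G correspond
        # to proper colorings of the complement graph, so greedy coloring of
        # complement(G) gives a feasible cover (an upper bound on the optimum).
        import networkx as nx

        def greedy_clique_cover(G):
            coloring = nx.greedy_color(nx.complement(G), strategy="largest_first")
            cover = {}
            for v, c in coloring.items():
                cover.setdefault(c, []).append(v)    # each color class is a clique in G
            return list(cover.values())

        G = nx.les_miserables_graph()
        print(len(greedy_clique_cover(G)), "cliques cover", G.number_of_nodes(), "vertices")

    Note that explicitly building the complement is only practical for moderately sized graphs, which is one reason dedicated heuristics are of interest.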

    Partitioning networks into cliques: a randomized heuristic approach

    In the context of community detection in social networks, the term community can be understood in the strict sense that everybody within the community should know each other. We consider the corresponding community detection problem: we search for a partitioning of a network into the minimum number of non-overlapping cliques such that the cliques cover all vertices. This problem is called the clique covering problem (CCP) and is one of the classical NP-hard problems. For CCP, we propose a randomized heuristic approach. To construct a high-quality solution to CCP, we present an iterated greedy (IG) algorithm. IG can also be combined with a heuristic that determines how far the algorithm is from the optimum in the worst case; randomized local search (RLS) for the maximum independent set problem was proposed to find such a bound. The experimental results of IG and the bounds obtained by RLS indicate that IG is a very suitable technique for solving CCP on real-world graphs. In addition, we summarize our basic rigorous results, which were developed for the analysis of IG and for understanding its behavior on several relevant graph classes.
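
    The bound referred to above exploits a simple fact: a clique can contain at most one vertex of an independent set, so the size of any independent set is a lower bound on the number of cliques in any clique cover. The sketch below illustrates this bounding idea in Python (networkx) with a plain greedy independent set; it stands in for, and is much weaker than, the RLS procedure used in the paper.

        # Sketch of the bounding idea only: |independent set| <= clique cover number,
        # because each clique covers at most one vertex of an independent set.
        # A simple greedy stands in for the paper's RLS procedure.
        import networkx as nx

        def greedy_independent_set(G):
            remaining = set(G.nodes())
            independent = []
            while remaining:
                v = min(remaining, key=G.degree)       # favor low-degree vertices
                independent.append(v)
                remaining -= set(G.neighbors(v)) | {v}
            return independent

        G = nx.les_miserables_graph()
        print("clique cover number >=", len(greedy_independent_set(G)))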

    Distance-generalized Core Decomposition

    The k-core of a graph is defined as the maximal subgraph in which every vertex is connected to at least k other vertices within that subgraph. In this work we introduce a distance-based generalization of the notion of k-core, which we refer to as the (k,h)-core, i.e., the maximal subgraph in which every vertex has at least k other vertices at distance at most h within that subgraph. We study the properties of the (k,h)-core, showing that it preserves many of the nice features of the classic core decomposition (e.g., its connection with the notion of distance-generalized chromatic number) and its usefulness for speeding up or approximating distance-generalized notions of dense structures, such as the h-club. Computing the distance-generalized core decomposition over large networks is intrinsically complex. However, by exploiting clever upper and lower bounds we can partition the computation into a set of totally independent subcomputations, opening the door to top-down exploration and to multithreading, and thus achieving an efficient algorithm.
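
    A naive way to realize the (k,h)-core definition is peeling: repeatedly remove any vertex that has fewer than k other vertices within distance h in the current subgraph, until no such vertex remains. The minimal Python sketch below (networkx) shows only this definitional baseline; the bounds and the full decomposition algorithm of the paper are not reproduced.

        # Naive peeling sketch of the (k,h)-core definition; the paper's
        # decomposition algorithm and bounds are not implemented here.
        import networkx as nx

        def kh_core(G, k, h):
            H = G.copy()
            changed = True
            while changed:
                changed = False
                for v in list(H.nodes()):
                    # vertices (other than v) reachable within h hops inside H
                    reach = len(nx.single_source_shortest_path_length(H, v, cutoff=h)) - 1
                    if reach < k:
                        H.remove_node(v)
                        changed = True
            return H

        G = nx.karate_club_graph()
        print(sorted(kh_core(G, k=10, h=2).nodes()))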

    Towards Structural Classification of Proteins based on Contact Map Overlap

    A multitude of measures have been proposed to quantify the similarity between protein 3-D structures. Among these measures, contact map overlap (CMO) maximization has received sustained attention during the past decade because it offers a fine estimation of the natural homology relation between proteins. Despite this large involvement of the bioinformatics and computer science communities, the performance of known algorithms remains modest. Due to the complexity of the problem, they get stuck on relatively small instances and are not applicable to large-scale comparison. This paper offers a clear improvement over past methods in this respect. We present a new integer programming model for CMO and propose an exact branch-and-bound (B&B) algorithm whose bounds are computed by solving a Lagrangian relaxation. The efficiency of the approach is demonstrated on a popular small benchmark (the Skolnick set, 40 domains). On this set our algorithm significantly outperforms the best existing exact algorithms, while providing lower and upper bounds of better quality. Some hard CMO instances have been solved for the first time, and within reasonable time limits. From the values of the running time and the relative gap (the relative difference between upper and lower bounds), we obtained the right classification for this test. These encouraging results led us to design a harder benchmark to better assess the classification capability of our approach. We constructed a large-scale set of 300 protein domains (a subset of the ASTRAL database) that we have called Proteus 300. Using the relative gap of any of the 44,850 pairs as a similarity measure, we obtained a classification in very good agreement with SCOP. Our algorithm thus provides a powerful classification tool for large structure databases.
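
    For readers unfamiliar with the underlying representation, a contact map is simply a Boolean residue-by-residue matrix marking pairs of residues whose representative atoms lie within a distance threshold. The Python/NumPy sketch below builds such a matrix from hypothetical C-alpha coordinates with an illustrative 7.5 Å threshold (the atom choice and cutoff are common conventions, not taken from the paper); the CMO alignment itself, which the paper solves by B&B with Lagrangian bounds, is not implemented.

        # Minimal sketch of what a "contact map" is; coordinates and the
        # threshold are illustrative assumptions, and the CMO alignment
        # problem itself is not implemented here.
        import numpy as np

        def contact_map(ca_coords, threshold=7.5):
            d = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=-1)
            cm = (d < threshold) & ~np.eye(len(ca_coords), dtype=bool)
            for i in range(len(ca_coords) - 1):      # drop trivial chain neighbors
                cm[i, i + 1] = cm[i + 1, i] = False
            return cm

        coords = np.random.rand(50, 3) * 30          # 50 hypothetical residues
        print(contact_map(coords).sum() // 2, "contacts")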

    Updating and downdating techniques for optimizing network communicability

    The total communicability of a network (or graph) is defined as the sum of the entries in the exponential of the adjacency matrix of the network, possibly normalized by the number of nodes. This quantity offers a good measure of how easily information spreads across the network and can be useful in the design of networks having certain desirable properties. The total communicability can be computed quickly even for large networks using techniques based on the Lanczos algorithm. In this work we introduce some heuristics that can be used to add, delete, or rewire a limited number of edges in a given sparse network so that the modified network has a large total communicability. To this end, we introduce new edge centrality measures which can be used to guide the selection of edges to be added or removed. Moreover, we show experimentally that the total communicability provides an effective and easily computable measure of how "well-connected" a sparse network is.
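
    In matrix terms, the (unnormalized) total communicability is 1^T exp(A) 1, where A is the adjacency matrix and 1 is the all-ones vector. The Python sketch below evaluates this quantity without forming exp(A) explicitly, using SciPy's expm_multiply as a stand-in for the Lanczos-based techniques mentioned in the abstract; it does not implement the paper's edge-modification heuristics.

        # Sketch of the quantity itself: total communicability = 1^T exp(A) 1,
        # optionally divided by n. expm_multiply computes exp(A) @ v without
        # forming exp(A); the paper's edge heuristics are not implemented.
        import networkx as nx
        import numpy as np
        from scipy.sparse.linalg import expm_multiply

        def total_communicability(G, normalized=False):
            A = nx.adjacency_matrix(G).astype(float)
            ones = np.ones(G.number_of_nodes())
            tc = ones @ expm_multiply(A, ones)
            return tc / G.number_of_nodes() if normalized else tc

        G = nx.karate_club_graph()
        print(total_communicability(G))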