    A Tight Approximation Algorithm for the Cluster Vertex Deletion Problem

    We give the first 2-approximation algorithm for the cluster vertex deletion problem. This is tight, since approximating the problem within any constant factor smaller than 2 is UGC-hard. Our algorithm combines the previous approaches, based on the local ratio technique and the management of true twins, with a novel construction of a 'good' cost function on the vertices at distance at most 2 from any vertex of the input graph. As an additional contribution, we also study cluster vertex deletion from the polyhedral perspective, where we prove almost matching upper and lower bounds on how well linear programming relaxations can approximate the problem. Comment: 23 pages, 3 figures
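
To make the problem statement concrete: a graph is a cluster graph (a disjoint union of cliques) exactly when it contains no induced path on three vertices. The sketch below is not the algorithm of the abstract above; it is the simple textbook 3-approximation for the unweighted case, which repeatedly finds an induced three-vertex path and deletes all of its vertices. The function names are hypothetical.

```python
from itertools import combinations


def find_induced_p3(adj):
    """Return (u, v, w) with edges u-v and v-w but no edge u-w, or None."""
    for v, nbrs in adj.items():
        for u, w in combinations(sorted(nbrs), 2):
            if w not in adj[u]:
                return u, v, w
    return None


def cluster_vertex_deletion_3approx(edges):
    """Simple 3-approximation for unweighted cluster vertex deletion.

    Illustration only: this is NOT the 2-approximation from the abstract.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    deleted = set()
    while (p3 := find_induced_p3(adj)) is not None:
        # Any feasible solution must hit this path; we delete all three vertices.
        for x in p3:
            deleted.add(x)
            for y in adj.pop(x):
                adj[y].discard(x)
    return deleted


# A triangle plus a disjoint edge is already a cluster graph.
print(cluster_vertex_deletion_3approx([(1, 2), (2, 3), (1, 3), (4, 5)]))
# A path on three vertices is the smallest obstruction.
print(cluster_vertex_deletion_3approx([(1, 2), (2, 3)]))
```

The deleted paths are vertex-disjoint and every solution needs at least one vertex from each, which is where the factor 3 comes from.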

    Structural Rounding: Approximation Algorithms for Graphs Near an Algorithmically Tractable Class

    We develop a framework for generalizing approximation algorithms from the structural graph algorithm literature so that they apply to graphs somewhat close to that class (a scenario we expect is common when working with real-world networks) while still guaranteeing approximation ratios. The idea is to edit a given graph via vertex- or edge-deletions to put the graph into an algorithmically tractable class, apply known approximation algorithms for that class, and then lift the solution to apply to the original graph. We give a general characterization of when an optimization problem is amenable to this approach, and show that it includes many well-studied graph problems, such as Independent Set, Vertex Cover, Feedback Vertex Set, Minimum Maximal Matching, Chromatic Number, ($l$-)Dominating Set, Edge ($l$-)Dominating Set, and Connected Dominating Set. To enable this framework, we develop new editing algorithms that find the approximately-fewest edits required to bring a given graph into one of a few important graph classes (in some cases these are bicriteria algorithms which simultaneously approximate both the number of editing operations and the target parameter of the family). For bounded degeneracy, we obtain an $O(r \log n)$-approximation and a bicriteria $(4,4)$-approximation which also extends to a smoother bicriteria trade-off. For bounded treewidth, we obtain a bicriteria $(O(\log^{1.5} n), O(\sqrt{\log w}))$-approximation, and for bounded pathwidth, we obtain a bicriteria $(O(\log^{1.5} n), O(\sqrt{\log w} \cdot \log n))$-approximation. For treedepth 2 (related to bounded expansion), we obtain a 4-approximation. We also prove complementary hardness-of-approximation results assuming $P \neq NP$: in particular, these problems are all log-factor inapproximable, except the last which is not approximable below some constant factor 2 (assuming UGC).
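
The edit, solve, lift pipeline described above can be sketched for Vertex Cover with vertex deletions. This is a minimal illustration under strong assumptions, not the framework's editing algorithms: the edit set is supplied by the caller, the tractable class is taken to be forests, and the function names are hypothetical. The lift step relies only on the fact that adding the deleted vertices to a cover of the edited graph covers every original edge.

```python
def forest_vertex_cover(adj):
    """Exact minimum vertex cover of a forest: repeatedly take a leaf's neighbor."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    cover = set()
    leaves = [v for v, ns in adj.items() if len(ns) == 1]
    while leaves:
        leaf = leaves.pop()
        if len(adj[leaf]) != 1:  # stale entry: its edge is already covered
            continue
        (parent,) = adj[leaf]
        cover.add(parent)
        for x in adj[parent]:
            adj[x].discard(parent)
            if len(adj[x]) == 1:
                leaves.append(x)
        adj[parent] = set()
    return cover


def structural_rounding_vertex_cover(edges, edit_set):
    """Edit (delete edit_set), solve on the forest, lift back to the input graph.

    Assumption: removing edit_set from the graph leaves a forest.
    """
    edit_set = set(edit_set)
    adj = {}
    for a, b in edges:
        if a in edit_set or b in edit_set:
            continue  # edges touching deleted vertices vanish in the edited graph
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return forest_vertex_cover(adj) | edit_set


# Triangle 1-2-3 with a pendant vertex 4; deleting vertex 1 leaves the path 2-3-4.
graph = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(structural_rounding_vertex_cover(graph, {1}))
```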

    Online Steiner Tree with Deletions

    In the online Steiner tree problem, the input is a set of vertices that appear one-by-one, and we have to maintain a Steiner tree on the current set of vertices. The cost of the tree is the total length of edges in the tree, and we want this cost to be close to the cost of the optimal Steiner tree at all points in time. If we are allowed to only add edges, a tight bound of $\Theta(\log n)$ on the competitiveness is known. Recently it was shown that if we can add one new edge and make one edge swap upon every vertex arrival, we can maintain a constant-competitive tree online. But what if the set of vertices sees both additions and deletions? Again, we would like to obtain a low-cost Steiner tree with as few edge changes as possible. The original paper of Imase and Waxman had also considered this model, and it gave a greedy algorithm that maintained a constant-competitive tree online, and made at most $O(n^{3/2})$ edge changes for the first $n$ requests. In this paper we give the following two results. Our first result is an online algorithm that maintains a Steiner tree only under deletions: we start off with a set of vertices, and at each time one of the vertices is removed from this set: our Steiner tree no longer has to span this vertex. We give an algorithm that changes only a constant number of edges upon each request, and maintains a constant-competitive tree at all times. Our algorithm uses the primal-dual framework and a global charging argument to carefully make these constant number of changes. We then study the natural greedy algorithm proposed by Imase and Waxman that maintains a constant-competitive Steiner tree in the fully-dynamic model (where each request either adds or deletes a vertex). Our second result shows that this algorithm makes only a constant number of changes per request in an amortized sense. Comment: An extended abstract appears in the SODA 2014 conference
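
For orientation, the add-only setting mentioned above can be sketched with the classic greedy rule: each arriving terminal is connected to the nearest terminal that arrived earlier, and no edge is ever removed. This is not the deletion algorithm of the abstract. The sketch assumes terminals are points in the Euclidean plane, and the function name is hypothetical.

```python
import math


def greedy_online_steiner(points):
    """Add-only greedy: connect each new terminal to its nearest earlier terminal.

    Returns the list of tree edges (as index pairs) and the total edge length.
    """
    tree_edges = []
    total = 0.0
    for i, p in enumerate(points):
        if i == 0:
            continue  # the first terminal has nothing to connect to
        j = min(range(i), key=lambda k: math.dist(p, points[k]))
        tree_edges.append((j, i))
        total += math.dist(p, points[j])
    return tree_edges, total


edges, cost = greedy_online_steiner([(0, 0), (3, 4), (3, 5)])
print(edges, cost)
```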

    Malware Classification based on Call Graph Clustering

    Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations away, and enable the detection of structural similarities between samples. The ability to cluster similar samples together will make more generic detection techniques possible, thereby targeting the commonalities of the samples within a cluster. To compare call graphs mutually, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and DBSCAN. Clustering experiments are conducted on a collection of real malware samples, and the results are evaluated against manual classifications provided by human malware analysts. Experiments show that it is indeed possible to accurately detect malware families via call graph clustering. We anticipate that in the future, call graphs can be used to analyse the emergence of new malware families, and ultimately to automate implementation of generic detection schemes. Comment: This research has been supported by TEKES - the Finnish Funding Agency for Technology and Innovation as part of its ICT SHOK Future Internet research programme, grant 40212/0
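
Clustering from pairwise scores alone, as described above, can be sketched with a basic k-medoids iteration over a precomputed distance matrix. This is a minimal illustration, not the paper's implementation: the matrix here holds toy numbers rather than approximate graph edit distances, the initial medoids are simply the first `k` items, and the function names are hypothetical.

```python
def assign(dist, medoids):
    """Map each medoid to the list of items that are closest to it."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        clusters[min(medoids, key=lambda m: dist[i][m])].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100):
    """Alternate between assigning items and re-picking each cluster's medoid."""
    medoids = list(range(k))  # assumption: naive deterministic initialisation
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        updated = [
            # New medoid: the member minimising total distance to its cluster.
            min(members or [m], key=lambda c: sum(dist[c][j] for j in members))
            for m, members in clusters.items()
        ]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return assign(dist, medoids)


# Toy stand-in for pairwise distances: six samples on a line.
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in values] for a in values]
print(k_medoids(dist, 2))
```

Because k-medoids only reads the distance matrix, it needs no vector representation of the samples, which is the property that makes it usable with graph-to-graph distances.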

    Lossy Kernelization for (Implicit) Hitting Set Problems

    We re-visit the complexity of polynomial time pre-processing (kernelization) for the d-Hitting Set problem. This is one of the most classic problems in Parameterized Complexity by itself, and, furthermore, it encompasses several other of the most well-studied problems in this field, such as Vertex Cover, Feedback Vertex Set in Tournaments (FVST) and Cluster Vertex Deletion (CVD). In fact, d-Hitting Set encompasses any deletion problem to a hereditary property that can be characterized by a finite set of forbidden induced subgraphs. With respect to bit size, the kernelization complexity of d-Hitting Set is essentially settled: there exists a kernel with $\mathcal{O}(k^d)$ bits ($\mathcal{O}(k^d)$ sets and $\mathcal{O}(k^{d-1})$ elements) and this is tight by the result of Dell and van Melkebeek [STOC 2010, JACM 2014]. Still, the question of whether there exists a kernel for d-Hitting Set with fewer elements has remained one of the major open problems in Kernelization. In this paper, we first show that if we allow the kernelization to be lossy with a qualitatively better loss than the best possible approximation ratio of polynomial time approximation algorithms, then one can obtain kernels where the number of elements is linear for every fixed d. Further, based on this, we present our main result: we show that there exist approximate Turing kernelizations for d-Hitting Set that even beat the established bit-size lower bounds for exact kernelizations - in fact, we use a constant number of oracle calls, each with "near linear" ($\mathcal{O}(k^{1+\epsilon})$) bit size, that is, almost the best one could hope for. Lastly, for two special cases of implicit 3-Hitting Set, namely, FVST and CVD, we obtain the "best of both worlds" type of results - $(1+\epsilon)$-approximate kernelizations with a linear number of vertices. In terms of size, this substantially improves the exact kernels of Fomin et al. [SODA 2018, TALG 2019], with simpler arguments.
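
As background on what a kernelization does, the sketch below shows the classic exact kernel for Vertex Cover (the d = 2 case of d-Hitting Set), usually attributed to Buss. It is not the lossy or Turing kernelization of the abstract above. The function name and return shape are hypothetical, and the input is assumed to be a simple graph.

```python
def buss_kernel(edges, k):
    """Exact Vertex Cover kernel for simple graphs.

    Returns (remaining_edges, forced_vertices, remaining_budget), or None when
    no vertex cover of size at most k can exist.
    """
    edges = {frozenset(e) for e in edges}
    forced = set()
    while True:
        budget = k - len(forced)
        if budget < 0:
            return None
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # A vertex with more than `budget` incident edges must be in the cover,
        # otherwise all of its neighbours would be needed.
        high = [v for v, d in degree.items() if d > budget]
        if not high:
            break
        forced.add(high[0])
        edges = {e for e in edges if high[0] not in e}
    # Each remaining vertex covers at most `budget` edges.
    if len(edges) > budget * budget:
        return None
    return edges, forced, budget


star_plus_edge = [(0, i) for i in range(1, 6)] + [(6, 7)]
print(buss_kernel(star_plus_edge, 2))
```

After the rule is exhausted, a yes-instance has at most `budget * budget` edges, so the reduced instance has size bounded by a function of `k` alone.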

    Distributed Edge Connectivity in Sublinear Time

    We present the first sublinear-time algorithm for a distributed message-passing network to compute its edge connectivity $\lambda$ exactly in the CONGEST model, as long as there are no parallel edges. Our algorithm takes $\tilde O(n^{1-1/353}D^{1/353}+n^{1-1/706})$ time to compute $\lambda$ and a cut of cardinality $\lambda$ with high probability, where $n$ and $D$ are the number of nodes and the diameter of the network, respectively, and $\tilde O$ hides polylogarithmic factors. This running time is sublinear in $n$ (i.e. $\tilde O(n^{1-\epsilon})$) whenever $D$ is. Previous sublinear-time distributed algorithms can solve this problem either (i) exactly only when $\lambda=O(n^{1/8-\epsilon})$ [Thurimella PODC'95; Pritchard, Thurimella, ACM Trans. Algorithms'11; Nanongkai, Su, DISC'14] or (ii) approximately [Ghaffari, Kuhn, DISC'13; Nanongkai, Su, DISC'14]. To achieve this we develop and combine several new techniques. First, we design the first distributed algorithm that can compute a $k$-edge connectivity certificate for any $k=O(n^{1-\epsilon})$ in time $\tilde O(\sqrt{nk}+D)$. Second, we show that by combining the recent distributed expander decomposition technique of [Chang, Pettie, Zhang, SODA'19] with techniques from the sequential deterministic edge connectivity algorithm of [Kawarabayashi, Thorup, STOC'15], we can decompose the network into a sublinear number of clusters with small average diameter and without any mincut separating a cluster (except the `trivial' ones). Finally, by extending the tree packing technique from [Karger STOC'96], we can find the minimum cut in time proportional to the number of components. As a byproduct of this technique, we obtain an $\tilde O(n)$-time algorithm for computing exact minimum cut for weighted graphs. Comment: Accepted at 51st ACM Symposium on Theory of Computing (STOC 2019)