
    All Maximal Independent Sets and Dynamic Dominance for Sparse Graphs

    We describe algorithms, based on Avis and Fukuda's reverse search paradigm, for listing all maximal independent sets in a sparse graph in polynomial time and delay per output. For bounded degree graphs, our algorithms take constant time per set generated; for minor-closed graph families, the time is O(n) per set, and for more general sparse graph families we achieve subquadratic time per set. We also describe new data structures for maintaining a dynamic vertex set S in a sparse or minor-closed graph family, and querying the number of vertices not dominated by S; for minor-closed graph families the time per update is constant, while it is sublinear for any sparse graph family. We can also maintain a dynamic vertex set in an arbitrary m-edge graph and test the independence of the maintained set in time O(sqrt m) per update. We use the domination data structures as part of our enumeration algorithms. Comment: 10 pages
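    As a rough, self-contained point of reference (not the paper's reverse-search algorithm), the sketch below lists all maximal independent sets by running Bron-Kerbosch on the complement graph, using the fact that an independent set of G is a clique of the complement of G. All names are illustrative.

        from typing import Dict, FrozenSet, Iterator, Set

        def maximal_independent_sets(adj: Dict[int, Set[int]]) -> Iterator[FrozenSet[int]]:
            # Bron-Kerbosch on the complement graph: an independent set of G
            # is a clique of the complement of G.
            vertices = set(adj)

            def non_neighbors(v: int) -> Set[int]:
                return vertices - adj[v] - {v}

            def expand(current: Set[int], cand: Set[int], done: Set[int]):
                if not cand and not done:
                    yield frozenset(current)  # nothing can extend it: maximal
                    return
                for v in list(cand):
                    yield from expand(current | {v},
                                      cand & non_neighbors(v),
                                      done & non_neighbors(v))
                    cand.remove(v)
                    done.add(v)

            yield from expand(set(), vertices, set())

        # A 4-cycle has exactly two maximal independent sets: {0, 2} and {1, 3}.
        cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
        print(sorted(sorted(s) for s in maximal_independent_sets(cycle)))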

    Efficiently listing bounded length st-paths

    The problem of listing the K shortest simple (loopless) st-paths in a graph has been studied since the early 1960s. For a non-negatively weighted graph with n vertices and m edges, the most efficient solution is an O(K(mn + n^2 log n)) algorithm for directed graphs by Yen and Lawler [Management Science, 1971 and 1972], and an O(K(m + n log n)) algorithm for the undirected version by Katoh et al. [Networks, 1982], both using O(Kn + m) space. In this work, we consider a different parameterization for this problem: instead of bounding the number of st-paths output, we bound their length. For the bounded-length parameterization, we propose new non-trivial algorithms matching the time complexity of the classic algorithms but using only O(m + n) space. Moreover, we provide a unified framework in which the solutions to both parameterizations -- the classic K-shortest and the new length-bounded paths -- can be seen as two different traversals of the same tree: a Dijkstra-like and a DFS-like traversal, respectively. Comment: 12 pages, accepted to IWOCA 2014
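    As a hedged illustration of the DFS-like traversal mentioned above, the following sketch lists all simple st-paths of weight at most a given bound: it precomputes shortest distances to t with Dijkstra and extends a partial path only while its weight plus that lower bound stays within budget. It is correct but, unlike the paper's algorithm, carries no delay guarantee, since the lower bound ignores the simplicity constraint; all names are illustrative.

        import heapq
        from typing import Dict, Iterator, List, Tuple

        Graph = Dict[str, List[Tuple[str, float]]]

        def dist_to_target(adj: Graph, t: str) -> Dict[str, float]:
            # Dijkstra from t over reversed edges: dist[v] = shortest v -> t weight.
            rev: Graph = {v: [] for v in adj}
            for u, edges in adj.items():
                for v, w in edges:
                    rev[v].append((u, w))
            dist = {v: float("inf") for v in adj}
            dist[t] = 0.0
            heap = [(0.0, t)]
            while heap:
                d, u = heapq.heappop(heap)
                if d > dist[u]:
                    continue
                for v, w in rev[u]:
                    if d + w < dist[v]:
                        dist[v] = d + w
                        heapq.heappush(heap, (dist[v], v))
            return dist

        def bounded_st_paths(adj: Graph, s: str, t: str, bound: float) -> Iterator[List[str]]:
            # Extend a partial simple path only while its weight plus a lower
            # bound on the distance still needed to reach t fits the budget.
            lb = dist_to_target(adj, t)
            path, on_path = [s], {s}

            def dfs(u: str, weight: float):
                if u == t:
                    yield list(path)
                    return
                for v, w in adj[u]:
                    if v not in on_path and weight + w + lb[v] <= bound:
                        path.append(v)
                        on_path.add(v)
                        yield from dfs(v, weight + w)
                        path.pop()
                        on_path.remove(v)

            yield from dfs(s, 0.0)

        g: Graph = {"s": [("a", 1.0), ("b", 4.0)], "a": [("b", 1.0), ("t", 5.0)],
                    "b": [("t", 1.0)], "t": []}
        print(list(bounded_st_paths(g, "s", "t", 4.0)))  # [['s', 'a', 'b', 't']]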

    Efficient enumeration of solutions produced by closure operations

    In this paper we address the problem of generating all elements obtained by the saturation of an initial set by some operations. More precisely, we prove that we can generate the closure of a boolean relation (a set of boolean vectors) by polymorphisms with polynomial delay. Therefore we can compute with polynomial delay the closure of a family of sets under any set of "set operations" (union, intersection, symmetric difference, subsets, supersets, ...). To do so, we study the Membership_F problem: given a set of operations F, decide whether an element belongs to the closure by F of a family of elements. In the boolean case, we prove that Membership_F is in P for any set of boolean operations F. When the input vectors are over a domain with more than two elements, we prove that the generic enumeration method fails, since Membership_F is NP-hard for some F. We also study the problem of generating minimal or maximal elements of closures and prove that some of these problems are related to well-known enumeration problems such as the enumeration of the circuits of a matroid or of the maximal independent sets of a hypergraph. This article improves on previous works of the same authors. Comment: 30 pages, 1 figure. Long version of the article arXiv:1509.05623 of the same name which appeared in STACS 2016. Final version for the DMTCS journal
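    To make the saturation problem concrete, here is a naive fixpoint sketch (not the paper's polynomial-delay method): it closes a family of sets under a given list of binary set operations by applying them repeatedly until nothing new appears, and may use space proportional to the possibly exponential output. Names are illustrative.

        from typing import Callable, FrozenSet, List, Set

        SetOp = Callable[[FrozenSet[int], FrozenSet[int]], FrozenSet[int]]

        def saturate(family: Set[FrozenSet[int]], ops: List[SetOp]) -> Set[FrozenSet[int]]:
            # Apply every operation to every pair until a fixpoint is reached.
            closed = set(family)
            frontier = set(family)
            while frontier:
                new: Set[FrozenSet[int]] = set()
                for a in frontier:
                    for b in closed:
                        for op in ops:
                            for c in (op(a, b), op(b, a)):  # ops need not be symmetric
                                if c not in closed and c not in new:
                                    new.add(c)
                closed |= new
                frontier = new
            return closed

        family = {frozenset({1, 2}), frozenset({2, 3})}
        out = saturate(family, [frozenset.union, frozenset.intersection])
        print(sorted(sorted(s) for s in out))  # [[1, 2], [1, 2, 3], [2], [2, 3]]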

    On maximal chain subgraphs and covers of bipartite graphs

    In this paper, we address three related problems. The first is the enumeration of all the maximal edge-induced chain subgraphs of a bipartite graph, for which we provide a polynomial-delay algorithm. We give bounds on the number of maximal chain subgraphs of a bipartite graph and use them to establish the input-sensitive complexity of the enumeration problem. The second problem we treat is that of finding the minimum number of chain subgraphs needed to cover all the edges of a bipartite graph. For this we provide an exact exponential algorithm with a non-trivial complexity. Finally, we approach the problem of enumerating all minimal chain subgraph covers of a bipartite graph and show that it can be solved in quasi-polynomial time.
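    A chain graph is a bipartite graph whose vertices on one side have neighborhoods totally ordered by inclusion. A minimal sketch of that definition, assuming the graph is given as an adjacency map for one side (names are illustrative):

        from typing import Dict, Set

        def is_chain_graph(left_adj: Dict[str, Set[str]]) -> bool:
            # A bipartite graph is a chain graph iff the neighborhoods on one
            # side are totally ordered by inclusion; after sorting by degree,
            # inclusion of consecutive neighborhoods is all that must hold.
            order = sorted(left_adj, key=lambda v: len(left_adj[v]))
            return all(left_adj[a] <= left_adj[b] for a, b in zip(order, order[1:]))

        print(is_chain_graph({"u1": {"x"}, "u2": {"x", "y"}, "u3": {"x", "y", "z"}}))  # True
        print(is_chain_graph({"u1": {"x"}, "u2": {"y"}}))                              # False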

    An Output Sensitive Algorithm for Maximal Clique Enumeration in Sparse Graphs

    The degeneracy of a graph G is the smallest integer k such that every subgraph of G contains a vertex of degree at most k. Given an n-order k-degenerate graph G, we present an algorithm for enumerating all its maximal cliques. Assuming that c is the number of maximal cliques of G, our algorithm has setup time O(n(k^2 + s(k+1))) and enumeration time c · O((k+1) f(k+1)), where s(k+1) (resp. f(k+1)) is the preprocessing time (resp. enumeration time) for maximal clique enumeration in a general (k+1)-order graph. This is the first output-sensitive algorithm whose enumeration time depends only on the degeneracy of the graph.
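    The algorithm is parameterized by degeneracy. As a hedged sketch of the standard preprocessing step (not the paper's enumeration procedure), the following computes the degeneracy and an associated ordering by repeatedly removing a minimum-degree vertex, using bucket queues:

        from typing import Dict, List, Set, Tuple

        def degeneracy_ordering(adj: Dict[int, Set[int]]) -> Tuple[int, List[int]]:
            # Repeatedly remove a vertex of minimum remaining degree; the largest
            # degree seen at removal time is the degeneracy k.
            deg = {v: len(adj[v]) for v in adj}
            buckets: Dict[int, Set[int]] = {}
            for v, d in deg.items():
                buckets.setdefault(d, set()).add(v)
            removed: Set[int] = set()
            order: List[int] = []
            k = 0
            for _ in range(len(adj)):
                d = min(b for b in buckets if buckets[b])
                v = buckets[d].pop()
                k = max(k, d)
                order.append(v)
                removed.add(v)
                for u in adj[v]:
                    if u not in removed:  # move u one bucket down
                        buckets[deg[u]].discard(u)
                        deg[u] -= 1
                        buckets.setdefault(deg[u], set()).add(u)
            return k, order

        # A triangle with a pendant vertex is 2-degenerate.
        g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
        print(degeneracy_ordering(g)[0])  # 2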

    The Efficient Discovery of Interesting Closed Pattern Collections

    Enumerating closed sets that are frequent in a given database is a fundamental data mining technique that is used, e.g., in the context of market basket analysis, fraud detection, or Web personalization. There are two complementary reasons for the importance of closed sets, one semantic and one algorithmic: closed sets provide a condensed basis for non-redundant collections of interesting local patterns, and they can be enumerated efficiently. For many databases, however, even the closed set collection can be far too large for further use, and correspondingly its computation time can be infeasibly long. In such cases, it becomes necessary to focus on smaller collections of closed sets, and it is essential that these collections retain both properties: controlled semantics reflecting some notion of interestingness, and efficient enumerability. This thesis discusses three different approaches to achieving this: constraint-based closed set extraction, pruning by quantifying the degree or strength of closedness, and controlled random generation of closed sets instead of exhaustive enumeration.

    For the original closed set family, efficient enumerability results from the fact that there is an inducing, efficiently computable closure operator and that its fixpoints can be enumerated with an amortized polynomial number of closure computations. Perhaps surprisingly, it turns out that this connection does not generally hold for other constraint combinations, as the restricted domains induced by additional constraints can cause one of two things to happen: the fixpoints of the closure operator can no longer be enumerated efficiently, or an inducing closure operator does not even exist. This thesis gives, for the first time, a formal axiomatic characterization of the constraint classes that allow efficient enumeration of the fixpoints of arbitrary closure operators, as well as of the constraint classes that guarantee the existence of a closure operator inducing the closed sets.

    As a complementary approach, the thesis generalizes the notion of closedness by quantifying its strength, i.e., the difference in supporting database records between a closed set and all its supersets. This gives rise to a measure of interestingness that is able to select long, and thus particularly informative, closed sets that are robust against noise and dynamic changes. Moreover, this measure is algorithmically sound, because all closed sets with a minimum strength again form a closure system that can be enumerated efficiently and that ties directly into the results on constraint-based closed sets. In fact, both approaches can easily be combined.

    In some applications, however, the resulting set of constrained closed sets is still intractably large, or it is too difficult to find meaningful hard constraints at all (including values for their parameters). Therefore, the last part of this thesis presents an alternative algorithmic paradigm for the extraction of closed sets: instead of exhaustively listing a potentially exponential number of sets, randomly generate exactly the desired number of them. Using the Markov chain Monte Carlo method, this generation can be performed according to any desired probability distribution that favors interesting patterns. This novel randomized approach complements traditional enumeration techniques (including those mentioned above): on the one hand, it is only applicable in scenarios that do not require deterministic guarantees for the output, such as exploratory data analysis or global model construction. On the other hand, random closed set generation provides complete control over both the number and the distribution of the produced sets.
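    For concreteness, here is a minimal sketch of the support-based closure operator that underlies the thesis's starting point: the closure of an itemset is the intersection of all database records containing it, and the closed sets are exactly its fixpoints. The toy database and names are illustrative.

        from typing import FrozenSet, List

        def closure(itemset: FrozenSet[str], db: List[FrozenSet[str]]) -> FrozenSet[str]:
            # Intersect every record that supports the itemset; the fixpoints
            # of this operator are exactly the closed sets.
            supporting = [t for t in db if itemset <= t]
            if not supporting:  # empty support: by convention, all items
                return frozenset().union(*db)
            out = supporting[0]
            for t in supporting[1:]:
                out = out & t
            return out

        db = [frozenset("abc"), frozenset("ab"), frozenset("bd")]
        print(sorted(closure(frozenset("a"), db)))   # ['a', 'b']: {'a'} is not closed
        print(sorted(closure(frozenset("ab"), db)))  # ['a', 'b']: {'a', 'b'} is closed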

    Generating vertices of polyhedra and related problems of monotone generation


    Algorithms for the quantitative Lock/Key model of cytoplasmic incompatibility

    Cytoplasmic incompatibility (CI) refers to the manipulation, by the parasite Wolbachia, of the reproduction of its host. Despite its widespread occurrence, the molecular basis of CI remains unclear, and theoretical models have been proposed to understand the phenomenon. In this paper we consider the quantitative Lock/Key model, which currently represents a good hypothesis consistent with the available data. CI is in this case modelled as the problem of covering the edges of a bipartite graph with the minimum number of chain subgraphs. This problem is already known to be NP-hard, and we provide an exact exponential algorithm with a non-trivial complexity. Depending on the dataset, there may be many optimal solutions, and these can be biologically quite different from one another; relying on a single optimal solution may therefore be problematic. For this reason, we address the problem of enumerating (listing) all minimal chain subgraph covers of a bipartite graph and show that it can be solved in quasi-polynomial time. Interestingly, in order to solve the above problems, we also considered the problem of enumerating all the maximal chain subgraphs of a bipartite graph, and we improve on the current results in the literature for the latter. Finally, to demonstrate the usefulness of our methods, we show an application on a real dataset.
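    To make the covering problem concrete, here is a deliberately simple greedy heuristic, not the exact exponential algorithm or the enumeration method from the paper: each round peels off one chain subgraph with nested neighborhoods, chosen so that the left vertex with the most uncovered edges gets all of them covered, which guarantees termination after at most |L| rounds. All names are illustrative.

        from typing import Dict, List, Set

        def greedy_chain_cover(adj: Dict[str, Set[str]]) -> List[Dict[str, Set[str]]]:
            # Peel off one chain subgraph per round: shrinking `allowed` keeps
            # the chosen neighborhoods nested, and the vertex with the most
            # uncovered edges keeps its full remaining neighborhood.
            uncovered = {u: set(nbrs) for u, nbrs in adj.items()}
            cover: List[Dict[str, Set[str]]] = []
            while any(uncovered.values()):
                order = sorted(adj, key=lambda u: len(uncovered[u]), reverse=True)
                allowed: Set[str] = set().union(*adj.values())
                chain: Dict[str, Set[str]] = {}
                for u in order:
                    chain[u] = adj[u] & allowed
                    allowed = chain[u]
                    uncovered[u] -= chain[u]
                cover.append(chain)
            return cover

        # Two incomparable neighborhoods need two chain subgraphs.
        g = {"u1": {"x", "y"}, "u2": {"y", "z"}}
        print(len(greedy_chain_cover(g)))  # 2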
