26 research outputs found

    The Dynamics of Dynamic Variable Ordering Heuristics

    It has long been accepted that dynamic variable ordering heuristics outperform static orderings. But just how dynamic are dynamic variable ordering heuristics? This paper examines the behaviour of a number of heuristics, and attempts to measure the entropy of the search process at different depths in the search tree.
    1 Introduction
    Many studies have shown that dynamic variable ordering (dvo [9]) heuristics outperform static variable ordering heuristics. But just how dynamic are dynamic variable ordering heuristics? This might be important because if we discover that some dvo heuristic H1 results in less search effort than heuristic H2, and H1 is more dynamic than H2, then we might expect that we can make a further improvement by increasing the dynamism of H1. Conversely, if we discover that H1 is better and less dynamic, then we might plan to make H1 even more ponderous. But how do we measure the dynamism of a heuristic? To investigate this we first look inside the search proc..

    Number of frequent patterns in random databases

    In a tabular database, patterns which occur over a frequency threshold are called frequent patterns. They are central in numerous data processes, and various efficient algorithms were recently designed for mining them. Unfortunately, very little is known about the real difficulty of this mining, which is closely related to the number of frequent patterns. The worst-case analysis always leads to an exponential number of frequent patterns, but experiments show that algorithms become efficient for reasonable frequency thresholds. We perform here a probabilistic analysis of the number of frequent patterns. We first introduce a general model of random databases that encompasses all the previous classical models. In this model, the rows of the database are seen as independent words generated by the same probabilistic source [i.e. a random process that emits symbols]. Under natural conditions on the source, the average number of frequent patterns is studied for various frequency thresholds. Then, we exhibit a large class of sources, the class of dynamical sources, which is proven to satisfy our general conditions. This finally shows that our results hold in a quite general context of random databases.

    Average time analysis of clause order backtracking

    SIGLE / TIB: RO 9630(90020) / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technische Informationsbibliothek / DE / German

    Fast Estimation of the Pattern Frequency Spectrum

    Both exact and approximate counting of the number of frequent patterns for a given frequency threshold are hard problems. Still, having even coarse prior estimates of the number of patterns is useful, as these can be used to appropriately set the threshold and avoid waiting endlessly for an unmanageable number of patterns. Moreover, we argue that the number of patterns for different thresholds is an interesting summary statistic of the data: the pattern frequency spectrum. To enable fast estimation of the number of frequent patterns, we adapt the classical algorithm by Knuth for estimating the size of a search tree. Although the method is known to be theoretically suboptimal, we demonstrate that in practice it not only produces very accurate estimates, but is also very efficient. Moreover, we introduce a small variation that can be used to estimate the number of patterns under constraints for which the Apriori property does not hold. The empirical evaluation shows that this approach obtains good estimates for closed itemsets. Finally, we show how the method, together with isotonic regression, can be used to quickly and accurately estimate the pattern frequency spectrum: the curve that shows the number of patterns for every possible value of the frequency threshold. Comparing such a spectrum to one that was constructed using a random data model immediately reveals whether the dataset contains any structure of interest.
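
The idea of estimating the number of frequent patterns with Knuth's random-probe method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the search tree is laid out, so the prefix-ordered itemset tree, the function names (`support`, `knuth_probe`, `estimate_frequent`, `count_exact`) and the representation of transactions as Python sets are all assumptions made for illustration. Each probe walks from the empty itemset to a leaf, multiplying the branching factors it sees; the running sum of those products is an unbiased estimate of the tree size.

```python
import random
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def knuth_probe(transactions, items, minsup, rng):
    """One random root-to-leaf walk in the (assumed) prefix-ordered itemset tree.

    Returns an unbiased estimate of the number of frequent itemsets,
    counting the empty itemset at the root as one pattern.
    """
    current, start = frozenset(), 0
    weight, estimate = 1, 1  # the root contributes one node
    while True:
        # Children extend the current itemset with a later item only,
        # so each frequent itemset appears exactly once in the tree.
        children = [k for k in range(start, len(items))
                    if support(current | {items[k]}, transactions) >= minsup]
        if not children:
            return estimate
        weight *= len(children)   # product of branching factors so far
        estimate += weight        # estimated number of nodes at this depth
        k = rng.choice(children)  # descend into a uniformly random child
        current, start = current | {items[k]}, k + 1


def estimate_frequent(transactions, minsup, probes=1000, seed=0):
    """Average of independent probes (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))
    total = sum(knuth_probe(transactions, items, minsup, rng)
                for _ in range(probes))
    return total / probes


def count_exact(transactions, minsup):
    """Brute-force count, only feasible for tiny inputs; used as a check."""
    items = sorted(set().union(*transactions))
    return sum(1 for r in range(len(items) + 1)
               for c in combinations(items, r)
               if support(frozenset(c), transactions) >= minsup)
```

For three identical transactions `{"a", "b", "c"}` and `minsup=1`, every subset is frequent, so `count_exact` gives 8 (the empty itemset included), and `estimate_frequent` should land close to that value once enough probes are averaged. The sketch assumes `minsup` does not exceed the number of transactions, so that the root itself is frequent.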

    Solving SAT for CNF formulas with a one-sided restriction on variable occurrences

    In this paper we consider the class of Boolean formulas in Conjunctive Normal Form (CNF) where for each variable all but at most $d$ occurrences are either positive or negative. This class is a generalization of the class of CNF formulas with at most $d$ occurrences (positive and negative) of each variable which was studied in [Wahlström, 2005]. Applying complement search [Purdom, 1984], we show that for every $d$ there exists a constant $\gamma_d < 2 - \frac{1}{2d+1}$ such that satisfiability of a CNF formula on $n$ variables can be checked in runtime $O(\gamma_d^n)$ if all but at most $d$ occurrences of each variable are either positive or negative. We thoroughly analyze the proposed branching strategy and determine the asymptotic growth constant $\gamma_d$ more precisely. Finally, we show that the trivial $O(2^n)$ barrier of satisfiability checking can be broken even for a more general class of formulas, namely formulas where the positive or negative literals of every variable have what we will call a $d$-covering. To the best of our knowledge, for the considered classes of formulas there are no previous non-trivial upper bounds on the complexity of satisfiability checking.