124 research outputs found

    Transformational query solving

    What's So Special About Kruskal's Theorem and the Ordinal Γ₀? A Survey of Some Results in Proof Theory

    This paper consists primarily of a survey of results of Harvey Friedman about some proof-theoretic aspects of various forms of Kruskal's tree theorem, and in particular the connection with the ordinal Γ₀. We also include a fairly extensive treatment of normal functions on the countable ordinals, and we give a glimpse of Veblen hierarchies, some subsystems of second-order logic, slow-growing and fast-growing hierarchies including Girard's result, and Goodstein sequences. The central theme of this paper is a powerful theorem due to Kruskal, the tree theorem, as well as a finite miniaturization of Kruskal's theorem due to Harvey Friedman. These versions of Kruskal's theorem are remarkable from a proof-theoretic point of view because they are not provable in relatively strong logical systems. They are examples of so-called natural independence phenomena, which are considered by most logicians as more natural than the mathematical incompleteness results first discovered by Gödel. Kruskal's tree theorem also plays a fundamental role in computer science, because it is one of the main tools for showing that certain orderings on trees are well founded. These orderings play a crucial role in proving the termination of systems of rewrite rules and the correctness of Knuth-Bendix completion procedures. There is also a close connection between a certain infinite countable ordinal called Γ₀ and Kruskal's theorem. Previous definitions of the function involved in this connection are known to be incorrect, in that the function is not monotonic. We offer a repaired definition of this function, and explore briefly the consequences of its existence.
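
The Goodstein sequences mentioned in the abstract above can be illustrated with a short sketch. This is not taken from the paper; it follows the standard textbook definition (write the current value in hereditary base-b notation, replace every b by b + 1, subtract one, then move to the next base), and the function names `bump_base` and `goodstein` are chosen here purely for illustration.

```python
def bump_base(n, base):
    """Write n in hereditary base-`base` notation and replace `base` by `base + 1`."""
    result, exponent = 0, 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            # Exponents are themselves rewritten hereditarily.
            result += digit * (base + 1) ** bump_base(exponent, base)
        exponent += 1
    return result


def goodstein(start, max_terms):
    """Return up to `max_terms` terms of the Goodstein sequence beginning at `start`."""
    terms, value, base = [], start, 2
    while len(terms) < max_terms:
        terms.append(value)
        if value == 0:
            break  # the sequence has terminated
        value = bump_base(value, base) - 1
        base += 1
    return terms


print(goodstein(3, 10))  # [3, 3, 3, 2, 1, 0]
print(goodstein(4, 4))   # [4, 26, 41, 60]
```

The sequence starting at 3 reaches zero after a handful of steps, while the one starting at 4 grows for an astronomically long time before it terminates, which is why `max_terms` caps the output.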

    High Performance Sparse Multivariate Polynomials: Fundamental Data Structures and Algorithms

    Polynomials may be represented sparsely in an effort to conserve memory and provide a succinct and natural representation. Moreover, polynomials which are themselves sparse (that is, have very few non-zero terms) waste memory and computation time if represented, and operated on, densely. This waste is exacerbated as the number of variables increases. We provide practical implementations of sparse multivariate data structures focused on data locality and cache complexity. We look to develop high-performance algorithms and implementations of fundamental polynomial operations, using these sparse data structures, such as arithmetic (addition, subtraction, multiplication, and division) and interpolation. We revisit a sparse arithmetic scheme introduced by Johnson in 1974, adapting and optimizing these algorithms for modern computer architectures, with our implementations over the integers and rational numbers vastly outperforming the current widespread implementations. We develop a new algorithm for sparse pseudo-division based on the sparse polynomial division algorithm, with very encouraging results. Polynomial interpolation is explored through univariate, dense multivariate, and sparse multivariate methods. Arithmetic and interpolation together form a solid high-performance foundation from which many higher-level and more interesting algorithms can be built.
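
To make the idea of a sparse representation concrete, here is a minimal sketch that stores only the non-zero terms of a multivariate polynomial. It is an assumption-laden illustration, not the thesis's data structure and not Johnson's scheme: terms are kept in a plain dictionary from exponent tuples to coefficients, and `poly_add` and `poly_mul` are hypothetical names.

```python
from collections import defaultdict


def poly_add(p, q):
    """Add two sparse polynomials stored as {exponent_tuple: coefficient}."""
    result = defaultdict(int)
    for poly in (p, q):
        for exps, coeff in poly.items():
            result[exps] += coeff
    # Drop terms that cancelled so the result stays sparse.
    return {e: c for e, c in result.items() if c != 0}


def poly_mul(p, q):
    """Multiply two sparse polynomials term by term."""
    result = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            # Multiplying monomials adds their exponent vectors.
            exps = tuple(a + b for a, b in zip(e1, e2))
            result[exps] += c1 * c2
    return {e: c for e, c in result.items() if c != 0}


# Polynomials in two variables (x, y): x + y and x - y.
x_plus_y = {(1, 0): 1, (0, 1): 1}
x_minus_y = {(1, 0): 1, (0, 1): -1}

product = poly_mul(x_plus_y, x_minus_y)  # x^2 - y^2
print(product)
```

A dictionary keeps memory proportional to the number of non-zero terms, but it ignores the data locality and cache behaviour that the abstract emphasizes; a performance-oriented implementation would likely use sorted, contiguous term arrays instead.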

    Program Verification Using Polynomials Over Modular Arithmetic

    As program verification has matured as a discipline, distinct topics have emerged and then developed into thriving sub-disciplines, each with its own language and focus. In Satisfiability Modulo Theories (SMT) solving, the focus is on deciding the satisfiability of formulae over predicates (constraints) drawn from a background theory. If an SMT formula encodes the existence of a problematic path through a program, then a model of the formula will expose a fault, as demonstrated with a counter-example. In abstract interpretation, on the other hand, the objective is typically to infer invariants for a program so as to demonstrate the absence of a fault. These complementary sub-disciplines do not exist in silos competing against one another: one sub-discipline informs the other. This thesis illustrates how these sub-disciplines cross-fertilise in both directions, presenting a new abstract domain that draws on techniques from SMT solving, namely solving systems of symbolic equations (theory solving). One fundamental operation used in the domain construction applies a propagation technique that suggests how the satisfiability of SMT formulae can be reduced to deciding the satisfiability of a compact SAT instance. This leads to a new technique for SMT solving. Although the two were developed in tandem, for the sake of presentation the thesis first addresses the satisfiability of systems of polynomial equations over bit-vectors. Instead of conventional bit-blasting, we exploit word-level inference to translate these systems into non-linear pseudo-boolean constraints. We derive the pseudo-booleans by simulating bit assignments through the addition of (linear) polynomials and applying a strong form of propagation by computing Gröbner bases, which provide an analog of a triangular form for systems of polynomials. By handling bit assignments symbolically, the number of Gröbner basis calculations, along with the number of assignments, is reduced. The final Gröbner basis yields an assignment to the bit-vectors, expressed parametrically in terms of the symbolic bits, together with non-linear pseudo-boolean constraints on the symbolic variables, modulo a power of two. The pseudo-booleans can be solved by translation into classical linear pseudo-boolean constraints (without a modulo) or by encoding them as propositional formulae, for which a novel translation process is described. This aspect of the thesis has a practical bias. The dual theme of the thesis, on abstract domain construction, has a theoretical bias. The thesis presents MPAD, the modulo polynomial abstract domain, whose invariants are systems of polynomial equations that hold modulo 2^ω, where ω is the bit-width. MPAD systems over d variables symbolically represent sets of points in d-dimensional space as their solutions, and provide a way of representing and inferring polynomial invariants in the presence of wrap-around arithmetic. The domain operations of MPAD are computed using Gröbner bases, but are founded on a closure operation, mirroring a construction familiar in numeric abstraction. Given an input system of polynomials, and their associated solutions, closure derives a finite polynomial representation of all polynomials that satisfy these solutions. Closure is necessary for faithfully computing join and projection, operations that preserve it. Meet does not maintain closure, hence the need for an algorithm for computing it. Unlike conventional polynomial abstraction, MPAD satisfies the ascending chain condition, finessing the need for widening. It also remedies a disparity normally found in numeric abstraction, where equalities in guards are handled but disequalities are not: the structure of MPAD allows the addition of a single polynomial disequality to be re-expressed using closure and join. We demonstrate that MPAD can derive invariants necessary for verifying the correctness of algorithms which exploit integrality, and which were previously out of reach. As a whole, the thesis makes contributions to SMT solving and abstract interpretation, two complementary themes of program verification, both of which draw on common techniques from algebraic computation, namely Gröbner bases.

    A stochastic SIS epidemic model with heterogeneous contacts.

    A stochastic model for the spread of an SIS epidemic among a population consisting of N individuals, each having heterogeneous infectiousness and/or susceptibility, is considered and its behavior is analyzed under the practically relevant situation when N is small. The model is formulated as a finite time-homogeneous continuous-time Markov chain X. Based on an appropriate labeling of states, we first construct its infinitesimal rate matrix by using an iterative argument, and we then present an algorithmic procedure for computing steady-state measures, such as the number of infected individuals, the length of an outbreak, the maximum number of infectives, and the number of infections suffered by a marked individual during an outbreak. The time until the extinction of the epidemic is characterized as a phase-type random variable when there is no external source of infection, and its Laplace-Stieltjes transform and moments are derived in terms of a forward-elimination backward-substitution solution. The inverse iteration method is applied to the quasi-stationary distribution of X, which provides a good approximation of the process X at a certain time, conditional on non-extinction, after a suitable waiting time. The basic reproduction number R0 is defined here as a random variable, rather than an expected value.
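
The construction of an infinitesimal rate matrix for a small heterogeneous population might look like the following sketch. The abstract does not give the rates or the state labeling, so everything specific here is an assumption: a state records which individuals are infected, `beta[j][i]` is a hypothetical rate at which infected individual j infects susceptible individual i, `gamma[i]` is a hypothetical recovery rate, and there is no external source of infection.

```python
from itertools import product


def sis_rate_matrix(beta, gamma):
    """Build the rate matrix of an SIS chain over all 2^N infection patterns.

    beta[j][i]: assumed rate at which infected j infects susceptible i.
    gamma[i]:   assumed recovery rate of individual i.
    """
    n = len(gamma)
    states = list(product((0, 1), repeat=n))  # 1 = infected, 0 = susceptible
    index = {s: k for k, s in enumerate(states)}
    q = [[0.0] * len(states) for _ in states]
    for s in states:
        row = q[index[s]]
        for i in range(n):
            # Each transition changes the status of exactly one individual.
            flipped = s[:i] + (1 - s[i],) + s[i + 1:]
            if s[i]:
                rate = gamma[i]  # recovery of i
            else:
                rate = sum(beta[j][i] for j in range(n) if s[j])  # infection of i
            row[index[flipped]] = rate
        row[index[s]] = -sum(row)  # diagonal makes the row sum to zero
    return states, q


beta = [[0.0, 0.5], [0.3, 0.0]]
gamma = [1.0, 2.0]
states, q = sis_rate_matrix(beta, gamma)
print(states)
print(q[states.index((1, 0))])
```

With N individuals the state space has 2^N elements, which is consistent with the abstract's focus on small N. The all-susceptible state has a zero row, so it is absorbing and extinction times can be studied as absorption times.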

    Exact Tests via Complete Enumeration: A Distributed Computing Approach

    The analysis of categorical data often leads to the analysis of a contingency table. For large samples, asymptotic approximations are sufficient when calculating p-values, but for small samples the resulting tests can be unreliable. In these situations an exact test, which is based on the exact distribution of the test statistic, should be considered. Sampling techniques can be used to estimate the distribution. Alternatively, the distribution can be found by complete enumeration. A new algorithm is developed that enables a model to be defined by a model matrix, and all tables that satisfy the model are found. This provides a more efficient enumeration mechanism for complex models and extends the range of models that can be tested. The technique can lead to large calculations, so a distributed version of the algorithm is developed that enables a number of machines to work efficiently on the same problem.
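
Complete enumeration can be sketched for the simplest case, a 2x2 table tested for independence with fixed margins (Fisher's exact test). This is a swapped-in textbook instance, not the model-matrix algorithm or the distributed version described above, and the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def enumerate_tables(row_totals, col_totals):
    """Yield every 2x2 table of non-negative counts with the given margins."""
    r1, r2 = row_totals
    c1, _ = col_totals
    for a in range(max(0, c1 - r2), min(r1, c1) + 1):
        yield ((a, r1 - a), (c1 - a, r2 - (c1 - a)))


def table_probability(table):
    """Hypergeometric probability of a table given its margins."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(n, a + c))


def exact_p_value(observed):
    """Two-sided p-value: total probability of tables no likelier than the observed one."""
    (a, b), (c, d) = observed
    p_obs = table_probability(observed)
    tables = enumerate_tables((a + b, c + d), (a + c, b + d))
    return sum(
        (p for p in map(table_probability, tables) if p <= p_obs),
        Fraction(0),
    )


p = exact_p_value(((3, 1), (1, 3)))
print(p, float(p))  # 17/35, roughly 0.486
```

Exact rational arithmetic avoids floating-point ties when comparing table probabilities. For larger tables the number of tables grows quickly, which is the motivation the abstract gives for distributing the enumeration.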

    Fundamentals and applications of order dependencies

    Business-intelligence queries often involve SQL functions and algebraic expressions. There can be clear semantic relationships between a column's values and the values of a function over that column. A common property is monotonicity: as the column's values ascend, so do the function's values (or the other column's values). This we call an order dependency (OD). Queries can be evaluated more efficiently when the query optimizer uses order dependencies. They can be run even faster when the optimizer can also reason over known ODs to infer new ones. Order dependencies can be declared as integrity constraints, and they can be detected automatically for many types of SQL functions and algebraic expressions. We present optimization techniques using ODs for queries that involve join, order by, group by, partition by, and distinct. Essentially, ODs can further exploit interesting orders to eliminate or simplify potentially expensive sorts in the query plan. We evaluate these techniques over our prototype implementation in IBM® DB2® using the TPC-DS® benchmark schema and some customer-inspired queries. Our experimental results demonstrate a significant performance gain. Dependencies have played an important role in database theory. We study the theoretical aspects of order dependencies, and of unidirectional order dependencies (UODs), a proper sub-class of ODs, which describe the relationships among lexicographical orderings of sets of tuples. We investigate the inference problem for order dependencies. We establish the following: (i) a sound and complete axiomatization for UODs which is sound for ODs; (ii) a hierarchy of order dependency classes; (iii) a proof of co-NP-completeness of the inference problem for ODs and for the subclass of UODs; (iv) a proof of co-NP-completeness of the inference problem of functional dependencies (FDs) from ODs in general, together with a demonstration of linear time complexity for the inference of FDs from UODs; (v) a sound and complete elimination procedure for testing logical implication over ODs; and (vi) a sound and complete polynomial inference algorithm for sets of UODs over natural domains.
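
As a rough illustration of what it means for an order dependency to hold on a table, the sketch below checks one by brute force. It assumes the usual reading of the definition: a list of columns X orders a list Y if, for every pair of rows s and t, s preceding or tying t lexicographically on X implies the same on Y. The function name `od_holds` and the month/quarter data are made up for this example, and the quadratic pairwise check is only for clarity.

```python
from itertools import permutations


def od_holds(rows, lhs, rhs):
    """Check whether the column list `lhs` orders the column list `rhs` on `rows`."""
    def key(row, attrs):
        # Python compares tuples lexicographically, matching the OD definition.
        return tuple(row[a] for a in attrs)

    return all(
        key(s, rhs) <= key(t, rhs)
        for s, t in permutations(rows, 2)
        if key(s, lhs) <= key(t, lhs)
    )


rows = [
    {"month": 1, "quarter": 1},
    {"month": 4, "quarter": 2},
    {"month": 5, "quarter": 2},
    {"month": 11, "quarter": 4},
]

print(od_holds(rows, ["month"], ["quarter"]))  # True: quarter rises with month
print(od_holds(rows, ["quarter"], ["month"]))  # False: months 4 and 5 share a quarter
```

When month orders quarter, a result already sorted by month is also sorted by quarter, which is the kind of fact an optimizer could use to skip a sort.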