    Set-Oriented Mining for Association Rules in Relational Databases

    We describe set-oriented algorithms for mining association rules. Such algorithms require multiple joins and may therefore appear inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss the optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. SETM uses only simple database primitives, viz. sorting and merge-scan join. SETM is simple, fast and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black-box algorithms. The set-oriented nature of SETM facilitates the development of extensions.

    Computing Multi-Relational Sufficient Statistics for Large Databases

    Databases contain information about which relationships do and do not hold among entities. Making this information accessible for statistical analysis requires computing sufficient statistics that combine information from different database tables. Such statistics may involve any number of positive and negative relationships. With a naive enumeration approach, computing sufficient statistics for negative relationships is feasible only for small databases. We solve this problem with a new dynamic programming algorithm that performs a virtual join, where the requisite counts are computed without materializing join tables. Contingency table algebra is a new extension of relational algebra that facilitates the efficient implementation of this Möbius virtual join operation. The Möbius Join scales to large datasets (over 1M tuples) with complex schemas. Empirical evaluation with seven benchmark datasets showed that information about the presence and absence of links can be exploited in feature selection, association rule mining, and Bayesian network learning.
    Comment: 11 pages, 8 figures, 8 tables, CIKM'14, November 3-7, 2014, Shanghai, China

    Adjoint exactness

    Plato's ideas and Aristotle's real types from the classical age, Nominalism and Realism of the mediaeval period and Whitehead's modern view of the world as process all come together in the formal representation by category theory of exactness in adjointness (⊣). Concepts of exactness and co-exactness arise naturally from adjointness and are needed in current global problems of science. If a right co-exact valued left-adjoint functor ( ) in a cartesian closed category has a right-adjoint left-exact functor ( ), then physical stability is satisfied if itself is also a right co-exact left-adjoint functor for the right-adjoint left-exact functor ( ): ⊣ ⊣ . These concepts are discussed here with examples in nuclear fusion, in database interrogation and in the cosmological fine structure constant by the Frederick construction