5 research outputs found

    Object-oriented data mining

    EThOS - Electronic Theses Online Service (GB, United Kingdom)

    Better Rulesets by Removing Redundant Specialisations and Generalisations in Association Rule Mining

    Association rule mining is a fundamental task in many data mining and analysis applications, both for knowledge extraction and as part of other processes (for example, building associative classifiers). It is well known that the number of associations identified by many association rule mining algorithms can be so large as to present a barrier to their interpretability and practical use. A typical solution to this problem involves removing redundant rules. This paper proposes a novel definition of redundancy, which is used to identify only the most interesting associations. Compared to existing redundancy-based approaches, our method is more robust to noise and produces fewer rules overall for a given dataset, which improves clarity. A rule can be considered redundant if the knowledge it describes is already contained in other rules. Given an association rule, most existing approaches consider rules to be redundant if they add additional variables without increasing quality according to some measure of interestingness. We claim that complex interactions between variables can confound many interestingness measures. This can lead to existing approaches being overly aggressive in removing redundant associations. Most existing approaches also fail to take into account situations where more general rules (those with fewer attributes) can be considered redundant with respect to their specialisations. We examine this problem and provide concrete examples of such errors using artificial data. An alternative definition of redundancy that addresses these issues is proposed. Our approach is shown to identify interesting associations missed by comparable methods on multiple real and synthetic datasets. When combined with the removal of redundant generalisations, our approach is often able to generate smaller overall rule sets, while leaving average rule quality unaffected or slightly improved.
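
    The conventional redundancy test that the abstract attributes to existing approaches (a rule is redundant if it adds variables without increasing quality) can be sketched as below. This is only an illustration of that baseline, not the definition the paper proposes. Using confidence as the interestingness measure, the toy transactions, and the function names are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical toy transactions; the item names are invented for illustration.
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"jam", "milk"},
    {"bread", "butter", "milk"},
]


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing the antecedent that also contain the consequent."""
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        return 0.0
    return sum(1 for t in covered if consequent in t) / len(covered)


def is_redundant_specialisation(antecedent, consequent, transactions):
    """Baseline test (assumed measure: confidence).

    The rule antecedent -> consequent is flagged as redundant when some
    proper, non-empty subset of the antecedent is at least as confident.
    """
    own = confidence(antecedent, consequent, transactions)
    for size in range(1, len(antecedent)):
        for subset in combinations(sorted(antecedent), size):
            general = confidence(frozenset(subset), consequent, transactions)
            if general >= own:
                return True
    return False


if __name__ == "__main__":
    for items in ({"bread", "milk"}, {"bread", "jam"}):
        rule = frozenset(items)
        verdict = is_redundant_specialisation(rule, "butter", TRANSACTIONS)
        print(sorted(rule), "-> butter redundant:", verdict)
```

    On this toy data, adding "milk" to the rule bread -> butter lowers confidence, so the longer rule is flagged as redundant, while adding "jam" raises it, so that specialisation is kept. A single-measure test of this kind is what the abstract argues can be confounded by interactions between variables.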

    Combined decision procedures for nonlinear arithmetics, real and complex

    We describe contributions to algorithmic proof techniques for deciding the satisfiability of boolean combinations of many-variable nonlinear polynomial equations and inequalities over the real and complex numbers. In the first half, we present an abstract theory of Gröbner basis construction algorithms for algebraically closed fields of characteristic zero and use it to introduce and prove the correctness of Gröbner basis methods tailored to the needs of modern satisfiability modulo theories (SMT) solvers. In the process, we use the technique of proof orders to derive a generalisation of S-polynomial superfluousness in terms of transfinite induction along an ordinal parameterised by a monomial order. We use this generalisation to prove the abstract (“strategy-independent”) admissibility of a number of superfluous S-polynomial criteria important for efficient basis construction. Finally, we consider local notions of proof minimality for weak Nullstellensatz proofs and give ideal-theoretic methods for computing complex “unsatisfiable cores” which contribute to efficient SMT solving in the context of nonlinear complex arithmetic. In the second half, we consider the problem of effectively combining a heterogeneous collection of decision techniques for fragments of the existential theory of real closed fields. We propose and investigate a number of novel combined decision methods and implement them in our proof tool RAHD (Real Algebra in High Dimensions). We build a hierarchy of increasingly powerful combined decision methods, culminating in a generalisation of partial cylindrical algebraic decomposition (CAD) which we call Abstract Partial CAD. This generalisation incorporates the use of arbitrary sound but possibly incomplete proof procedures for the existential theory of real closed fields as first-class functional parameters for “short-circuiting” expensive computations during the lifting phase of CAD. Identifying these proof procedure parameters formally with RAHD proof strategies, we implement the method in RAHD for the case of full-dimensional cell decompositions and investigate its efficacy with respect to the Brown-McCallum projection operator. We end with some wishes for the future.

    Studies related to the process of program development

    The submitted work consists of a collection of publications arising from research carried out at Rhodes University (1970-1980) and at Heriot-Watt University (1980-1992). The theme of this research is the process of program development, i.e. the process of creating a computer program to solve some particular problem. The papers presented cover a number of different topics which relate to this process, viz. (a) Programming methodology: aspects of structured programming. (b) Properties of programming languages. (c) Formal specification of programming languages. (d) Compiler techniques. (e) Declarative programming languages. (f) Program development aids. (g) Automatic program generation. (h) Databases. (i) Algorithms and applications.