6,036 research outputs found

    A survey of cost-sensitive decision tree induction algorithms

    Get PDF
    The past decade has seen a significant interest on the problem of inducing decision trees that take account of costs of misclassification and costs of acquiring the features used for decision making. This survey identifies over 50 algorithms including approaches that are direct adaptations of accuracy based methods, use genetic algorithms, use anytime methods and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy, a historical timeline of how the field has developed and should provide a useful reference point for future research in this field

    A Delta Debugger for ILP Query Execution

    Full text link
    Because query execution is the most crucial part of Inductive Logic Programming (ILP) algorithms, a lot of effort is invested in developing faster execution mechanisms. These execution mechanisms typically have a low-level implementation, making them hard to debug. Moreover, other factors such as the complexity of the problems handled by ILP algorithms and size of the code base of ILP data mining systems make debugging at this level a very difficult job. In this work, we present the trace-based debugging approach currently used in the development of new execution mechanisms in hipP, the engine underlying the ACE Data Mining system. This debugger uses the delta debugging algorithm to automatically reduce the total time needed to expose bugs in ILP execution, thus making manual debugging step much lighter.Comment: Paper presented at the 16th Workshop on Logic-based Methods in Programming Environments (WLPE2006

    Efficient Generation of Craig Interpolants in Satisfiability Modulo Theories

    Full text link
    The problem of computing Craig Interpolants has recently received a lot of interest. In this paper, we address the problem of efficient generation of interpolants for some important fragments of first order logic, which are amenable for effective decision procedures, called Satisfiability Modulo Theory solvers. We make the following contributions. First, we provide interpolation procedures for several basic theories of interest: the theories of linear arithmetic over the rationals, difference logic over rationals and integers, and UTVPI over rationals and integers. Second, we define a novel approach to interpolate combinations of theories, that applies to the Delayed Theory Combination approach. Efficiency is ensured by the fact that the proposed interpolation algorithms extend state of the art algorithms for Satisfiability Modulo Theories. Our experimental evaluation shows that the MathSAT SMT solver can produce interpolants with minor overhead in search, and much more efficiently than other competitor solvers.Comment: submitted to ACM Transactions on Computational Logic (TOCL

    Combined optimization of feature selection and algorithm parameters in machine learning of language

    Get PDF
    Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing (i) to investigate which machine learning algorithms have the 'right bias' to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often not be reliable because of the role of and interaction between feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach, and more reliable comparisons

    JWalk: a tool for lazy, systematic testing of java classes by design introspection and user interaction

    Get PDF
    Popular software testing tools, such as JUnit, allow frequent retesting of modified code; yet the manually created test scripts are often seriously incomplete. A unit-testing tool called JWalk has therefore been developed to address the need for systematic unit testing within the context of agile methods. The tool operates directly on the compiled code for Java classes and uses a new lazy method for inducing the changing design of a class on the fly. This is achieved partly through introspection, using Java’s reflection capability, and partly through interaction with the user, constructing and saving test oracles on the fly. Predictive rules reduce the number of oracle values that must be confirmed by the tester. Without human intervention, JWalk performs bounded exhaustive exploration of the class’s method protocols and may be directed to explore the space of algebraic constructions, or the intended design state-space of the tested class. With some human interaction, JWalk performs up to the equivalent of fully automated state-based testing, from a specification that was acquired incrementally
    • …
    corecore