
    On Cognitive Preferences and the Plausibility of Rule-based Models

    It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likeliness that a user accepts it as an explanation for a prediction. In particular, we argue that, all other things being equal, longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowd-sourcing study based on about 3,000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representativeness heuristic, or the recognition heuristic, and investigate their relation to rule length and plausibility.
    Comment: V4: Another rewrite of the section on interpretability to clarify the focus on plausibility and its relation to interpretability, comprehensibility, and justifiability.

    Heuristics for high-utility local process model mining

    Local Process Models (LPMs) describe structured fragments of process behavior occurring in the context of less structured business processes. In contrast to traditional support-based LPM discovery, which aims to generate a collection of process models that describe highly frequent behavior, High-Utility Local Process Model (HU-LPM) discovery aims to generate a collection of process models that provide useful business insights, by specifying a utility function. Mining LPMs is a computationally expensive task because of the large search space of LPMs. In support-based LPM mining, the search space is constrained by making use of the property that support is anti-monotonic. We show that, in general, a provided utility function cannot be assumed to be anti-monotonic; therefore, the search space of HU-LPMs cannot be reduced without loss. We propose four heuristic methods to speed up the mining of HU-LPMs while still being able to discover useful HU-LPMs. We demonstrate their applicability on three real-life data sets.
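    To make the role of anti-monotonicity concrete, the following minimal Python sketch prunes a level-wise pattern search using the fact that support can only shrink as a pattern grows. It illustrates the pruning principle only and is not the authors' algorithm; the itemset-style patterns, the toy event log, and the min_support threshold are assumptions made purely for the example.

# Minimal sketch of anti-monotone pruning (illustrative only, not the HU-LPM algorithm).
# Toy event log: each trace is reduced to the set of activities it contains
# (an assumption made purely for this example).
traces = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b", "c"},
]

def support(pattern):
    """Fraction of traces containing every activity in the pattern."""
    return sum(pattern <= trace for trace in traces) / len(traces)

def mine(min_support=0.5):
    """Level-wise search: because support is anti-monotonic, a pattern below the
    threshold can be discarded together with all of its extensions."""
    activities = sorted(set().union(*traces))
    frontier = [frozenset({a}) for a in activities]
    frequent = []
    while frontier:
        survivors = [p for p in frontier if support(p) >= min_support]  # pruning step
        frequent.extend(survivors)
        # Only survivors are extended; every superset of a pruned pattern is
        # guaranteed to be infrequent as well, so it is never generated.
        frontier = list({p | {a} for p in survivors for a in activities if a not in p})
    return frequent

if __name__ == "__main__":
    for pattern in mine():
        print(sorted(pattern), support(pattern))

    With a utility function that is not anti-monotonic, the pruning step above would no longer be safe, which is why heuristic methods are needed instead.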

    Learning optimization models in the presence of unknown relations

    In a sequential auction with multiple bidding agents, it is highly challenging to determine the ordering of the items to sell that maximizes the revenue, because the autonomy and private information of the agents heavily influence the outcome of the auction. The main contribution of this paper is two-fold. First, we demonstrate how to apply machine learning techniques to solve the optimal ordering problem in sequential auctions. We learn regression models from historical auctions, which are subsequently used to predict the expected value of orderings for new auctions. Given the learned models, we propose two types of optimization methods: a black-box best-first search approach, and a novel white-box approach that maps learned models to integer linear programs (ILPs), which can then be solved by any ILP solver. Although the studied auction design problem is hard, our proposed optimization methods obtain good orderings with high revenues. Our second main contribution is the insight that the internal structure of regression models can be efficiently evaluated inside an ILP solver for optimization purposes. To this end, we provide efficient encodings of regression trees and linear regression models as ILP constraints. This new way of using learned models for optimization is promising. As the experimental results show, it significantly outperforms the black-box best-first search in nearly all settings.
    Comment: 37 pages. Working paper.
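    As a rough illustration of how a learned model's structure can be turned into ILP constraints, the sketch below encodes a tiny two-leaf regression tree with the PuLP modelling library and asks the solver for the input that maximizes the predicted value. The toy tree, the big-M constant, and the use of PuLP are assumptions made for the example; the paper's actual encodings are more general.

# Illustrative sketch: encoding a toy regression tree as ILP constraints (not the
# paper's exact encoding). Assumes the PuLP library and its bundled CBC solver.
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary

# Toy tree over a single feature x in [0, 10]: if x <= 4 predict 2.0, else predict 7.0.
SPLIT, LEFT_VAL, RIGHT_VAL = 4.0, 2.0, 7.0
M = 100.0  # big-M constant, large enough to cover x's range

prob = LpProblem("maximise_tree_prediction", LpMaximize)
x = LpVariable("x", lowBound=0, upBound=10)
left = LpVariable("leaf_left", cat=LpBinary)    # 1 iff the left leaf is active
right = LpVariable("leaf_right", cat=LpBinary)  # 1 iff the right leaf is active
y = LpVariable("prediction")                    # the tree's output, as seen by the solver

prob += left + right == 1                         # exactly one leaf is active
prob += x <= SPLIT + M * (1 - left)               # left active  => x <= SPLIT
prob += x >= SPLIT - M * (1 - right)              # right active => x >= SPLIT (boundary handled loosely)
prob += y == LEFT_VAL * left + RIGHT_VAL * right  # prediction equals the active leaf's value

prob += y                                         # objective: maximize the predicted value
prob.solve()
print("x =", x.value(), "predicted y =", y.value())

    A linear regression model is even simpler to embed, since its prediction is already a linear expression over the feature variables.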

    Cooperation between expert knowledge and data mining discovered knowledge: Lessons learned

    Expert systems are built from knowledge traditionally elicited from the human expert. It is precisely knowledge elicitation from the expert that is the bottleneck in expert system construction. On the other hand, a data mining system, which automatically extracts knowledge, needs expert guidance on the successive decisions to be made in each of the system's phases. In this context, expert knowledge and data mining discovered knowledge can cooperate, maximizing their individual capabilities: data mining discovered knowledge can be used as a complementary source of knowledge for the expert system, whereas expert knowledge can be used to guide the data mining process. This article summarizes different examples of systems in which expert knowledge and data mining discovered knowledge cooperate, and reports our experience of such cooperation gathered from a medical diagnosis project, Intelligent Interpretation of Isokinetics Data, which we developed. From that experience, a series of lessons was learned throughout project development. Some of these lessons are generally applicable, and others pertain exclusively to certain project types.

    GraphCombEx: A Software Tool for Exploration of Combinatorial Optimisation Properties of Large Graphs

    We present a prototype of a software tool for the exploration of multiple combinatorial optimisation problems in large real-world and synthetic complex networks. Our tool, called GraphCombEx (an acronym of Graph Combinatorial Explorer), provides a unified framework for the scalable computation and presentation of high-quality suboptimal solutions and bounds for a number of widely studied combinatorial optimisation problems. Efficient representation and applicability to large-scale graphs and complex networks are particularly considered in its design. The problems currently supported include maximum clique, graph colouring, maximum independent set, minimum vertex clique covering, minimum dominating set, as well as the longest simple cycle problem. Suboptimal solutions and intervals for the optimal objective values are estimated using scalable heuristics. The tool is designed with extensibility in mind, with a view to adding further problems and both new fast and new high-performance heuristics in the future. GraphCombEx has already been successfully used as a support tool in a number of recent research studies that use combinatorial optimisation to analyse complex networks, indicating its promise as a research software tool.
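    To give a flavour of the kind of scalable heuristic bounds such a tool reports, the short Python sketch below computes an interval for the chromatic number of a graph with networkx; it is an illustration only and is not part of GraphCombEx. The random test graph, the greedy colouring strategy, and the heuristic clique search are choices made for the example.

# Illustrative sketch (not GraphCombEx): a heuristic interval for the chromatic number.
import networkx as nx
from networkx.algorithms import approximation

# A random graph stands in for a large real-world network (assumption for the example).
G = nx.erdos_renyi_graph(n=200, p=0.05, seed=42)

# Greedy colouring in descending-degree order: a fast heuristic whose number of
# colours is an upper bound on the chromatic number.
colouring = nx.greedy_color(G, strategy="largest_first")
upper_bound = max(colouring.values()) + 1

# A heuristically found large clique gives a lower bound, since the vertices of a
# clique all require pairwise distinct colours.
lower_bound = approximation.large_clique_size(G)

print(f"chromatic number lies in [{lower_bound}, {upper_bound}]")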