Interpretable multiclass classification by MDL-based rule lists
Interpretable classifiers have recently witnessed an increase in attention
from the data mining community because they are inherently easier to understand
and explain than their more complex counterparts. Examples of interpretable
classification models include decision trees, rule sets, and rule lists.
Learning such models often involves optimizing hyperparameters, which typically
requires substantial amounts of data and may result in relatively large models.
In this paper, we consider the problem of learning compact yet accurate
probabilistic rule lists for multiclass classification. Specifically, we
propose a novel formalization based on probabilistic rule lists and the minimum
description length (MDL) principle. This results in virtually parameter-free
model selection that naturally trades off model complexity against goodness of
fit, effectively avoiding both overfitting and the need for hyperparameter
tuning. Finally, we introduce the Classy algorithm, which
greedily finds rule lists according to the proposed criterion. We empirically
demonstrate that Classy selects small probabilistic rule lists that outperform
state-of-the-art classifiers when it comes to the combination of predictive
performance and interpretability. We show that Classy is insensitive to its
only parameter, i.e., the candidate set, and that compression on the training
set correlates with classification performance, validating our MDL-based
selection criterion.
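The two-part MDL score described above can be sketched in miniature. The following is a hedged toy, not the authors' Classy implementation: the model cost is a stand-in constant per rule, the data cost is the plug-in code length of each rule's cover, and the function names, data layout, and candidate set are all invented for illustration.

```python
import math
from collections import Counter

def bits(counts):
    # plug-in code length: negative log-likelihood (in bits) of the
    # labels under the cover's empirical class distribution
    n = sum(counts.values())
    return -sum(c * math.log2(c / n) for c in counts.values() if c)

def total_dl(rules, data, model_bits_per_rule=4.0):
    # two-part MDL score: model cost (a stand-in constant per rule)
    # plus data cost; each instance is claimed by the first rule
    # whose condition (a feature set) it satisfies
    remaining, cost = list(data), model_bits_per_rule * len(rules)
    for cond in rules:
        cover = Counter(y for x, y in remaining if cond <= x)
        remaining = [(x, y) for x, y in remaining if not cond <= x]
        cost += bits(cover)
    return cost + bits(Counter(y for _, y in remaining))  # default rule

def greedy_classy(data, candidates):
    # greedy search in the spirit of Classy: keep appending the
    # candidate that most reduces the total description length
    rules = []
    while True:
        best = min(candidates, key=lambda c: total_dl(rules + [c], data))
        if total_dl(rules + [best], data) >= total_dl(rules, data):
            return rules
        rules.append(best)

# Toy data: feature "b" perfectly separates the two classes.
data = [(frozenset({"a"}), 0)] * 10 + [(frozenset({"a", "b"}), 1)] * 10
candidates = [frozenset({"a"}), frozenset({"b"})]
rules = greedy_classy(data, candidates)  # selects the single rule on "b"
```

Because the rule on "b" compresses the labels far better than it costs to encode, the greedy search keeps it and stops; rules that add model cost without improving compression are rejected, which is how the criterion avoids overfitting without tuned hyperparameters.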
Hyper-heuristic decision tree induction
A hyper-heuristic is any algorithm that searches or operates in the space of
heuristics as opposed to the space of solutions. Hyper-heuristics are
increasingly used in function and combinatorial optimization. Rather than
attempt to solve a problem using a fixed heuristic, a hyper-heuristic
approach attempts to find a combination of heuristics that solve a problem
(and in turn may be directly suitable for a class of problem instances).
Hyper-heuristics have been little explored in data mining. This work presents
novel hyper-heuristic approaches to data mining, by searching a space of
attribute selection criteria for a decision tree building algorithm. The search is
conducted by a genetic algorithm. The result of the hyper-heuristic search in
this case is a strategy for selecting attributes while building decision trees.
Most hyper-heuristics work by trying to adapt the heuristic to the state of
the problem being solved. Our hyper-heuristic is no different. It employs a
strategy for adapting the heuristic used to build decision tree nodes
according to some set of features of the training set it is working on. We
introduce, explore and evaluate five different ways in which this problem
state can be represented for a hyper-heuristic that operates within a decision
tree building algorithm. In each case, the hyper-heuristic is guided by a rule
set that tries to map features of the data set to be split by the decision tree
building algorithm to a heuristic to be used for splitting the same data set.
We also explore and evaluate three different sets of low-level heuristics that
could be employed by such a hyper-heuristic.
This work also makes a distinction between specialist hyper-heuristics and
generalist hyper-heuristics. The main difference between the two is the number
of training sets used by the hyper-heuristic genetic
algorithm. Specialist hyper-heuristics are created using a single data set from
a particular domain for evolving the hyper-heuristic rule set. Such algorithms
are expected to outperform standard algorithms on the kind of data set used
by the hyper-heuristic genetic algorithm. Generalist hyper-heuristics are
trained on multiple data sets from different domains and are expected to
deliver a robust and competitive performance over these data sets when
compared to standard algorithms.
We evaluate both approaches for each kind of hyper-heuristic presented in
this thesis, using both real and synthetic data sets. Our
results suggest that none of the hyper-heuristics presented in this work are
suited for specialization – in most cases, the hyper-heuristic’s performance on
the data set it was specialized for was not significantly better than that of
the best performing standard algorithm. On the other hand, the generalist
hyper-heuristics delivered results that were very competitive to the best
standard methods. In some cases we even achieved a significantly better
overall performance than all of the standard methods.
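The search loop described above, a genetic algorithm over strategies that map problem-state features to low-level heuristics, can be sketched schematically. This is not the thesis' algorithm: the two problem-state features, the stand-in fitness (agreement with a hypothetical oracle), and all names are invented; a real implementation would score each strategy by the accuracy of the decision trees it induces.

```python
import random

HEURISTICS = ["info_gain", "gain_ratio", "gini"]  # low-level split heuristics

def problem_state(n_rows, minority_frac):
    # two invented problem-state features: is the training set large,
    # and is it heavily class-imbalanced?
    return (n_rows >= 500, minority_frac < 0.25)

STATES = [(a, b) for a in (False, True) for b in (False, True)]

def fitness(strategy, tasks, oracle):
    # stand-in fitness: fraction of tasks on which the strategy picks
    # the heuristic a hypothetical oracle deems best for that state;
    # a real hyper-heuristic would grow trees and measure accuracy
    hits = sum(strategy[problem_state(*t)] == oracle[problem_state(*t)]
               for t in tasks)
    return hits / len(tasks)

def evolve(tasks, oracle, pop=30, gens=40, seed=0):
    # genetic search over strategies (state -> heuristic mappings)
    rng = random.Random(seed)
    population = [{s: rng.choice(HEURISTICS) for s in STATES}
                  for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda s: fitness(s, tasks, oracle), reverse=True)
        parents = population[: pop // 2]          # truncation selection
        children = []
        while len(children) < pop - len(parents):
            a, b = rng.sample(parents, 2)
            child = {s: rng.choice((a[s], b[s])) for s in STATES}  # crossover
            if rng.random() < 0.2:                                 # mutation
                child[rng.choice(STATES)] = rng.choice(HEURISTICS)
            children.append(child)
        population = parents + children           # parents survive (elitism)
    return max(population, key=lambda s: fitness(s, tasks, oracle))

# One training task per problem state, plus a hypothetical oracle.
oracle = {(False, False): "gini", (False, True): "gain_ratio",
          (True, False): "info_gain", (True, True): "gain_ratio"}
tasks = [(100, 0.5), (100, 0.1), (1000, 0.5), (1000, 0.1)]
best = evolve(tasks, oracle)
```

Training `evolve` on tasks from one domain corresponds to the specialist setting above; pooling tasks from several domains corresponds to the generalist setting.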
Separation of pulsar signals from noise with supervised machine learning algorithms
We evaluate the performance of four different machine learning (ML)
algorithms: an Artificial Neural Network Multi-Layer Perceptron (ANN MLP),
AdaBoost, Gradient Boosting Classifier (GBC), and XGBoost, for the separation
of pulsars from radio frequency interference (RFI) and other sources of noise,
using a dataset obtained from the post-processing of a pulsar search pipeline.
This dataset was previously used for cross-validation of the SPINN-based
machine learning engine, used for the reprocessing of HTRU-S survey data
arXiv:1406.3627. We have used Synthetic Minority Over-sampling Technique
(SMOTE) to deal with high class imbalance in the dataset. We report a variety
of quality scores from all four of these algorithms on both the non-SMOTE and
SMOTE datasets. For all the above ML methods, we report high accuracy and
G-mean in both the non-SMOTE and SMOTE cases. We study the feature importances
using AdaBoost, GBC, and XGBoost, and also from the minimum Redundancy
Maximum Relevance approach, to report algorithm-agnostic feature ranking. From
these methods, we find the signal-to-noise ratio of the folded profile to be
the best feature. We find that all the ML algorithms report FPRs about an order
of magnitude lower than the corresponding FPRs obtained in arXiv:1406.3627, for
the same recall value.
Comment: 14 pages, 2 figures. Accepted for publication in Astronomy and Computing.
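SMOTE, used above to counter class imbalance, generates synthetic minority samples by interpolating between a minority instance and one of its k nearest minority-class neighbours. Below is a minimal NumPy sketch of that idea; the helper name and defaults are illustrative, and production work would use imbalanced-learn's `SMOTE` instead.

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Minimal SMOTE sketch: each synthetic point lies on the line
    segment between a minority sample and one of its k nearest
    minority-class neighbours."""
    rng = np.random.default_rng(seed)
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)            # a point is not its own neighbour
    nn = np.argsort(dist, axis=1)[:, :k]      # k nearest neighbours per sample
    base = rng.integers(0, len(X_min), n_new)     # random base samples
    nbr = nn[base, rng.integers(0, k, n_new)]     # one random neighbour each
    lam = rng.random((n_new, 1))                  # interpolation factor in [0, 1)
    return X_min[base] + lam * (X_min[nbr] - X_min[base])

# Oversample a toy 20-point minority class up to 100 points.
rng = np.random.default_rng(1)
X_minority = rng.normal(size=(20, 4))
X_synth = smote(X_minority, 80, k=3)
```

Because every synthetic point is a convex combination of two minority samples, the oversampled class stays inside the region the minority class already occupies rather than duplicating points exactly.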
Social Dilemmas, Revisited from a Heuristics Perspective
The standard tool for analysing social dilemmas is game theory, which reconstructs them as prisoner's dilemma games. This is helpful for understanding the incentive structure. Yet this analysis rests on the classic homo oeconomicus assumptions, which are misleading in many real-world dilemma situations. A case in point is the contribution of households to climate change. Decisions about using cars instead of public transport, or about extensive air conditioning, are typically not based on ad hoc calculation. Rather, individuals rely on situational heuristics for the purpose. This paper does two things: first, it offers a model of heuristics, in the interest of making behaviour guided by heuristics comparable to behaviour based on rational reasoning; second, based on this model, it determines the implications for the definition of social dilemmas. In some contexts, the social dilemma vanishes. In other contexts, it must be understood, and hence solved, in substantially different ways.
Keywords: Heuristic, Social Dilemma, Public Good, Prisoner's Dilemma
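The prisoner's dilemma structure invoked above can be made concrete with a toy payoff table; the numerical values are hypothetical, chosen only to satisfy the canonical ordering T > R > P > S.

```python
# Prisoner's dilemma payoffs for the row player ("C" cooperate, "D" defect),
# with the defining ordering T > R > P > S (these numbers are hypothetical).
T, R, P, S = 5, 3, 1, 0
payoff = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

def best_response(opponent_move):
    # the row move with the higher payoff against a fixed column move
    return max("CD", key=lambda m: payoff[(m, opponent_move)])

# Defection is the best response to either move (a dominant strategy)...
dominant = best_response("C") == "D" and best_response("D") == "D"
# ...yet mutual cooperation pays more than mutual defection: the dilemma.
dilemma = payoff[("C", "C")] > payoff[("D", "D")]
```

This is exactly the incentive structure the paper starts from; its argument is that agents using situational heuristics may not perceive or act on these payoffs at all, which is why the dilemma can vanish or change shape.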
Optimal inference with suboptimal models: Addiction and active Bayesian inference
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent's beliefs - based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment - as opposed to the agent's beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less 'optimally' than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject's generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described 'limited offer' task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work.
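The distinction drawn here, optimal inference relative to the agent's own generative model rather than the true environment, can be illustrated with a deliberately simple conjugate example. This is not the paper's active-inference model; the coin setting and Beta priors are invented for illustration.

```python
# Two agents see the same coin flips and each performs exact Bayesian
# updating of a Beta prior over the heads probability. Both updates are
# optimal with respect to the agent's own generative model; only the
# second agent's strong prior makes its predictions look pathological
# against the true coin.

def posterior_mean(prior_a, prior_b, heads, tails):
    # conjugate Beta-Bernoulli update: Beta(a, b) -> Beta(a + h, b + t)
    return (prior_a + heads) / (prior_a + heads + prior_b + tails)

heads, tails = 3, 7                            # flips from a tail-heavy coin
flat = posterior_mean(1, 1, heads, tails)      # weakly informed agent
biased = posterior_mean(50, 1, heads, tails)   # strong prior belief in heads
```

Both agents apply the same exact inference rule; the second's near-0.87 prediction despite mostly-tails data is "suboptimal" only from outside its model, which mirrors the paper's reading of addictive choice as optimal inference under an aberrant prior.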
After the Great Recession: Law and Economics' Topics of Invention and Arrangement and Tropes of Style
by Michael D. Murray
Abstract
The Great Recession of 2008 and onward has drawn attention to the American economic and financial system, and has cast a critical spotlight on the theories, policies, and assumptions of the modern, neoclassical school of law and economics—often labeled the Chicago School—because this school of legal economic thought has had great influence on the American economy and financial system. The Chicago School's positions on deregulation and the limitation or elimination of oversight and government restraints on stock markets, derivative markets, and other financial practices are the result of decades of neoclassical economic assumptions regarding the efficiency of unregulated markets, the near-religious devotion to a hyper-simplified conception of rationality and self-interest with regard to the persons and institutions participating in the financial system, and a conception of laws and government policies as incentives and costs in a manner that excludes the actual conditions and complications of reality.
This Article joins the critical conversation on the Great Recession and the role of law and economics in this crisis by examining neoclassical and contemporary law and economics from the perspective of legal rhetoric. Law and economics has developed into a school of contemporary legal rhetoric that provides topics of invention and arrangement and tropes of style to test and improve general legal discourse in areas beyond the economic analysis of law. The rhetorical canons of law and economics—mathematical and scientific methods of analysis and demonstration; the characterization of legal phenomena as incentives and costs; the rhetorical economic concept of efficiency; and rational choice theory as corrected by modern behavioral social sciences, cognitive studies, and brain science—make law and economics a persuasive method of legal analysis and a powerful school of contemporary legal rhetoric, if used in the right hands.
My Article is the first to examine the prescriptive implications of the rhetoric of law and economics for general legal discourse, as opposed to examining the benefits and limitations of the economic analysis of law itself. This Article advances the conversation in two areas: first, as to the study and understanding of the persuasiveness of law and economics, particularly because that persuasiveness has played a role in influencing American economic and financial policy leading up to the Great Recession; and second, as to the study and understanding of the use of economic topics of invention and arrangement and tropes of style in general legal discourse when evaluated in comparison to the other schools of classical and contemporary legal rhetoric. I examine each of the rhetorical canons of law and economics and explain how each can be used to create meaning, inspire imagination, and improve the persuasiveness of legal discourse in every area of law. My conclusion is that the rhetorical canons of law and economics can be used to create meaning and inspire imagination in legal discourse beyond the economic analysis of law, but the canons are tools that are only as good as their user, and can be corrupted in ways that helped to bring about the current economic crisis.