Discovering Regression and Classification Rules with Monotonic Constraints Using Ant Colony Optimization
Data mining is a broad area that encompasses many different tasks, from the supervised classification and regression tasks to unsupervised association rule mining and clustering. A first research thread in this thesis is the introduction of new Ant Colony Optimization (ACO)-based algorithms that tackle the regression task in data mining, exploring three different learning strategies: the Iterative Rule Learning, Michigan, and Pittsburgh strategies. The Iterative Rule Learning strategy constructs one rule at a time: at each iteration, the best rule created by the ant colony is added to the rule list, until a complete rule list is created. In the Michigan strategy, each ant constructs a single rule, and a niching algorithm combines this collection of rules to create the final rule list. Finally, in the Pittsburgh strategy, each ant constructs an entire rule list at each iteration, with the best list constructed by an ant in any iteration representing the final model. Among the three variants, the most successful, the Pittsburgh-based Ant-Miner-Reg_PB algorithm, has been shown to be competitive against a well-known regression rule induction algorithm from the literature. The second research thread involved incorporating existing domain knowledge to guide the construction of models, as it is rare to encounter a new domain about which nothing is known. One type of domain knowledge that occurs frequently in real-world datasets is monotonic constraints, which capture increasing or decreasing trends within the data. In this thesis, monotonic constraints have been introduced into ACO-based rule induction algorithms for both classification and regression tasks. The enforcement of monotonic constraints has been implemented as a two-step process: first, a soft-constraint preference during the model construction phase; second, a hard-constraint post-processing pruning suite that ensures the production of monotonic models.
The new algorithms presented here have been shown to maintain, and in some cases improve, their predictive power when compared to non-monotonic rule induction algorithms.
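The hard-constraint step above can be illustrated with a minimal sketch. This is a hypothetical check of our own, not the thesis's pruning suite: it represents a rule list as ordered (condition, prediction) pairs and tests, on a grid of values, that predictions never decrease as a constrained feature increases.

```python
# Hypothetical sketch (not the thesis's implementation): verify that a
# rule list's predictions are monotonically non-decreasing in one feature.
def rule_list_predict(rules, default, x):
    """Return the prediction of the first rule whose condition matches x."""
    for condition, prediction in rules:
        if condition(x):
            return prediction
    return default

def is_monotone_in(feature, rules, default, grid, sample):
    """Check that predictions never decrease as `feature` increases,
    holding the other features in `sample` fixed."""
    preds = []
    for v in grid:
        x = dict(sample)
        x[feature] = v   # sweep the constrained feature over the grid
        preds.append(rule_list_predict(rules, default, x))
    return all(a <= b for a, b in zip(preds, preds[1:]))

# Toy rule list: "if age >= 50 predict 1.0, else 0.0" is monotone in age.
monotone_rules = [(lambda x: x["age"] >= 50, 1.0)]
print(is_monotone_in("age", monotone_rules, 0.0,
                     range(0, 100, 10), {"age": 0}))  # True
```

A pruning suite in this spirit would remove or adjust rules whenever such a check fails, until the whole model satisfies the constraint.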
Fairness-Aware Training of Decision Trees by Abstract Interpretation
We study the problem of formally verifying individual fairness of decision tree ensembles, as well as training tree models which maximize both accuracy and individual fairness. In our approach, fairness verification and fairness-aware training both rely on a notion of stability of a classifier, which is a generalization of the standard notion of robustness to input perturbations used in adversarial machine learning. Our verification and training methods leverage abstract interpretation, a well-established mathematical framework for designing computable, correct, and precise approximations of potentially infinite behaviors. We implemented our fairness-aware learning method by building on a tool for adversarial training of decision trees. We evaluated it in practice on the reference datasets in the literature on fairness in machine learning. The experimental results show that our approach is able to train tree models exhibiting a high degree of individual fairness with respect to the natural state-of-the-art baselines, namely CART trees and random forests. Moreover, as a by-product, these fairness-aware decision trees turn out to be significantly more compact, which naturally enhances their interpretability.
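The stability notion can be sketched with a minimal interval abstraction, in the spirit of abstract interpretation. This toy check is our own illustration, not the paper's tool: it propagates a box of similar inputs through a single decision tree and reports the classifier stable on that box exactly when every reachable leaf agrees on the class.

```python
# Illustrative sketch (assumption, not the paper's verifier): interval
# abstraction of a decision tree. A node is ("leaf", cls) or
# ("split", feature, threshold, left, right); a box maps feature -> (lo, hi).
def reachable_classes(node, box):
    """Collect the classes of all leaves reachable from inputs in the box."""
    if node[0] == "leaf":
        return {node[1]}
    _, f, t, left, right = node
    lo, hi = box[f]
    out = set()
    if lo <= t:            # some inputs in the box take the x[f] <= t branch
        out |= reachable_classes(left, box)
    if hi > t:             # some inputs take the x[f] > t branch
        out |= reachable_classes(right, box)
    return out

def is_stable(tree, box):
    """Stable = every input in the box receives the same class."""
    return len(reachable_classes(tree, box)) == 1

tree = ("split", "income", 50.0, ("leaf", "deny"), ("leaf", "approve"))
print(is_stable(tree, {"income": (60.0, 70.0)}))  # True: box stays right of split
print(is_stable(tree, {"income": (45.0, 55.0)}))  # False: box crosses the split
```

Individual fairness then follows the same shape: take the box to be the set of individuals deemed similar to a given one, and require stability on it.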
Learning to predict under a budget
Prediction-time budgets in machine learning applications can arise due to monetary or computational costs associated with acquiring information; they also arise due to latency and power consumption costs in evaluating increasingly more complex models. The goal in such budgeted prediction problems is to learn decision systems that maintain high prediction accuracy while meeting average cost constraints during prediction-time. Such decision systems can potentially adapt to the input examples, predicting most of them at low cost while allocating more budget for the few "hard" examples.
In this thesis, I will present several learning methods to better trade off cost and error during prediction. The conceptual contribution of this thesis is to develop a new paradigm of bottom-up approach instead of the traditional top-down approach. A top-down approach attempts to build out the model by selectively adding the most cost-effective features to improve accuracy. In contrast, a bottom-up approach first learns a highly accurate model and then prunes or adaptively approximates it to trade off cost and error. Training top-down models in the presence of feature acquisition costs leads to fundamental combinatorial issues in multi-stage search over all feature subsets. In contrast, we show that the bottom-up methods bypass many such issues.
To develop this theme, we first propose two top-down methods and then two bottom-up methods. The first top-down method uses margin information from training data in the partial feature neighborhood of a test point either to select the next best feature in a greedy fashion or to stop and make a prediction.
The second top-down method is a variant of the random forest (RF) algorithm. We grow decision trees with low acquisition cost and high strength based on greedy mini-max cost-weighted impurity splits. Theoretically, we establish near-optimal acquisition cost guarantees for our algorithm.
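A cost-weighted split criterion can be sketched as follows. This is a simplified illustration under our own assumptions, not the thesis's exact mini-max criterion: it scores each candidate split by weighted Gini impurity plus a trade-off term `lam * cost`, where `lam` is a hypothetical knob balancing accuracy against feature acquisition cost.

```python
# Illustrative sketch (our assumptions, not the thesis's criterion):
# greedy split selection trading impurity against feature acquisition cost.
def gini(labels):
    """Gini impurity for binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(X, y, costs, thresholds, lam=0.1):
    """Pick the (feature, threshold) minimizing impurity + lam * cost."""
    best, best_score = None, float("inf")
    n = len(y)
    for f, cost in costs.items():
        for t in thresholds[f]:
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            score = impurity + lam * cost
            if score < best_score:
                best, best_score = (f, t), score
    return best

# Feature "a" separates the labels perfectly but costs more to acquire;
# feature "b" is cheap but uninformative. The criterion still picks "a".
X = [{"a": 0, "b": 5}, {"a": 1, "b": 5}, {"a": 2, "b": 5}, {"a": 3, "b": 5}]
y = [0, 0, 1, 1]
print(best_split(X, y, costs={"a": 1.0, "b": 0.1},
                 thresholds={"a": [0, 1, 2], "b": [5]}))  # ('a', 1)
```

With a larger `lam`, the same criterion would start preferring the cheap feature, which is exactly the cost-accuracy trade-off the thesis controls.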
The first bottom-up method we propose is based on pruning RFs to optimize expected feature cost and accuracy. Given an RF as input, we pose pruning as a novel 0-1 integer program and show that it can be solved exactly via LP relaxation. We further develop a fast primal-dual algorithm that scales to large datasets. The second bottom-up method is adaptive approximation, which significantly generalizes RF pruning to accommodate more models and other types of costs besides feature acquisition cost. We first train a high-accuracy, high-cost model. We then jointly learn a low-cost gating function together with a low-cost prediction model to adaptively approximate the high-cost model. The gating function identifies the regions of the input space where the low-cost model suffices for making highly accurate predictions.
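The gating idea can be sketched in a few lines. This toy routing scheme is our own illustration, with hypothetical one-dimensional models: the gate trusts the cheap model away from the decision boundary and pays for the costly model only in the ambiguous region.

```python
# Illustrative sketch (hypothetical models, not the thesis's learned gate):
# route "easy" inputs to a cheap model, "hard" ones to the costly model.
def adaptive_predict(x, gate, cheap_model, costly_model):
    """gate(x) -> True when the low-cost model is trusted on x."""
    if gate(x):
        return cheap_model(x), "cheap"
    return costly_model(x), "costly"

# Toy instance: the cheap model is reliable only for |x| > 1, so the gate
# sends the ambiguous region near 0 to the costly model.
gate = lambda x: abs(x) > 1.0
cheap = lambda x: 1 if x > 0 else 0
costly = lambda x: 1 if x >= 0.2 else 0

print(adaptive_predict(2.0, gate, cheap, costly))  # (1, 'cheap')
print(adaptive_predict(0.1, gate, cheap, costly))  # (0, 'costly')
```

The average prediction cost is then governed by how often the gate fires, which is what joint training of the gate and the low-cost model controls.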
We demonstrate the empirical performance of these methods and compare them to state-of-the-art approaches. Finally, we study adaptive approximation in the online setting to obtain regret guarantees and discuss future work.
Training Decision Trees for Optimal Decision-Making
Many analytics problems in Operations Research and the Management Sciences can be framed as decision-making problems containing uncertain input parameters to be estimated from data. For example, inventory optimization problems often require forecasts of future demand, and product recommendation systems (e.g., movies, sporting goods) depend on models for predicting customer responses to the feasible recommendations. Therefore, a question central to many analytics problems is how to optimally build models from data which estimate the uncertain inputs for the decision problems of interest. We argue that the most common approaches for this task either (a) focus on the wrong objectives in training the models for the decision problem, or (b) focus on the right objectives but only study how to do so with prohibitively simple machine learning models (e.g., linear and logistic regression).
In this work, we study how to train decision tree models for predicting uncertain parameters for analytical decision-making problems. Unlike other machine learning models such as linear and logistic regression, decision trees are both nonparametric and interpretable, allowing them to model highly complex relationships between data and predictions while also being easily visualized and interpreted. We propose tractable algorithms for decision tree training in the context of three problem domains relevant to Operations Research. First, we study how to train decision trees for delivering real-time personalized recommendations of products in settings where little prior data is available for training purposes. This problem is known in the literature as the contextual bandit problem and requires careful navigation of the so-called "exploration-exploitation trade-off" in utilizing the decision tree models. Second, we propose a new framework which we call Market Segmentation Trees (MSTs) for training decision tree models for the purposes of market segmentation and personalization. We explore several applications of MSTs relevant to personalized advertising, including recommending hotels to Expedia users as a function of their search queries and segmenting ad auctions according to the distribution of bids that they receive. Finally, we propose a general framework for training decision tree models for uncertain optimization problems which we call "SPO Trees" (SPOTs). In contrast to the typical objective of maximizing predictive accuracy, the SPOT framework trains decision trees to maximize the quality of the solutions found in the uncertain optimization problem, therefore yielding better decisions in several analytics problems of interest.
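The contrast between predictive accuracy and decision quality can be made concrete with a small sketch. This is our own toy example, not the SPOT algorithm: it scores a prediction by the regret of the decision it induces, and shows a prediction with lower error can still induce a worse decision.

```python
# Illustrative sketch (toy example, not the SPOT framework): evaluate a
# cost prediction by the decision it induces, not by its prediction error.
def decision(pred_costs):
    """Toy optimization problem: pick the option with lowest predicted cost."""
    return min(range(len(pred_costs)), key=lambda i: pred_costs[i])

def decision_regret(pred_costs, true_costs):
    """Extra true cost incurred by acting on the prediction."""
    return true_costs[decision(pred_costs)] - min(true_costs)

true_costs = [3.0, 1.0]           # option 1 is actually best
accurate_but_bad = [2.0, 2.1]     # smaller errors, but wrong ranking
rough_but_good = [10.0, 4.0]      # larger errors, correct ranking

print(decision_regret(accurate_but_bad, true_costs))  # 2.0
print(decision_regret(rough_but_good, true_costs))    # 0.0
```

Training trees against a decision-quality objective like this regret, rather than squared prediction error, is the shift in objective the SPOT framework advocates.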