Induction of Non-Monotonic Logic Programs to Explain Boosted Tree Models Using LIME
We present a heuristic-based algorithm to induce non-monotonic logic
programs that explain the behavior of XGBoost-trained classifiers. We use
a technique based on the LIME approach to locally select the most important
features contributing to the classification decision. Then, in order to explain
the model's global behavior, we propose the LIME-FOLD algorithm, a
heuristic-based inductive logic programming (ILP) algorithm capable of learning
non-monotonic logic programs, which we apply to a transformed dataset produced
by LIME. Our proposed approach is agnostic to the choice of the ILP algorithm.
Our experiments with UCI standard benchmarks suggest a significant improvement
in terms of classification evaluation metrics. Meanwhile, the number of induced
rules dramatically decreases compared to ALEPH, a state-of-the-art ILP system.
(Psycho-)Analysis of Benchmark Experiments
It is common knowledge that certain characteristics of data sets -- such as linear separability or sample size -- determine the performance of learning algorithms. In this paper we propose a formal framework for investigations on this relationship.
The framework combines three methods, each well established in its own scientific discipline. Benchmark experiments are the method of choice in machine and statistical learning to compare algorithms with respect to a certain performance measure on particular data sets. To capture the interaction between data sets and algorithms, the data sets are characterized using statistical and information-theoretic measures, a common approach in the field of meta-learning for deciding which algorithms are suited to particular data sets. Finally, the performance ranking of algorithms on groups of data sets with similar characteristics is determined by means of recursively partitioning Bradley-Terry models, which are commonly used in psychology to study the preferences of human subjects. The result is a tree with splits in data set characteristics that significantly change the performances of the algorithms. The main advantage is the automatic detection of these important characteristics.
The framework is introduced using a simple artificial example. Its real-world usage is demonstrated by means of an application example consisting of thirteen well-known data sets and six common learning algorithms. All resources to replicate the examples are available online.
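As an illustration of the Bradley-Terry model mentioned in the abstract above: each algorithm i gets a positive worth w_i, and the probability that i outperforms j is w_i / (w_i + w_j). The following is a minimal sketch of fitting the worths with the standard minorization-maximization update. The function name and the win counts are hypothetical, and the recursive partitioning step of the framework is not shown.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry worths from pairwise win counts.

    wins[i][j] is the number of comparisons in which item i beat item j.
    Returns a list of worths normalized to sum to 1.
    """
    n = len(wins)
    worth = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Total number of wins of item i against all other items.
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Comparisons with each opponent, weighted by current worths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (worth[i] + worth[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        scale = sum(updated)
        worth = [w / scale for w in updated]
    return worth


# Hypothetical win counts for three algorithms (illustrative only).
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
worth = fit_bradley_terry(wins)
# Indices of the algorithms, ordered from highest to lowest worth.
ranking = sorted(range(len(worth)), key=lambda i: worth[i], reverse=True)
```

In the framework described above, a model of this kind would be fitted within each group of data sets that share similar characteristics, which yields one performance ranking per group.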
Modelling decision tables from data.
On most datasets induction algorithms can generate very accurate classifiers. Sometimes, however, these classifiers are very hard for humans to understand. Therefore, this paper investigates how the extracted knowledge can be presented to the user by means of decision tables. Decision tables are very easy to understand. Furthermore, decision tables provide useful facilities to check the extracted knowledge for consistency and completeness. In this paper, it is demonstrated how a consistent and complete decision table can be modelled starting from raw data. The proposed method is empirically validated on several benchmarking datasets. It is shown that the modelled decision tables are sufficiently small. This allows easy consultation of the represented knowledge.
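The consistency and completeness checks mentioned in the abstract above can be illustrated generically. This is a minimal sketch, not the paper's method: it assumes a table is consistent when no condition combination maps to two different actions, and complete when every combination of condition values is covered. The function name, condition domains, and rows are hypothetical.

```python
from itertools import product


def check_decision_table(rows, domains):
    """Check a decision table for consistency and completeness.

    rows: list of (conditions, action) pairs, where conditions is a tuple
          holding one value per condition.
    domains: list with the allowed values of each condition, in order.
    Returns (inconsistent, missing): condition combinations that map to
    conflicting actions, and combinations that no row covers.
    """
    seen = {}
    inconsistent = set()
    for conditions, action in rows:
        if conditions in seen and seen[conditions] != action:
            inconsistent.add(conditions)
        seen.setdefault(conditions, action)
    missing = [combo for combo in product(*domains) if combo not in seen]
    return sorted(inconsistent), missing


# Hypothetical two-condition table (illustrative only).
domains = [["low", "high"], ["yes", "no"]]
rows = [
    (("low", "yes"), "accept"),
    (("low", "no"), "reject"),
    (("high", "yes"), "accept"),
    (("high", "yes"), "reject"),  # conflicts with the previous row
]
inconsistent, missing = check_decision_table(rows, domains)
```

Here the combination ("high", "yes") is flagged as inconsistent and ("high", "no") as missing, so this example table is neither consistent nor complete.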
Optimizing Coordinated Vehicle Platooning: An Analytical Approach Based on Stochastic Dynamic Programming
Platooning connected and autonomous vehicles (CAVs) can improve traffic and
fuel efficiency. However, scalable platooning operations require junction-level
coordination, which has not been well studied. In this paper, we study the
coordination of vehicle platooning at highway junctions. We consider a setting
where CAVs randomly arrive at a highway junction according to a general renewal
process. When a CAV approaches the junction, a system operator determines
whether the CAV will merge into the platoon ahead according to the positions
and speeds of the CAV and the platoon. We formulate a Markov decision process
to minimize the discounted cumulative travel cost, i.e., fuel consumption plus
travel delay, over an infinite time horizon. We show that the optimal policy is
threshold-based: the CAV will merge with the platoon if and only if the
difference between the CAV's and the platoon's predicted times of arrival at
the junction is less than a constant threshold. We also propose two
ready-to-implement algorithms to derive the optimal policy. A comparison with the
classical value iteration algorithm indicates that our approach, which explicitly
incorporates the characteristics of the optimal policy, is significantly more
computationally efficient. Importantly, we show that the optimal policy
under Poisson arrivals can be obtained by solving a system of integral
equations. We also validate our results in simulation with Real-time Strategy
(RTS) using real traffic data. The simulation results indicate that the
proposed method yields better performance compared with the conventional
method.
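The threshold structure of the optimal policy described in the abstract above can be sketched as a one-line decision rule. This is an illustrative sketch only: the function name and the numeric values are hypothetical, the threshold would in practice come from solving the Markov decision process, and the code assumes the platoon is ahead so that the difference in predicted arrival times is nonnegative.

```python
def should_merge(cav_eta, platoon_eta, threshold):
    """Threshold policy: merge if and only if the headway is below the threshold.

    cav_eta, platoon_eta: predicted times of arrival at the junction (seconds).
    threshold: constant threshold (seconds); a hypothetical value here, since
               the abstract does not state one.
    Assumes the platoon is ahead, i.e. cav_eta >= platoon_eta.
    """
    headway = cav_eta - platoon_eta
    return headway < threshold


# Hypothetical arrival predictions and threshold (illustrative only).
decisions = [
    should_merge(cav_eta=12.0, platoon_eta=10.0, threshold=5.0),
    should_merge(cav_eta=20.0, platoon_eta=10.0, threshold=5.0),
]
```

With these example values the first CAV merges (headway of 2 s) and the second does not (headway of 10 s).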
