7,506 research outputs found

Multi-interval discretization methods for decision tree learning

Author: A. Harkonen
J.R. Quinlan
L. Breiman
R. Kerber
T.P. Huber
U.M. Fayyad
Y.K. Li
Publication venue: 'Springer Science and Business Media LLC'

On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems

Author: Bustince Humberto
Elkano Mikel
Galar Mikel
Uriz Mikel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/02/2019

We present a new distributed fuzzy partitioning method to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems. The proposed algorithm builds a fixed number of fuzzy sets for all variables and adjusts their shape and position to the real distribution of training data. A two-step process is applied: 1) transformation of the original distribution into a standard uniform distribution by means of the probability integral transform; since the original distribution is generally unknown, the cumulative distribution function is approximated by computing the q-quantiles of the training set; 2) construction of a Ruspini strong fuzzy partition in the transformed attribute space using a fixed number of equally distributed triangular membership functions. Despite this transformation, the definition of every fuzzy set in the original space can be recovered by applying the inverse cumulative distribution function (also known as the quantile function). The experimental results reveal that the proposed methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT) induction algorithm to maintain classification accuracy with up to 6 million fewer leaves.
Comment: Appeared in 2018 IEEE International Congress on Big Data (BigData Congress). arXiv admin note: text overlap with arXiv:1902.0935

arXiv.org e-Print Archive

Crossref
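
The two-step partitioning described in the abstract of the entry above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' distributed implementation: the function names (fit_quantiles, to_uniform, from_uniform, triangular_memberships) are hypothetical, the cumulative distribution function is approximated by linear interpolation between q-quantiles, and the quantiles are assumed to be strictly increasing.

```python
from bisect import bisect_right


def fit_quantiles(values, q=100):
    """Return q + 1 empirical quantiles (minimum ... maximum) of the training values."""
    xs = sorted(values)
    n = len(xs)
    cuts = []
    for k in range(q + 1):
        pos = k * (n - 1) / q          # fractional index into the sorted sample
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        cuts.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return cuts


def to_uniform(x, cuts):
    """Probability integral transform: piecewise-linear CDF through the quantiles."""
    q = len(cuts) - 1
    if x <= cuts[0]:
        return 0.0
    if x >= cuts[-1]:
        return 1.0
    k = bisect_right(cuts, x) - 1      # cuts[k] <= x < cuts[k + 1]
    return (k + (x - cuts[k]) / (cuts[k + 1] - cuts[k])) / q


def from_uniform(u, cuts):
    """Approximate quantile function (inverse CDF): map u in [0, 1] back to the original space."""
    q = len(cuts) - 1
    pos = u * q
    k = min(int(pos), q - 1)
    return cuts[k] + (pos - k) * (cuts[k + 1] - cuts[k])


def triangular_memberships(u, n_sets=5):
    """Membership degrees of u in n_sets equally spaced triangular sets on [0, 1].

    Neighbouring triangles overlap so that the degrees sum to 1 (a Ruspini strong partition).
    """
    width = 1.0 / (n_sets - 1)
    return [max(0.0, 1.0 - abs(u - i * width) / width) for i in range(n_sets)]
```

Applying from_uniform to the centre of each triangle (0, 0.25, 0.5, ... for five sets) gives the corresponding fuzzy-set centres in the original attribute space, which is the recovery step the abstract mentions.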

Human gesture classification by brute-force machine learning for exergaming in physiotherapy

Author: Allebosch Gianni
Deboeverie Francis
Philips Wilfried
Roegiers Sanne
Veelaert Peter
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

In this paper, a novel approach for human gesture classification on skeletal data is proposed for the application of exergaming in physiotherapy. Unlike existing methods, we propose to use a general classifier like Random Forests to recognize dynamic gestures. The temporal dimension is handled afterwards by majority voting in a sliding window over the consecutive predictions of the classifier. The gestures can have partially similar postures, such that the classifier will decide on the dissimilar postures. This brute-force classification strategy is permitted because dynamic human gestures show sufficiently dissimilar postures. Online continuous human gesture recognition can classify dynamic gestures at an early stage, which is a crucial advantage when controlling a game by automatic gesture recognition. Also, ground truth can be easily obtained, since all postures in a gesture get the same label, without any discretization into consecutive postures. This way, new gestures can be easily added, which is advantageous in adaptive game development. We evaluate our strategy by a leave-one-subject-out cross-validation on a self-captured stealth game gesture dataset and the publicly available Microsoft Research Cambridge-12 Kinect (MSRC-12) dataset. On the first dataset we achieve an excellent accuracy rate of 96.72%. Furthermore, we show that Random Forests perform better than Support Vector Machines. On the second dataset we achieve an accuracy rate of 98.37%, which is on average 3.57% better than existing methods.

Crossref

Ghent University Academic Bibliography
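
The temporal smoothing step mentioned in the abstract of the entry above (majority voting in a sliding window over consecutive per-frame predictions) might look like the minimal sketch below. The function name sliding_majority_vote, the window length, and the string labels are assumptions made for illustration; the per-frame labels are taken as given, standing in for the output of a frame-level classifier such as a Random Forest.

```python
from collections import Counter, deque


def sliding_majority_vote(frame_labels, window=15):
    """Smooth per-frame predictions by majority vote over the most recent frames.

    Works online: each output label depends only on the current and earlier
    frames. Ties go to the label that entered the current window first.
    """
    recent = deque(maxlen=window)      # holds at most `window` latest predictions
    smoothed = []
    for label in frame_labels:
        recent.append(label)
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


if __name__ == "__main__":
    frames = ["a", "a", "b", "a", "a", "b", "b", "b", "b"]
    print(sliding_majority_vote(frames, window=3))
```

A longer window suppresses more isolated misclassifications but delays the moment a new gesture is reported, so the window length trades stability against early recognition.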

Efficient Database Generation for Data-driven Security Assessment of Power Systems

Author: Chatzivasileiadis Spyros
Eriksson Robert
Thams Florian
Venzke Andreas
Publication date: 01/02/2019

Power system security assessment methods require large datasets of operating points to train or test their performance. As historical data often contain a limited number of abnormal situations, simulation data are necessary to accurately determine the security boundary. Generating such a database is an extremely demanding task, which becomes intractable even for small system sizes. This paper proposes a modular and highly scalable algorithm for computationally efficient database generation. Using convex relaxation techniques and complex network theory, we discard large infeasible regions and drastically reduce the search space. We explore the remaining space by a highly parallelizable algorithm and substantially decrease computation time. Our method accommodates numerous definitions of power system security. Here we focus on the combination of N-k security and small-signal stability. Demonstrating our algorithm on IEEE 14-bus and NESTA 162-bus systems, we show how it outperforms existing approaches, requiring less than 10% of the time other methods require.
Comment: Database publicly available at: https://github.com/johnnyDEDK/OPs_Nesta162Bus - Paper accepted for publication at IEEE Transactions on Power System

arXiv.org e-Print Archive

Online Research Database In Technology