
    Pairwise meta-rules for better meta-learning-based algorithm ranking

    In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed techniques can significantly improve the overall performance of meta-learning for algorithm ranking. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset.

    A Framework to Discover Emerging Patterns for Application in Microarray Data

    Various supervised learning and gene selection methods have been used for cancer diagnosis. Most of these methods do not consider interactions between genes, although this might be interesting biologically and improve classification accuracy. Here we introduce a new CART-based method to discover emerging patterns. Emerging patterns are structures of the form (X1 > a1) AND (X2 < a2) that have differing frequencies in the considered classes. Interaction structures of this kind are of great interest in cancer research. Moreover, they can be used to define new variables for classification. Using simulated data sets, we show that our method allows the identification of emerging patterns with high efficiency. We also perform classification using two publicly available data sets (leukemia and colon cancer). For each data set, the method allows efficient classification as well as the identification of interesting patterns.
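
To make the pattern form in the abstract above concrete, here is a minimal sketch of how the frequency of a pattern (X1 > a1) AND (X2 < a2) could be compared between two classes. It does not reproduce the paper's CART-based search. The function names, the toy data, and the use of a frequency ratio ("growth rate") as the comparison measure are assumptions made for illustration; the abstract only says the frequencies differ.

```python
from typing import Sequence, Tuple

Row = Tuple[float, float]  # (X1, X2) values for one sample


def pattern_support(rows: Sequence[Row], a1: float, a2: float) -> float:
    """Fraction of rows that satisfy (X1 > a1) AND (X2 < a2)."""
    if not rows:
        return 0.0
    hits = sum(1 for x1, x2 in rows if x1 > a1 and x2 < a2)
    return hits / len(rows)


def growth_rate(class_a: Sequence[Row], class_b: Sequence[Row],
                a1: float, a2: float) -> float:
    """Ratio of the pattern's frequency in class A to its frequency in class B.

    Assumption: a ratio is used here as one simple way to express
    "differing frequencies"; other measures are possible.
    """
    support_a = pattern_support(class_a, a1, a2)
    support_b = pattern_support(class_b, a1, a2)
    if support_b == 0.0:
        # Pattern never occurs in class B: infinite ratio if it occurs in A.
        return float("inf") if support_a > 0.0 else 0.0
    return support_a / support_b


# Hypothetical toy data: two variables measured on four samples per class.
class_a = [(5.0, 1.0), (6.0, 0.5), (4.5, 2.5), (7.0, 0.8)]
class_b = [(2.0, 3.0), (5.5, 1.0), (1.0, 0.5), (3.0, 4.0)]

rate = growth_rate(class_a, class_b, a1=4.0, a2=2.0)
```

A pattern whose ratio is far from 1 is more frequent in one class than in the other, which is what makes it a candidate new variable for classification.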

    Identification of Interaction Patterns and Classification with Applications to Microarray Data

    Emerging patterns represent a class of interaction structures which has been recently proposed as a tool in data mining. In this paper, a new and more general definition referring to underlying probabilities is proposed. The defined interaction patterns carry information about the relevance of combinations of variables for distinguishing between classes. Since they are formally quite similar to the leaves of a classification tree, we propose a fast and simple method which is based on the CART algorithm to find the corresponding empirical patterns in data sets. In simulations, it can be shown that the method is quite effective in identifying patterns. In addition, the detected patterns can be used to define new variables for classification. Thus, we propose a simple scheme to use the patterns to improve the performance of classification procedures. The method may also be seen as a scheme to improve the performance of CARTs concerning the identification of interaction patterns as well as the accuracy of prediction.

    A NETWORK INTRUSION DETECTION SYSTEM USING DECISION TREE MACHINE LEARNING ON AN ISTN ARCHITECTURE

    In recent years, the Navy has shown interest in an integrated satellite-terrestrial networking (ISTN) architecture for unmanned systems. With the development of satellite networks and growing numbers of unmanned system networks being connected, security and privacy are major concerns in an ISTN. In this thesis, we develop a network intrusion detection system (NIDS) specifically for an ISTN. We identify the critical location of the NIDS within the ISTN architecture and use the decision tree machine learning algorithm to perform cyber-attack detection against various threat vectors, including distributed denial of service. The decision tree algorithm is used to classify and segregate attack traffic from benign traffic. We use an open source ISTN data set available in the literature to train our algorithm. The decision tree is implemented using different split criteria, varying number of splits, and the use of principal component analysis (PCA). We manipulate the size of the training data and the number of data features to achieve reasonable false positive rates. We show that our NIDS framework based on decision tree learning can effectively detect and segregate different attack data classes.
    Civilian, DSO National Labs, Singapore
    Approved for public release. Distribution is unlimited.
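
The abstract above mentions decision trees built with different split criteria but does not name them. As a hedged illustration only, the sketch below computes two widely used criteria, Gini impurity and entropy, and picks the best threshold on a single numeric feature. The function names, the feature values, and the "benign"/"attack" labels are hypothetical and are not taken from the thesis or its data set.

```python
import math
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values: Sequence[float], labels: Sequence[str],
                   impurity: Callable[[Sequence[str]], float]
                   ) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted impurity) of the best split `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split lowers the impurity, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best: Tuple[Optional[float], float] = (None, impurity(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = ((low + high) / 2, score)
    return best


# Hypothetical feature: packets per second for six traffic flows.
rates = [10.0, 12.0, 11.0, 90.0, 95.0, 100.0]
kinds = ["benign", "benign", "benign", "attack", "attack", "attack"]

gini_split = best_threshold(rates, kinds, gini)
entropy_split = best_threshold(rates, kinds, entropy)
```

On this cleanly separable toy data both criteria choose the same threshold; on real traffic they can disagree, which is one reason to compare them.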