124,682 research outputs found

    Implementation of the Random Forest Method for the Imaging Atmospheric Cherenkov Telescope MAGIC

    Get PDF
    The paper describes an application of the tree classification method Random Forest (RF), as used in the analysis of data from the ground-based gamma telescope MAGIC. In such telescopes, cosmic gamma-rays are observed and have to be discriminated against a dominating background of hadronic cosmic-ray particles. We describe the application of RF for this gamma/hadron separation. The RF method often shows superior performance in comparison with traditional semi-empirical techniques. Critical issues of the method and its implementation are discussed. An application of the RF method for estimation of a continuous parameter from related variables, rather than discrete classes, is also discussed.Comment: 16 pages, 8 figure

    Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

    Get PDF
    The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Special attention is given to practical aspects such as the selection of parameters, available RF implementations, and important pitfalls and biases of RF and its variable importance measures (VIMs). The paper surveys recent developments of the methodology relevant to bioinformatics as well as some representative examples of RF applications in this context and possible directions for future research

    Visualizing Random Forest with Self-Organising Map

    Full text link
    Random Forest (RF) is a powerful ensemble method for classification and regression tasks. It consists of decision trees set. Although, a single tree is well interpretable for human, the ensemble of trees is a black-box model. The popular technique to look inside the RF model is to visualize a RF proximity matrix obtained on data samples with Multidimensional Scaling (MDS) method. Herein, we present a novel method based on Self-Organising Maps (SOM) for revealing intrinsic relationships in data that lay inside the RF used for classification tasks. We propose an algorithm to learn the SOM with the proximity matrix obtained from the RF. The visualization of RF proximity matrix with MDS and SOM is compared. What is more, the SOM learned with the RF proximity matrix has better classification accuracy in comparison to SOM learned with Euclidean distance. Presented approach enables better understanding of the RF and additionally improves accuracy of the SOM

    Comparison of Deep Learning and the Classical Machine Learning Algorithm for the Malware Detection

    Full text link
    Recently, Deep Learning has been showing promising results in various Artificial Intelligence applications like image recognition, natural language processing, language modeling, neural machine translation, etc. Although, in general, it is computationally more expensive as compared to classical machine learning techniques, their results are found to be more effective in some cases. Therefore, in this paper, we investigated and compared one of the Deep Learning Architecture called Deep Neural Network (DNN) with the classical Random Forest (RF) machine learning algorithm for the malware classification. We studied the performance of the classical RF and DNN with 2, 4 & 7 layers architectures with the four different feature sets, and found that irrespective of the features inputs, the classical RF accuracy outperforms the DNN.Comment: 11 Pages, 1 figur
    corecore