5 research outputs found

    Banzhaf random forests: Cooperative game theory based random forests with consistency.

    Random forests algorithms have been widely used in many classification and regression applications. However, the theory of random forests lags far behind their applications. In this paper, we propose a novel random forests classification algorithm based on cooperative game theory. The Banzhaf power index is employed to evaluate the power of each feature by traversing possible feature coalitions. Hence, we call the proposed algorithm Banzhaf random forests (BRFs). Unlike the previously used information gain ratio, which only measures the power of each feature for classification and pays less attention to the intrinsic structure of the feature variables, the Banzhaf power index can measure the importance of each feature by computing the dependency among the group of features. More importantly, we have proved the consistency of BRFs, which narrows the gap between the theory and applications of random forests. Extensive experiments on several UCI benchmark data sets and three real-world applications show that BRFs perform significantly better than existing consistent random forests on classification accuracy, and better than or at least comparable with Breiman’s random forests, support vector machines (SVMs) and k-nearest neighbors (KNNs) classifiers.
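
As a rough illustration of the Banzhaf idea mentioned in the abstract above, the sketch below computes the standard Banzhaf value of each player (here, each feature) by averaging its marginal contribution over all coalitions of the other players. The coalition value function `toy_value` and the feature names are hypothetical; the abstract does not say how BRFs score a feature coalition, so this is not the paper's implementation.

```python
from itertools import combinations


def banzhaf_values(players, value):
    """Banzhaf value of each player for a coalition value function.

    `value` maps a frozenset of players to a number. The cost is
    exponential in len(players), so this only suits small feature sets.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        # Sum the marginal contribution of p over every coalition of the others.
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                s = frozenset(subset)
                total += value(s | {p}) - value(s)
        result[p] = total / 2 ** (n - 1)
    return result


def toy_value(coalition):
    # Hypothetical game: features "a" and "b" are only useful together.
    return 1.0 if {"a", "b"} <= coalition else 0.0


scores = banzhaf_values(["a", "b", "c"], toy_value)
print(scores)
```

In this toy game, `a` and `b` each receive 0.5 and `c` receives 0.0, which mirrors the abstract's point that the index reflects dependency within a group of features, not only individual strength.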

    Random Multi-Graphs: A semi-supervised learning framework for classification of high dimensional data.

    Currently, high dimensional data processing confronts two main difficulties: inefficient similarity measures and high computational complexity in both time and memory space. Common methods to deal with these two difficulties are based on dimensionality reduction and feature selection. In this paper, we present a different way to solve high dimensional data problems by combining the ideas of Random Forests and Anchor Graph semi-supervised learning. We randomly select a subset of features and use the Anchor Graph method to construct a graph. This process is repeated many times to obtain multiple graphs, and it can be implemented in parallel to ensure runtime efficiency. The multiple graphs then vote to determine the labels for the unlabeled data. We argue that the randomness can be viewed as a kind of regularization. We evaluate the proposed method on eight real-world data sets by comparing it with two traditional graph-based methods and one state-of-the-art semi-supervised learning method based on Anchor Graph to show its effectiveness. We also apply the proposed method to face recognition.
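
The "random feature subset, then vote" pattern described in the abstract above can be sketched as follows. To keep the example short, a nearest-labeled-neighbor rule stands in for the Anchor Graph step, so this is a deliberate simplification and not the paper's method. The function name, parameters, and data are all hypothetical.

```python
import random
from collections import Counter


def vote_labels(labeled, unlabeled, n_views=25, subset_size=2, seed=0):
    """Label points by majority vote over random feature subsets.

    `labeled` is a list of (point, label) pairs; `unlabeled` is a list of
    points. Each "view" uses a random subset of feature indices and assigns
    every unlabeled point the label of its nearest labeled point in that
    view (a stand-in for building an Anchor Graph on the subset).
    """
    rng = random.Random(seed)
    n_features = len(unlabeled[0])
    votes = [Counter() for _ in unlabeled]
    for _ in range(n_views):
        feats = rng.sample(range(n_features), subset_size)
        for i, x in enumerate(unlabeled):
            # Squared Euclidean distance restricted to the sampled features.
            _, label = min(
                labeled,
                key=lambda pair: sum((pair[0][f] - x[f]) ** 2 for f in feats),
            )
            votes[i][label] += 1
    # Each point takes the label that most views agreed on.
    return [v.most_common(1)[0][0] for v in votes]


labeled = [((0, 0, 0, 0), "A"), ((10, 10, 10, 10), "B")]
unlabeled = [(1, 0, 1, 0), (9, 10, 9, 10)]
predicted = vote_labels(labeled, unlabeled)
print(predicted)
```

Because the views are independent of one another, the outer loop is the part that could be run in parallel, as the abstract notes.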

    Cooperative Profit Random Forests With Application in Ocean Front Recognition.

    Random Forests are powerful classification and regression tools that are commonly applied in machine learning and image processing. In the majority of random classification forests algorithms, the Gini index and the information gain ratio are commonly used for node splitting. However, these two kinds of node-split methods may pay less attention to the intrinsic structure of the attribute variables and fail to find attributes with strong discriminative ability as a group yet weak as individuals. In this paper, we propose an innovative method for splitting the tree nodes based on cooperative game theory, from which some attributes with good discriminative ability as a group can be learned. This new random forests algorithm is called Cooperative Profit Random Forests (CPRF). Experimental comparisons with several other existing random classification forests algorithms are carried out on several real-world data sets, including remote sensing images. The results show that CPRF outperforms other existing Random Forests algorithms in most cases. In particular, CPRF achieves promising results in ocean front recognition.

    Random Forests Algorithm for Two Levels of Coral Reef Ecosystem Mapping Using Planetscope Image in Malalayang Beach, Manado

    The coral reef ecosystem has a significant physical and biological function and is one of the coastal ecosystem components, alongside the seagrass and mangrove ecosystems. Besides its ecological function, the coral reef also has an economic function. The condition of the coral reef ecosystem in Malalayang Beach has been changing for years. Remote sensing images can be used to monitor current conditions. This research aims to map the coral reef ecosystem in Malalayang Beach, Manado, and to test the accuracy of the mapping using field survey data as classification and validation samples. The PlanetScope multispectral image has four channels to detect underwater objects: red, green, blue and near infrared. The PlanetScope level 3B image used in the research has a surface reflectance value for each pixel. The image processing stages of this research consist of sunglint correction and water column correction, followed by classification of the coral reef ecosystem using the random forests algorithm. Classification and accuracy training sample data were obtained using the photo transect technique. The sunglint correction regression equation is between 0.27 and 0.38. The coefficient of attenuation ratio in B1 is 0.927797938, B2 is 0.168841585, and B3 is 0.29033029. This value then becomes the input for the Lyzenga formula. The classification accuracy for level one using random forests is 72.54%, and the accuracy for level two mapping is 37.61%. Keywords: Coral Reef Ecosystem, Planetscope, Random Forest
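
For readers unfamiliar with the water column correction mentioned in the abstract above, the sketch below shows the commonly used form of the Lyzenga depth-invariant index for a pair of bands, ln(L_i) - (k_i / k_j) ln(L_j), with an optional deep-water subtraction. The abstract does not state which band pair each attenuation ratio belongs to or whether a deep-water term was used, so the pairing, the pixel values, and the function name here are assumptions for illustration only.

```python
import math


def depth_invariant_index(band_i, band_j, k_ratio, deep_i=0.0, deep_j=0.0):
    """Lyzenga-style depth-invariant index for one pixel and one band pair.

    band_i, band_j: reflectance of the pixel in the two bands.
    k_ratio: attenuation coefficient ratio k_i / k_j for the pair.
    deep_i, deep_j: optional deep-water reflectance to subtract
    (left at 0.0 here, which is an assumption, not taken from the abstract).
    """
    return math.log(band_i - deep_i) - k_ratio * math.log(band_j - deep_j)


# Hypothetical pixel reflectances; 0.927797938 is the ratio the abstract
# reports for "B1", applied here to an assumed band pair.
index = depth_invariant_index(0.12, 0.09, k_ratio=0.927797938)
print(index)
```

The resulting index bands, not the raw reflectance, would then typically be passed to the classifier.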