
    A Review on Advanced Decision Trees for Efficient & Effective k-NN Classification

    Get PDF
    K Nearest Neighbor (KNN) is a well-known classification method in data mining and statistics, owing to its simple implementation and strong classification performance. However, it is impractical for traditional KNN methods to assign a single fixed k value to all test samples. Previous solutions assign different k values to different test samples via cross validation, but this is usually time-consuming. This work proposes new KNN methods. The first is a KTree method that learns a different k value for each test or new sample by adding a training stage to KNN classification. This work also proposes an improved version of KTree, called K*Tree, which speeds up the test stage by storing extra information about the training samples in the leaf nodes of the KTree, such as the training samples located in the leaf node, their KNNs, and the nearest neighbors of these KNNs. K*Tree thus performs KNN classification using only the subset of training samples stored in the leaf node rather than all training samples, as in previous KNN methods, which substantially reduces the cost of the test stage.
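    Below is a minimal, hedged sketch of the per-sample-k idea described above, assuming scikit-learn and the Iris data as stand-ins; the helper names (choose_k_per_sample, ktree_predict) and the leave-one-out rule for picking each training sample's k are illustrative choices, not the authors' implementation, and the K*Tree leaf-node caching is omitted.

```python
# Hedged sketch: learn a per-sample k in a training stage, then predict k for
# each test point with a decision tree (the "KTree") before running kNN.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

def choose_k_per_sample(X, y, candidate_ks=(1, 3, 5, 7, 9)):
    """For every training sample, keep the smallest k whose leave-one-out
    k-NN vote labels it correctly (fall back to the largest candidate)."""
    nn = NearestNeighbors(n_neighbors=max(candidate_ks) + 1).fit(X)
    _, idx = nn.kneighbors(X)              # idx[:, 0] is the sample itself
    best_k = np.full(len(X), candidate_ks[-1])
    for i in range(len(X)):
        neighbors = idx[i, 1:]             # drop the self-match
        for k in candidate_ks:
            if np.bincount(y[neighbors[:k]]).argmax() == y[i]:
                best_k[i] = k
                break
    return best_k

def ktree_predict(X_train, y_train, X_test, k_tree, max_k):
    """Predict a k for each test point, then classify it by a k-NN vote."""
    nn = NearestNeighbors(n_neighbors=max_k).fit(X_train)
    _, idx = nn.kneighbors(X_test)
    ks = k_tree.predict(X_test)
    return np.array([np.bincount(y_train[idx[i, :int(k)]]).argmax()
                     for i, k in enumerate(ks)])

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ks = choose_k_per_sample(X_tr, y_tr)
k_tree = DecisionTreeClassifier(max_depth=4).fit(X_tr, ks)   # feature -> k
pred = ktree_predict(X_tr, y_tr, X_te, k_tree, max_k=9)
print("accuracy:", (pred == y_te).mean())
```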

    RSVD-CUR Decomposition for Matrix Triplets

    Full text link
    We propose a restricted SVD based CUR (RSVD-CUR) decomposition for matrix triplets (A, B, G). Given matrices A, B, and G of compatible dimensions, such a decomposition provides a coordinated low-rank approximation of the three matrices using a subset of their rows and columns. We pick the subset of rows and columns of the original matrices by applying either the discrete empirical interpolation method (DEIM) or the L-DEIM scheme to the orthogonal and nonsingular matrices from the restricted singular value decomposition of the matrix triplet. We investigate the connections between a DEIM type RSVD-CUR approximation, a DEIM type CUR factorization, and a DEIM type generalized CUR decomposition. We provide an error analysis showing that the accuracy of the proposed RSVD-CUR decomposition is within a factor of the approximation error of the restricted singular value decomposition of the given matrices. An RSVD-CUR factorization may be suitable for applications where we are interested in approximating one data matrix relative to two other given matrices. Two applications that we discuss are multi-view/label dimension reduction and data perturbation problems of the form A_E = A + BFG, where BFG is a nonwhite noise matrix. In numerical experiments, we show the advantages of the new method over the standard CUR approximation for these applications.
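    The sketch below illustrates only the DEIM-type CUR building block that the abstract relates the RSVD-CUR to: greedy DEIM index selection on singular-vector factors, followed by assembly of C, U, and R. The full RSVD-CUR would instead run the selection on factors of the restricted SVD of the triplet (A, B, G); that step, and L-DEIM, are omitted here.

```python
# Minimal sketch of a DEIM-type CUR factorization (not the full RSVD-CUR).
import numpy as np

def deim_indices(V):
    """Greedy DEIM point selection on the columns of V (n x k)."""
    n, k = V.shape
    p = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, k):
        c = np.linalg.solve(V[p, :j], V[p, j])   # interpolation coefficients
        r = V[:, j] - V[:, :j] @ c               # interpolation residual
        p.append(int(np.argmax(np.abs(r))))
    return np.array(p)

def deim_cur(A, rank):
    """CUR approximation A ~ C @ U @ R with DEIM-selected rows/columns."""
    U_svd, _, Vt = np.linalg.svd(A, full_matrices=False)
    rows = deim_indices(U_svd[:, :rank])
    cols = deim_indices(Vt[:rank, :].T)
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 30)) @ rng.standard_normal((30, 150))
C, U, R = deim_cur(A, rank=15)
print("relative error:", np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))
```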

    Recent Advances of Manifold Regularization

    Get PDF
    Semi-supervised learning (SSL), which can use a small amount of labeled data together with a large amount of unlabeled data to produce significant improvements in learning performance, has received considerable attention. Manifold regularization is one of the most popular such approaches: it exploits the geometry of the probability distribution that generates the data and incorporates it as an additional regularization term. Representative manifold regularization methods include Laplacian regularization (LapR), Hessian regularization (HesR), and p-Laplacian regularization (pLapR). Based on the manifold regularization framework, many extensions and applications have been reported. In this chapter, we review LapR and HesR and introduce an approximation algorithm for the graph p-Laplacian. We also study several extensions of this framework for pairwise constraints, p-Laplacian learning, hypergraph learning, etc.
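    As a concrete illustration of Laplacian regularization (LapR), the sketch below penalizes f^T L f on a k-NN graph so that predictions vary smoothly over the data manifold; the two-moons data, the value of gamma, and the closed-form solve are illustrative choices, not taken from the chapter.

```python
# Hedged sketch of Laplacian regularization for semi-supervised learning.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import kneighbors_graph

X, y = make_moons(n_samples=300, noise=0.08, random_state=0)
labeled = np.zeros(len(X), dtype=bool)
labeled[np.random.default_rng(0).choice(len(X), size=10, replace=False)] = True

# Graph Laplacian L = D - W on a symmetrized k-NN graph.
W = kneighbors_graph(X, n_neighbors=8, mode="connectivity").toarray()
W = np.maximum(W, W.T)
L = np.diag(W.sum(axis=1)) - W

# Minimize sum over labeled points of (f_i - y_i)^2 + gamma * f^T L f.
gamma = 0.1
J = np.diag(labeled.astype(float))
targets = np.where(labeled, 2.0 * y - 1.0, 0.0)     # labels in {-1, +1}
f = np.linalg.solve(J + gamma * L + 1e-8 * np.eye(len(X)), J @ targets)

pred = (f > 0).astype(int)
print("accuracy on unlabeled points:", (pred[~labeled] == y[~labeled]).mean())
```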

    Transmission line fault-cause identification based on hierarchical multiview feature selection

    Get PDF
    Fault-cause identification plays a significant role in transmission line maintenance and fault disposal. With the increasing variety of monitoring data, e.g., micrometeorology and geographic information, multiview learning can be used to fuse this information for better fault-cause identification. To reduce the redundant information across the different types of monitoring data, this paper proposes a hierarchical multiview feature selection (HMVFS) method to address the challenge of combining waveform and contextual fault features. To enhance the discriminant ability of the model, an ε-dragging technique is introduced to enlarge the boundary between different classes. To effectively select a useful feature subset, two regularization terms, namely an l2,1-norm and a Frobenius norm penalty, are adopted to conduct hierarchical feature selection on the multiview data. An iterative optimization algorithm is then developed to solve the proposed method, and its convergence is theoretically proven. Waveform and contextual features are extracted from field data and used to evaluate the proposed HMVFS. The experimental results demonstrate the effectiveness of the combined use of waveform and contextual fault features and reveal the superior performance and application potential of HMVFS. This work was supported by the National Natural Science Foundation of China (61903091) and the Science and Technology Project of China Southern Power Grid Company Limited (031800KK52180074).
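    The sketch below shows only the l2,1-norm feature-selection building block mentioned above, solved with the standard iteratively reweighted scheme and applied to synthetic data; the ε-dragging term, the Frobenius-norm penalty on the hierarchical view structure, and the authors' full optimization algorithm are omitted.

```python
# Hedged sketch: min_W ||XW - Y||_F^2 + lam * ||W||_{2,1}, then rank features
# by the row norms of W (row-sparse W drops uninformative features).
import numpy as np

def l21_feature_selection(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    d = X.shape[1]
    D = np.eye(d)                                   # reweighting matrix
    for _ in range(n_iter):
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
        row_norms = np.linalg.norm(W, axis=1)
        D = np.diag(1.0 / (2.0 * row_norms + eps))  # D_ii = 1 / (2||w_i||_2)
    return np.linalg.norm(W, axis=1)                # per-feature importance

rng = np.random.default_rng(0)
n, d, c = 200, 30, 3
X = rng.standard_normal((n, d))
true_W = np.zeros((d, c))
true_W[:5] = rng.standard_normal((5, c))            # only 5 informative features
Y = X @ true_W + 0.05 * rng.standard_normal((n, c))

scores = l21_feature_selection(X, Y, lam=5.0)
print("top features:", np.argsort(scores)[::-1][:5])
```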

    A generic Self-Supervised Learning (SSL) framework for representation learning from spectral–spatial features of unlabeled remote sensing imagery

    Get PDF
    Remote sensing data have been widely used for various Earth Observation (EO) missions such as land use and cover classification, weather forecasting, agricultural management, and environmental monitoring. Most existing remote-sensing-data-based models rely on supervised learning, which requires large, representative human-labeled datasets for model training and is costly and time-consuming. The recent introduction of self-supervised learning (SSL) enables models to learn representations from orders of magnitude more unlabeled data. The success of SSL depends heavily on a pre-designed pretext task, which introduces an inductive bias into the model from a large amount of unlabeled data. Since remote sensing imagery has rich spectral information beyond the standard RGB color space, the pretext tasks established in computer vision for RGB images may not extend straightforwardly to the multi/hyperspectral domain. To address this challenge, this work proposes a generic self-supervised learning framework for remote sensing data at both the object and pixel levels. The method contains two novel pretext tasks, one for object-based and one for pixel-based remote sensing data analysis. The first pretext task reconstructs the spectral profile from masked data; it can be used to extract a representation of pixel information and improve the performance of downstream tasks associated with pixel-based analysis. The second pretext task identifies objects from multiple views of the same object in multispectral data; it can be used to extract a representation and improve the performance of downstream tasks associated with object-based analysis. The results of two typical downstream task evaluations (a multilabel land cover classification task on Sentinel-2 multispectral datasets and a ground soil parameter retrieval task on hyperspectral datasets) demonstrate that the proposed SSL method learns a target representation that covers both spatial and spectral information from massive unlabeled data. A comparison with currently available SSL methods shows that the proposed method, which emphasizes both spectral and spatial features, outperforms existing SSL methods on multi- and hyperspectral remote sensing datasets. We believe that this approach has the potential to be effective in a wider range of remote sensing applications, and we will explore its utility in more remote sensing applications in the future.
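    As a rough illustration of the pixel-level pretext task, the sketch below masks random spectral bands of synthetic pixel spectra and trains an off-the-shelf regressor to reconstruct the full spectrum; the synthetic data and the MLP stand in for real hyperspectral imagery and the authors' network, and the object-level multi-view task is not shown.

```python
# Hedged sketch of a masked spectral-reconstruction pretext task.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_pixels, n_bands = 2000, 32
# Synthetic "spectra": mixtures of a few random endmembers plus noise.
endmembers = rng.random((4, n_bands))
abundances = rng.dirichlet(np.ones(4), size=n_pixels)
spectra = abundances @ endmembers + 0.01 * rng.standard_normal((n_pixels, n_bands))

# Pretext task: zero out ~30% of the bands per pixel, reconstruct all bands.
mask = rng.random((n_pixels, n_bands)) < 0.3
masked_spectra = np.where(mask, 0.0, spectra)

model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
model.fit(masked_spectra, spectra)

recon = model.predict(masked_spectra)
err = np.abs(recon - spectra)[mask].mean()
print("mean reconstruction error on masked bands:", round(float(err), 4))
```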

    Coarse-to-Fine Contrastive Learning on Graphs

    Full text link
    Inspired by the impressive success of contrastive learning (CL), a variety of graph augmentation strategies have been employed to learn node representations in a self-supervised manner. Existing methods construct the contrastive samples by adding perturbations to the graph structure or node attributes. Although impressive results are achieved, such methods are rather blind to the wealth of prior information that can be assumed: as the degree of perturbation applied to the original graph increases, 1) the similarity between the original graph and the generated augmented graph gradually decreases, and 2) the discrimination between all nodes within each augmented view gradually increases. In this paper, we argue that both kinds of prior information can be incorporated (differently) into the contrastive learning paradigm following our general ranking framework. In particular, we first interpret CL as a special case of learning to rank (L2R), which inspires us to leverage the ranking order among positive augmented views. Meanwhile, we introduce a self-ranking paradigm to ensure that the discriminative information among different nodes is maintained and is less altered by perturbations of different degrees. Experimental results on various benchmark datasets verify the effectiveness of our algorithm compared with supervised and unsupervised models.
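    A hedged sketch of the ranking intuition: views produced by weaker perturbations should rank as more similar to the anchor than strongly perturbed views, which in turn should outrank negatives. The pairwise margin loss below only illustrates this ordering constraint and is not the paper's exact objective.

```python
# Illustrative ranking-style contrastive loss over cosine similarities.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def ranking_contrastive_loss(anchor, views_by_strength, negatives, margin=0.1):
    """views_by_strength: embeddings ordered from weakest to strongest
    augmentation. Penalize every violated pairwise ordering."""
    sims = [cosine(anchor, v) for v in views_by_strength]
    sims += [cosine(anchor, n) for n in negatives]        # negatives rank last
    loss = 0.0
    for i in range(len(sims)):
        for j in range(i + 1, len(sims)):
            loss += max(0.0, margin - (sims[i] - sims[j]))  # want sims[i] > sims[j]
    return loss

rng = np.random.default_rng(0)
anchor = rng.standard_normal(16)
views = [anchor + s * rng.standard_normal(16) for s in (0.1, 0.5, 1.0)]
negatives = [rng.standard_normal(16) for _ in range(3)]
print("loss:", round(ranking_contrastive_loss(anchor, views, negatives), 3))
```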