
    Interactive Causal Correlation Space Reshape for Multi-Label Classification

    Most existing multi-label classification models rely on distance metrics and feature sparsity strategies to extract label-specific features. These models use cosine similarity to construct a label correlation matrix that constrains the solution space, and then mine the latent semantic information of the label space. However, the label correlation matrix is usually added to the model directly, which ignores the interactive causality of the correlations between labels. Moreover, extracting label-specific features by distance alone can fail when distances become uninformative in high-dimensional space, while sparse-weight-matrix methods depend on manually selected parameters; both ultimately degrade classifier performance. In addition, logical (binary) labels cannot describe the relative importance of different labels and therefore cannot fully express semantic information. Motivated by these issues, we propose an Interactive Causal Correlation Space Reshape for Multi-Label Classification (CCSRMC) algorithm. First, the algorithm constructs a label propagation matrix, exploiting the property that similar instances can be linearly represented by one another. Second, a label co-occurrence matrix is built with a conditional probability test and, together with label propagation, reshapes the label space to enrich label semantics. The co-occurrence matrix is then combined with the label correlation matrix to form an interactive causal correlation matrix, and multi-label classification is learned on the resulting numerical label matrix. Finally, the proposed algorithm is compared with several state-of-the-art algorithms on multiple benchmark multi-label datasets. The results show that accounting for interactive causal label correlation reduces redundant information in the model and improves the performance of the multi-label classifier.
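    As a rough illustration of two ingredients this abstract mentions, the sketch below builds a cosine-similarity label correlation matrix and a conditional-probability label co-occurrence matrix and then combines them; the function names, the toy label matrix, and the simple averaging rule are assumptions for illustration only, not the authors' CCSRMC implementation.

```python
# Minimal sketch (not the authors' CCSRMC code) of a cosine-similarity label
# correlation matrix and a conditional-probability label co-occurrence matrix.
import numpy as np

def label_correlation(Y: np.ndarray) -> np.ndarray:
    """Cosine similarity between label columns of a binary label matrix Y (n x q)."""
    norms = np.linalg.norm(Y, axis=0, keepdims=True) + 1e-12
    Yn = Y / norms
    return Yn.T @ Yn                      # q x q correlation matrix

def label_cooccurrence(Y: np.ndarray) -> np.ndarray:
    """Estimate P(label j | label i) from co-occurrence counts."""
    counts = Y.T @ Y                      # q x q co-occurrence counts
    prior = np.diag(counts) + 1e-12       # occurrences of each label
    return counts / prior[:, None]        # row i holds P(j | i)

# Toy example: 4 instances, 3 labels
Y = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
C = label_correlation(Y)
P = label_cooccurrence(Y)
# A combined matrix could weight the two terms; the specific combination rule
# used by CCSRMC is not reproduced here, so a plain average is shown.
M = 0.5 * (C + P)
print(M.round(2))
```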

    Enhancing Learning Object Analysis through Fuzzy C-Means Clustering and Web Mining Methods

    The development of learning objects (LOs) and e-pedagogical practices has significantly influenced the performance of e-learning systems. This development promotes genuine sharing of resources and creates new opportunities for learners to explore them easily, so a system for categorizing these objects becomes essential. In this vein, classification theories combined with web mining techniques can highlight the value of these LOs and make them more useful for learners. This study consists of two main phases. First, we extract metadata from learning objects using web mining techniques such as feature selection, which aims to find the best set of features for building useful models. The key role of feature selection in learning object classification is to identify pertinent features and eliminate redundant ones from a high-dimensional dataset. Second, we group learning objects by similarity using Multi-Label Classification (MLC) based on the Fuzzy C-Means (FCM) algorithm. As a clustering algorithm, Fuzzy C-Means assigns objects to clusters using Euclidean distance as the similarity measure. Finally, to assess the effectiveness of classifying LOs with FCM, a series of experiments on a real-world dataset were conducted. The findings indicate that the proposed approach outperforms the traditional approach and yields viable results. Doi: 10.28991/ESJ-2023-07-03-010
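    The clustering step described in this abstract can be illustrated with a generic Fuzzy C-Means routine. The sketch below is a minimal NumPy implementation using Euclidean distance; the function name, parameters, and toy data are assumptions, and it does not reproduce the paper's feature-selection pipeline or dataset.

```python
# Generic Fuzzy C-Means sketch in NumPy: soft-membership clustering with
# Euclidean distance (a standard FCM, not the paper's full pipeline).
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """X: (n, d) data; c: number of clusters; m: fuzzifier (> 1).
    Returns (cluster centers, fuzzy membership matrix)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)                  # random fuzzy memberships
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # weighted cluster centers
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (d ** (2 / (m - 1)))              # closer -> higher membership
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Toy usage: two well-separated blobs
X = np.vstack([np.random.default_rng(1).normal(0, 0.3, (20, 2)),
               np.random.default_rng(2).normal(3, 0.3, (20, 2))])
centers, U = fuzzy_c_means(X, c=2)
print(centers.round(2))   # approximate blob centers
print(U[:3].round(2))     # soft memberships of the first instances
```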

    Nearest Labelset Using Double Distances for Multi-label Classification

    Multi-label classification is a type of supervised learning where an instance may belong to multiple labels simultaneously. Predicting each label independently has been criticized for not exploiting any correlation between labels. In this paper, we propose a novel approach, Nearest Labelset using Double Distances (NLDD), which predicts the labelset observed in the training data that minimizes a weighted sum of the distances to the new instance in both the feature space and the label space. The weights specify the relative tradeoff between the two distances and are estimated from a binomial regression of the number of misclassified labels as a function of the two distances, with model parameters estimated by maximum likelihood. NLDD only considers labelsets observed in the training data, thus implicitly taking label dependencies into account. Experiments on benchmark multi-label data sets show that the proposed method on average outperforms other well-known approaches in terms of Hamming loss, 0/1 loss, and multi-label accuracy, and ranks second after ECC on the F-measure.
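    The nearest-labelset idea can be sketched as follows. This simplified version assumes fixed weights w_x and w_y and uses hypothetical toy data and predicted probabilities, whereas NLDD estimates the weights via binomial regression; it illustrates only the weighted double-distance selection criterion.

```python
# Simplified sketch of the nearest-labelset criterion (not the full NLDD method):
# candidate labelsets are those seen in training, and the prediction minimizes a
# weighted sum of a feature-space and a label-space distance. The weights are
# fixed here; NLDD estimates them via binomial regression.
import numpy as np

def nearest_labelset(x_new, p_new, X_train, Y_train, w_x=0.5, w_y=0.5):
    """x_new: (d,) features; p_new: (q,) predicted label probabilities;
    X_train: (n, d) training features; Y_train: (n, q) binary labelsets."""
    d_feat = np.linalg.norm(X_train - x_new, axis=1)   # feature-space distances
    d_lab = np.linalg.norm(Y_train - p_new, axis=1)    # label-space distances
    best = np.argmin(w_x * d_feat + w_y * d_lab)       # weighted double distance
    return Y_train[best]                               # labelset of best neighbor

# Toy usage with hypothetical data and marginal probabilities
X_train = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]])
Y_train = np.array([[1, 0], [0, 1], [1, 0]])
x_new = np.array([0.15, 0.18])
p_new = np.array([0.8, 0.3])   # e.g., from independent binary classifiers
print(nearest_labelset(x_new, p_new, X_train, Y_train))   # -> [1 0]
```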