147 research outputs found

    Dual-random ensemble method for multi-label classification of biological data

    This paper presents a dual-random ensemble method for multi-label classification. The method is formed by integrating and extending the concepts of the feature subspace method and the random k-label set ensemble multi-label classification method. Experimental results show that the developed method outperforms existing multi-label classification methods on three different multi-label datasets, including the biological yeast and genbase datasets.
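    As a rough illustration of the dual-random idea described above, the sketch below draws, for each ensemble member, a random feature subspace and a random k-label subset, trains an ordinary classifier on the label powerset of that subset, and averages the members' per-label predictions. The class name, the decision-tree base learner and the parameter defaults are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of a dual-random ensemble: each member sees a random
# feature subspace and a random k-label subset (label powerset on that subset).
# Requires numpy and scikit-learn; all parameters are illustrative only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class DualRandomEnsemble:
    def __init__(self, n_estimators=10, k=3, feat_frac=0.5, random_state=0):
        self.n_estimators = n_estimators
        self.k = k                      # size of each random label subset
        self.feat_frac = feat_frac      # fraction of features per member
        self.rng = np.random.RandomState(random_state)

    def fit(self, X, Y):
        # X: (n_samples, n_features), Y: binary indicator (n_samples, n_labels)
        n_features, n_labels = X.shape[1], Y.shape[1]
        self.n_labels_ = n_labels
        self.members_ = []
        for _ in range(self.n_estimators):
            feats = self.rng.choice(n_features,
                                    max(1, int(self.feat_frac * n_features)),
                                    replace=False)
            labels = self.rng.choice(n_labels, min(self.k, n_labels),
                                     replace=False)
            # Label powerset on the chosen subset: each distinct combination
            # of the k labels becomes one multi-class target value.
            keys = [tuple(row) for row in Y[:, labels]]
            classes = sorted(set(keys))
            y = np.array([classes.index(key) for key in keys])
            clf = DecisionTreeClassifier().fit(X[:, feats], y)
            self.members_.append((feats, labels, classes, clf))
        return self

    def predict(self, X):
        votes = np.zeros((X.shape[0], self.n_labels_))
        counts = np.zeros(self.n_labels_)
        for feats, labels, classes, clf in self.members_:
            pred = clf.predict(X[:, feats])
            votes[:, labels] += np.array([classes[p] for p in pred])
            counts[labels] += 1
        # Average the votes of the members that cover each label.
        return (votes / np.maximum(counts, 1) >= 0.5).astype(int)
```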

    Multi-dimensional classification with super-classes

    The multi-dimensional classification problem is a generalisation of the recently popularised task of multi-label classification, where each data instance is associated with multiple class variables. There has been relatively little research carried out specific to multi-dimensional classification and, although one of the core goals is similar (modelling dependencies among classes), there are important differences, namely a higher number of possible classifications. In this paper we present a method for multi-dimensional classification, drawing from the most relevant multi-label research and combining it with important novel developments. Using a fast method to model the conditional dependence between class variables, we form super-class partitions and use them to build multi-dimensional learners, learning each super-class as an ordinary class and thus explicitly modelling class dependencies. Additionally, we present a mechanism to deal with the many class values inherent to super-classes, and thus make learning efficient. To investigate the effectiveness of this approach, we carry out an empirical evaluation on a range of multi-dimensional datasets, under different evaluation metrics, and in comparison with high-performing existing multi-dimensional approaches from the literature. Analysis of the results shows that our approach offers important performance gains over competing methods, while also exhibiting tractable running time.
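    A minimal sketch of the super-class construction follows, under the assumption that the partition of class variables is supplied by the caller; the paper derives its partitions from measured conditional dependence, which this sketch does not reproduce. The class name, the random-forest base learner and all parameters are illustrative.

```python
# Sketch: each partition of class variables is learned as one "super-class",
# i.e. every distinct joint assignment of its variables is one ordinary class.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

class SuperClassClassifier:
    def __init__(self, partitions):
        # partitions: lists of class-variable indices covering every variable,
        # e.g. [[0, 2], [1]]
        self.partitions = partitions

    def fit(self, X, Y):
        # Y: (n_samples, n_class_variables) matrix of integer class values
        self.models_ = []
        for part in self.partitions:
            keys = [tuple(row) for row in Y[:, part]]
            classes = sorted(set(keys))          # distinct joint assignments
            y = np.array([classes.index(key) for key in keys])
            model = RandomForestClassifier(n_estimators=50).fit(X, y)
            self.models_.append((part, classes, model))
        return self

    def predict(self, X):
        n_vars = sum(len(p) for p in self.partitions)
        out = np.zeros((X.shape[0], n_vars), dtype=int)
        for part, classes, model in self.models_:
            pred = model.predict(X)
            # Map each predicted super-class back to its class-variable values.
            out[:, part] = np.array([classes[p] for p in pred])
        return out
```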

    Clustering based multi-label classification for image annotation and retrieval

    This paper presents a novel multi-label classification framework for domains with large numbers of labels. Automatic image annotation is such a domain, as the available semantic concepts typically number in the hundreds. The proposed framework comprises an initial clustering phase that breaks the original training set into several disjoint clusters of data. It then trains a multi-label classifier from the data of each cluster. Given a new test instance, the framework first finds the nearest cluster and then applies the corresponding model. Empirical results using two clustering algorithms, four multi-label classification algorithms and three image annotation datasets suggest that the proposed approach can improve the performance and reduce the training time of standard multi-label classification algorithms, particularly in the case of a large number of labels.
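    The cluster-then-classify pipeline described above can be sketched roughly as follows: k-means splits the training set into disjoint clusters, one multi-label model is trained on each cluster's data, and a test instance is routed to the model of its nearest cluster centre. The particular choices here (KMeans, a multi-output random forest, the class name) are placeholders, not the combinations evaluated in the paper.

```python
# Rough sketch of cluster-then-classify multi-label learning.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

class ClusteredMultiLabel:
    def __init__(self, n_clusters=5, random_state=0):
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, Y):
        # Y: binary label indicator matrix (n_samples, n_labels)
        self.n_labels_ = Y.shape[1]
        self.kmeans_ = KMeans(n_clusters=self.n_clusters, n_init=10,
                              random_state=self.random_state).fit(X)
        self.models_ = []
        for c in range(self.n_clusters):
            idx = self.kmeans_.labels_ == c
            # One multi-label (multi-output) model per disjoint cluster.
            clf = RandomForestClassifier(n_estimators=50,
                                         random_state=self.random_state)
            clf.fit(X[idx], Y[idx])
            self.models_.append(clf)
        return self

    def predict(self, X):
        nearest = self.kmeans_.predict(X)  # nearest cluster centre per instance
        out = np.zeros((X.shape[0], self.n_labels_), dtype=int)
        for c in range(self.n_clusters):
            idx = nearest == c
            if idx.any():
                out[idx] = self.models_[c].predict(X[idx])
        return out
```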

    Multilabel classification by BCH code and random forests

    This paper uses error-correcting codes for multilabel classification. A BCH code and a random forests learner are combined to form the proposed method, merging the error-correcting properties of the BCH code with the good performance of the random forests learner to enhance multilabel classification results. Three experiments are conducted on three common benchmark datasets, and the results are compared against those of several existing approaches. The proposed method performs well against its counterparts on the three datasets of varying characteristics.
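    A self-contained sketch of the coding idea described above: encode the label vector with an error-correcting code, train a forest to predict the code bits, and decode the predicted bits back into labels. To avoid depending on a BCH library, a simple bit-wise repetition code stands in for the BCH code used in the paper; the function names, repetition factor and forest size are assumptions for illustration.

```python
# Sketch: encode label vectors with an error-correcting code, train a
# multi-output random forest to predict the code bits, then decode the
# (possibly noisy) predicted bits back into labels. A 3x repetition code
# with majority-vote decoding replaces the BCH code used in the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

REPEAT = 3  # each label bit is repeated 3 times; majority vote fixes one flip

def encode(Y):
    # (n_samples, n_labels) -> (n_samples, n_labels * REPEAT)
    return np.repeat(Y, REPEAT, axis=1)

def decode(C):
    # Majority vote over each group of REPEAT predicted code bits.
    n_labels = C.shape[1] // REPEAT
    grouped = C.reshape(C.shape[0], n_labels, REPEAT)
    return (grouped.mean(axis=2) >= 0.5).astype(int)

def fit(X, Y):
    # One multi-output forest predicts all code bits jointly.
    return RandomForestClassifier(n_estimators=100).fit(X, encode(Y))

def predict(model, X):
    return decode(np.asarray(model.predict(X)))
```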