
    Trip purpose identification using pairwise constraints based semi-supervised clustering

    Clustering of smart card data captured by automated fare collection (AFC) systems has traditionally been treated as an unsupervised task. However, a small number of labelled data points, used alongside the unlabelled smart card data, can improve partitioning and classification. In this paper, prior knowledge about the activities is translated into pairwise constraints and used in the COP-KMEANS clustering algorithm to identify trip purpose. The effectiveness of the method was evaluated by comparing the results with the ground truth. The results demonstrate that semi-supervised clustering improves the accuracy of trip purpose identification.
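To make the role of pairwise constraints concrete, the following is a minimal sketch of constrained k-means in the spirit of COP-KMEANS. It is not the paper's implementation: the function names (`cop_kmeans`, `violates`), the random initialization, and the toy data are assumptions made for illustration. Each point is assigned to the nearest center that does not break a must-link or cannot-link constraint with a point already assigned in the same pass; if no such center exists, the run reports failure by returning `None`.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def violates(i, cluster, labels, must_link, cannot_link):
    """True if placing point i in `cluster` breaks a constraint with an
    already-assigned point (label -1 means 'not assigned yet')."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] not in (-1, cluster):
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] == cluster:
            return True
    return False


def cop_kmeans(points, k, must_link=(), cannot_link=(), iters=50, seed=0):
    """Constrained k-means sketch; returns a label list, or None if some
    point cannot be placed without violating a constraint."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # assumed init scheme
    labels = [-1] * len(points)
    for _ in range(iters):
        new_labels = [-1] * len(points)
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest center.
            order = sorted(range(k), key=lambda c: sq_dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new_labels, must_link, cannot_link):
                    new_labels[i] = c
                    break
            else:
                return None  # no feasible cluster for point i
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [points[i] for i, l in enumerate(labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = cop_kmeans(points, k=2, must_link=[(0, 1)], cannot_link=[(0, 3)])
```

In a trip-purpose setting, a must-link pair would presumably tie together two trips known to share a purpose and a cannot-link pair would separate two trips known to differ; how such pairs are derived from labelled data is not specified in the abstract.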

    Semi-supervised dimensionality reduction using pairwise equivalence constraints

    To deal with the problem of insufficient labeled data, side information, given in the form of pairwise equivalence constraints between points, is often used to discover groups within data. However, existing methods that use side information typically fail in high-dimensional spaces. In this paper, we address the problem of learning from side information for high-dimensional data. To this end, we propose a semi-supervised dimensionality reduction scheme that incorporates pairwise equivalence constraints to find a better embedding space, which improves the performance of subsequent clustering and classification phases. Our method builds on the assumption that points in a sufficiently small neighborhood tend to have the same label. Equivalence constraints are employed to modify the neighborhoods and to increase the separability of different classes. Experimental results on high-dimensional image data sets show that integrating side information into the dimensionality reduction improves clustering and classification performance.
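The abstract does not say how the neighborhoods are modified, so the following is only one plausible reading, sketched under stated assumptions: build a symmetric k-nearest-neighbor graph, add an edge for every must-link (equivalent) pair, and remove the edge for every cannot-link (non-equivalent) pair. The function names `knn_graph` and `apply_constraints` and the toy data are hypothetical, and the embedding step that would follow is omitted.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as a dict: index -> set of indices."""
    n = len(points)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(points[i], points[j]),
        )
        for j in ranked[:k]:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


def apply_constraints(neighbors, must_link, cannot_link):
    """Return a copy of the graph with must-link pairs connected and
    cannot-link pairs disconnected (an assumed modification rule)."""
    adjusted = {i: set(js) for i, js in neighbors.items()}
    for a, b in must_link:
        adjusted[a].add(b)
        adjusted[b].add(a)
    for a, b in cannot_link:
        adjusted[a].discard(b)
        adjusted[b].discard(a)
    return adjusted


# Hypothetical 1-D toy data.
points = [(0,), (1,), (2,), (10,)]
graph = knn_graph(points, k=1)
adjusted = apply_constraints(graph, must_link=[(0, 3)], cannot_link=[(2, 3)])
```

A graph adjusted this way could then feed a neighborhood-preserving embedding, so that equivalent points are pulled together and non-equivalent points are no longer treated as neighbors.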