Trip purpose identification using pairwise constraints based semi-supervised clustering
Clustering of smart card data captured by automated fare collection (AFC) systems has traditionally
been viewed as an unsupervised method. However, a small number of labelled data points, used alongside
the unlabelled smart card data, can facilitate better partitioning and classification. In this paper, prior
knowledge about the activities is translated into pairwise constraints and used in the COP-KMEANS
clustering algorithm to identify the trip purpose. The effectiveness of the method was evaluated by
comparison of the results with the ground truth. The results demonstrate that semi-supervised clustering
enhances the accuracy of trip purpose identification.
Semi-supervised dimensionality reduction using pairwise equivalence constraints
To deal with the problem of insufficient labeled data, side information, given in the form of pairwise equivalence constraints between points, is usually used to discover groups within data. However, existing methods using side information typically fail in high-dimensional spaces. In this paper, we address the problem of learning from side information for high-dimensional data. To this end, we propose a semi-supervised dimensionality reduction scheme that incorporates pairwise equivalence constraints to find a better embedding space, which improves the performance of subsequent clustering and classification phases. Our method builds on the assumption that points in a sufficiently small neighborhood tend to have the same label. Equivalence constraints are employed to modify the neighborhoods and to increase the separability of different classes. Experimental results on high-dimensional image data sets show that integrating side information into the dimensionality reduction improves clustering and classification performance.
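One way to picture "modifying the neighborhoods" with equivalence constraints is sketched below. This is a loose illustration and not the method proposed in the paper: it builds a k-nearest-neighbour graph, adds edges for must-link pairs, removes edges for cannot-link pairs, and then computes a plain (unnormalised) Laplacian eigenmap. The function names and parameters are hypothetical, and the embedding step assumes the resulting graph is connected.

```python
import numpy as np

def constrained_knn_graph(X, n_neighbors, must_link=(), cannot_link=()):
    """Symmetric kNN adjacency matrix, edited by pairwise constraints."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances; a point is never its own neighbour.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nearest = np.argsort(d, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    for i in range(n):
        W[i, nearest[i]] = 1.0
    W = np.maximum(W, W.T)  # symmetrise
    for a, b in must_link:      # force same-class points to be neighbours
        W[a, b] = W[b, a] = 1.0
    for a, b in cannot_link:    # cut edges between different-class points
        W[a, b] = W[b, a] = 0.0
    return W

def laplacian_embedding(W, n_components):
    """Embed nodes using eigenvectors of the graph Laplacian L = D - W,
    skipping the first (constant, for a connected graph) eigenvector."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    return vecs[:, 1:n_components + 1]

# Hypothetical high-dimensional data: 12 points in 20 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 20))
W = constrained_knn_graph(X, 3, must_link=[(0, 5)], cannot_link=[(1, 2)])
Y = laplacian_embedding(W, n_components=2)
```

The low-dimensional coordinates `Y` could then be passed to a clustering or classification step, which is the role the abstract assigns to the embedding space.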