    Model selection for semi-supervised clustering

    Although there is a large and growing literature that tackles the semi-supervised clustering problem (i.e., using some labeled objects or cluster-guiding constraints like "must-link" or "cannot-link"), the evaluation of semi-supervised clustering approaches has rarely been discussed. The application of cross-validation techniques, for example, is far from straightforward in the semi-supervised setting, and the problems associated with evaluation have not yet been addressed. Here we summarize these problems and provide a solution. Furthermore, in order to demonstrate the practical applicability of semi-supervised clustering methods, we provide a method for model selection in semi-supervised clustering based on this sound evaluation procedure. Our method allows the user to select, based on the available information (labels or constraints), the most appropriate clustering model (e.g., number of clusters, density parameters) for a given problem.

    NSERC (Canada); FAPESP (Brazil); CNPq (Brazil)