Information-Maximization Clustering based on Squared-Loss Mutual Information
Information-maximization clustering learns a probabilistic classifier in an
unsupervised manner so that mutual information between feature vectors and
cluster assignments is maximized. A notable advantage of this approach is that
it only involves continuous optimization of model parameters, which is
substantially easier to solve than discrete optimization of cluster
assignments. However, existing methods still involve non-convex optimization
problems, so finding a good local optimum is not straightforward in
practice. In this paper, we propose an alternative
information-maximization clustering method based on a squared-loss variant of
mutual information. This novel approach gives a clustering solution
analytically in a computationally efficient way via kernel eigenvalue
decomposition. Furthermore, we provide a practical model selection procedure
that allows us to objectively optimize tuning parameters included in the kernel
function. Through experiments, we demonstrate the usefulness of the proposed
approach
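The abstract's core idea, an analytic clustering solution obtained from the top eigenvectors of a kernel matrix, can be illustrated with a spectral-style sketch. This is a generic illustration of kernel eigendecomposition clustering, not the paper's exact SMIC procedure; the kernel width `sigma` stands in for the tuning parameters that the paper selects via its SMI-based model selection.

```python
import numpy as np

def kernel_eig_clustering(X, n_clusters, sigma=1.0):
    """Toy clustering via eigendecomposition of a Gaussian kernel matrix.

    Illustrative sketch only: the embedding comes analytically from the
    top eigenvectors; a few k-means steps then discretize it.
    """
    # Gaussian (RBF) kernel matrix
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))
    # Degree-normalize so eigenvectors behave like a spectral embedding
    d = K.sum(axis=1)
    K_norm = K / np.sqrt(np.outer(d, d))
    # np.linalg.eigh returns eigenvalues in ascending order;
    # take the eigenvectors of the largest n_clusters eigenvalues
    _, V = np.linalg.eigh(K_norm)
    emb = V[:, -n_clusters:]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    # Farthest-point initialization, then plain k-means on the embedding
    centers = [emb[0]]
    for _ in range(1, n_clusters):
        dists = np.min([((emb - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(emb[int(np.argmax(dists))])
    centers = np.array(centers)
    for _ in range(50):
        labels = np.argmin(((emb[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = emb[labels == k].mean(axis=0)
    return labels

# Two well-separated synthetic blobs (hypothetical data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
labels = kernel_eig_clustering(X, 2)
```

The appeal noted in the abstract is that the continuous part of the solution (the embedding) is obtained in closed form from an eigenvalue problem, avoiding the non-convex inner optimization of other information-maximization methods.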
A semi-supervised feature clustering algorithm with application to word sense disambiguation
In this paper we investigate an application of feature clustering to word sense disambiguation and propose a semi-supervised feature clustering algorithm. Compared with other feature clustering methods (e.g., supervised feature clustering), it can infer the distribution of class labels over (unseen) features unavailable in the training data (labeled data) by using the distribution of class labels over (seen) features available in the training data. Thus, it can handle both seen and unseen features in the feature clustering process. Our experimental results show that feature clustering can aggressively reduce the dimensionality of the feature space while still maintaining state-of-the-art sense disambiguation accuracy. Furthermore, when combined with a semi-supervised WSD algorithm, semi-supervised feature clustering outperforms other dimensionality reduction techniques, which indicates that using unlabeled data in the learning process helps to improve both feature clustering and sense disambiguation.
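The key mechanism described above, inferring a class-label distribution for an unseen feature from the distributions of seen features, can be sketched as a co-occurrence-weighted average. This is a hypothetical illustration under assumed data, not the paper's exact algorithm: the feature names, counts, and the averaging rule are all assumptions made for the example.

```python
import numpy as np

# Hypothetical P(sense | feature) for *seen* features, estimated from
# labeled data (two senses, e.g. of an ambiguous target word).
seen_dist = {
    "ctx_river": np.array([0.9, 0.1]),  # mostly sense 1
    "ctx_money": np.array([0.1, 0.9]),  # mostly sense 2
}

# Hypothetical co-occurrence counts of an *unseen* feature with seen
# features, gathered from unlabeled data.
cooc = {"ctx_water": {"ctx_river": 8, "ctx_money": 2}}

def infer_unseen(feature):
    """Infer P(sense | unseen feature) as a co-occurrence-weighted
    average of the distributions of the seen features it occurs with."""
    counts = cooc[feature]
    total = sum(counts.values())
    dist = sum((c / total) * seen_dist[f] for f, c in counts.items())
    return dist / dist.sum()  # renormalize to a proper distribution

dist = infer_unseen("ctx_water")
```

Once every feature, seen or unseen, carries a class-label distribution, features can be clustered by the similarity of those distributions, which is what lets the method cover features absent from the labeled data.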