Unsupervised learning algorithms are designed to extract structure from data without reference to explicit teacher information. The quality of the learned structure is determined by a cost function that guides the learning process. This paper proposes Empirical Risk Approximation as a new induction principle for unsupervised learning. The complexity of the unsupervised learning models is automatically controlled by two conditions for learning: (i) the empirical risk of learning should converge uniformly towards the expected risk; (ii) the hypothesis class should retain a minimal variety for consistent inference. The maximum entropy principle with deterministic annealing as an efficient search strategy arises from the Empirical Risk Approximation principle as the optimal inference strategy for large learning problems. Parameter selection of learnable data structures is demonstrated for the case of k-means clustering.

1 What is unsupervised learning?

Learning algorithms are desi..
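The k-means demonstration mentioned in the abstract can be illustrated with a minimal sketch in plain Python on synthetic data. This is the standard Lloyd-style assignment/update iteration with a farthest-first initialization, not the paper's annealing variant; all names and the toy data are illustrative assumptions.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: farthest-first init, then alternate assignment/update."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assignment step: nearest centroid wins
            clusters[min(range(k), key=lambda j: dist2(p, centroids[j]))].append(p)
        for j, c in enumerate(clusters):  # update step: centroid = cluster mean
            if c:
                centroids[j] = (sum(x for x, _ in c) / len(c),
                                sum(y for _, y in c) / len(c))
    return centroids

# Two well-separated synthetic blobs around (0, 0) and (5, 5).
rng = random.Random(1)
pts = ([(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(50)]
       + [(rng.gauss(5, 0.1), rng.gauss(5, 0.1)) for _ in range(50)])
cs = sorted(kmeans(pts, 2))
print(cs)  # centroids recovered near (0, 0) and (5, 5)
```

In the annealed variant discussed in the paper, the hard nearest-centroid assignment above would be replaced by soft, temperature-controlled assignment probabilities that sharpen as the temperature is lowered.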