The importance of categorical reasoning in human cognition is well established in psychology and cognitive science, and it is generally acknowledged that one of the most important functions of categorization is to facilitate prediction. This paper provides a model of optimal categorization. At the beginning of each period a subject observes a two-dimensional object in one dimension and wants to predict the object's value in the other dimension. The subject partitions the space of objects into categories. She has a database of objects that were observed in both dimensions in the past. The subject determines which category the new object belongs to on the basis of its observed first dimension. The average second-dimension value of the objects in this category in the database is used as the prediction for the object at hand. At the end of each period the second dimension is observed and the observation is stored in the database. The main result is that the optimal number of categories is determined by a trade-off between (a) decreasing the size of categories in order to enhance category homogeneity, and (b) increasing the size of categories in order to enhance category sample size.

Keywords: Categorization; Priors; Prediction; Similarity-Based Reasoning.
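The prediction procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formal model: the class name `CategoryPredictor`, the representation of categories as intervals on a one-dimensional first dimension, and the choice to return `None` for a category with no stored observations are all assumptions made here for illustration.

```python
from bisect import bisect_right
from statistics import mean


class CategoryPredictor:
    """Hypothetical illustration of category-based prediction.

    Assumption: the first dimension is a real number and categories are
    the intervals delimited by a sorted list of cut points.
    """

    def __init__(self, boundaries):
        # Cut points that partition the first dimension into categories.
        self.boundaries = sorted(boundaries)
        # Database of past objects observed in both dimensions: (x, y) pairs.
        self.database = []

    def category(self, x):
        # Index of the interval that x falls into.
        return bisect_right(self.boundaries, x)

    def predict(self, x):
        # Average second-dimension value of stored objects in x's category.
        k = self.category(x)
        ys = [y for (xi, y) in self.database if self.category(xi) == k]
        # Assumption: with no data in the category, no prediction is made.
        return mean(ys) if ys else None

    def observe(self, x, y):
        # End of period: the second dimension is revealed and stored.
        self.database.append((x, y))


predictor = CategoryPredictor(boundaries=[0.5])  # two categories
predictor.observe(0.1, 1.0)
predictor.observe(0.3, 3.0)
predictor.observe(0.8, 10.0)
low_prediction = predictor.predict(0.2)   # mean of 1.0 and 3.0
high_prediction = predictor.predict(0.9)  # only one stored object
```

In this sketch, adding cut points makes each category narrower, so the objects in it are more alike, but leaves fewer stored observations per category to average over. That is the trade-off between homogeneity and sample size that the main result refers to.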