The multi-armed bandit is a natural means of representing customer switching in response to “noisy” quality or value. However, the complexity of many choice models makes them difficult to work with analytically in the bandit setting. While more tractable stylized models exist, their ability to represent actual choice behavior has not been thoroughly examined. In this paper, we investigate the usefulness of simple models of choice in bandit problems. We consider two sets of models: one derived from normative theory, and the other comprising descriptive models from the psychology of choice behavior. We test their performance in experiments in which subjects solve two-armed Bernoulli bandit problems. We find that the most tractable of the models perform best in tests of model fit. We also find that measures of the expected run length (the expected number of consecutive pulls on a given arm) are increasing and convex with respect to the underlying arm probabilities.

Thanks to Gabriel Silvasi for developing the software for the experiments, as well as for providing computing support for the analysis of the data. Thanks also to Eliza Pons, who facilitated the testing of the software and the recruiting of subjects. We are grateful to Lawrence Brown, Dean Foster, Wes Hutchinson, and Bob Meyer for suggestions concernin