Drawing an inspiration from behavioral studies of human decision making, we
propose here a general parametric framework for multi-armed bandit problem,
which extends the standard Thompson Sampling approach to incorporate reward
processing biases associated with several neurological and psychiatric
conditions, including Parkinson's and Alzheimer's diseases,
attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain.
We demonstrate empirically that the proposed parametric approach can often
outperform the baseline Thompson Sampling on a variety of datasets. Moreover,
from the behavioral modeling perspective, our parametric framework can be
viewed as a first step towards a unifying computational model capturing reward
processing abnormalities across multiple mental conditions.Comment: Conference on Artificial General Intelligence, AGI-1