Active classification, i.e., the sequential decision-making process aimed at
data acquisition for classification purposes, arises naturally in many
applications, including medical diagnosis, intrusion detection, and object
tracking. In this work, we study the problem of actively classifying dynamical
systems whose true model belongs to a finite set of candidate Markov decision process (MDP) models. We are
interested in finding strategies that actively interact with the dynamical
system, and observe its reactions so that the true model is determined
efficiently with high confidence. To this end, we present a decision-theoretic
framework based on partially observable Markov decision processes (POMDPs). The
proposed framework relies on maintaining a classification belief, i.e., a probability
distribution over the candidate MDP models. Given an initial belief, some
misclassification probabilities, a cost bound, and a finite time horizon, we
design POMDP strategies leading to classification decisions. We present two
approaches for finding such strategies. The first approach computes the
optimal strategy "exactly" using value iteration. Because exact solutions are
computationally expensive, the second approach uses adaptive
sampling to approximate the optimal probability of reaching a classification
decision. We illustrate the proposed methodology using two examples from
medical diagnosis and intruder detection.
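
To make the notion of a classification belief concrete, the following is a minimal sketch of one Bayesian belief update over candidate MDP models. It is an illustration under stated assumptions, not the paper's implementation: it assumes the system state is fully observed, that each candidate model is given as a dictionary mapping an action to a row-stochastic transition matrix, and the names `update_belief`, `models`, `model_a`, and `model_b` are hypothetical.

```python
import numpy as np


def update_belief(belief, models, state, action, next_state):
    """Return the posterior belief over candidate MDP models.

    Assumption: models[i][action] is a row-stochastic transition matrix
    for candidate model i, and the state is fully observed.
    """
    belief = np.asarray(belief, dtype=float)
    # Likelihood of the observed transition under each candidate model.
    likelihoods = np.array([m[action][state, next_state] for m in models])
    posterior = belief * likelihoods
    total = posterior.sum()
    if total == 0.0:
        # The transition is impossible under every model; keep the prior.
        return belief.copy()
    return posterior / total


# Two hypothetical candidate models with two states and one action (0).
model_a = {0: np.array([[0.9, 0.1], [0.5, 0.5]])}
model_b = {0: np.array([[0.2, 0.8], [0.5, 0.5]])}

prior = np.array([0.5, 0.5])
posterior = update_belief(prior, [model_a, model_b], state=0, action=0, next_state=1)
print(posterior)
```

With a uniform prior, observing the transition from state 0 to state 1 shifts the belief to 1/9 for the first model and 8/9 for the second, because that transition is eight times more likely under the second model. A classification strategy in the sense described above would choose actions that make such updates as informative as possible within the cost bound and time horizon.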