In this paper we propose a mathematical learning model for a stochastic automaton simulating the
behaviour of a predator operating in a random environment occupied by two types of prey:
palatable mimics and unpalatable models. Specifically, a well-known linear reinforcement learning
algorithm is used to update the probabilities of the two actions, eat prey or ignore prey, at every
random encounter. Each action elicits a probabilistic response from the environment that can be
either favourable or unfavourable. We analyse both fixed and varying stochastic responses for the
system. The basic concept of mimicry is defined and a short review of relevant previous approaches in
the literature is given. Finally, the conditions for continuous predator performance improvement are
explicitly formulated and precise definitions of predatory efficiency and mimicry efficiency are
also provided.
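
As an illustration of the kind of update rule described above, the following is a minimal sketch of a two-action linear reward-penalty scheme. It is not taken from the paper: the choice of the reward-penalty variant, the function names update and simulate, the learning rates a and b, the encounter probability p_model, and the rule deciding which responses count as favourable are all assumptions made for this example. The sketch covers only the fixed-response case.

```python
import random


def update(p_eat, action, favourable, a=0.1, b=0.1):
    """Return the new probability of 'eat' after one encounter.

    Assumed linear reward-penalty rule: the chosen action's probability
    moves towards 1 at rate a after a favourable response and shrinks by
    the factor (1 - b) after an unfavourable one. Setting b = 0 gives
    the reward-inaction variant.
    """
    p_chosen = p_eat if action == "eat" else 1.0 - p_eat
    if favourable:
        p_chosen = p_chosen + a * (1.0 - p_chosen)
    else:
        p_chosen = (1.0 - b) * p_chosen
    # With two actions the other probability is the complement.
    return p_chosen if action == "eat" else 1.0 - p_chosen


def simulate(steps, p_model=0.5, a=0.05, b=0.05, seed=0):
    """Run a hypothetical predator through a number of random encounters."""
    rng = random.Random(seed)
    p_eat = 0.5
    for _ in range(steps):
        prey_is_model = rng.random() < p_model  # unpalatable model?
        action = "eat" if rng.random() < p_eat else "ignore"
        # Assumed deterministic response: eating a mimic or ignoring a
        # model is favourable; the other two outcomes are unfavourable.
        favourable = (action == "eat") != prey_is_model
        p_eat = update(p_eat, action, favourable, a, b)
    return p_eat
```

For example, simulate(1000, p_model=0.8) runs the predator in an environment where, under these assumptions, four encounters in five involve an unpalatable model, and returns the final probability of the eat action.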