Many agents need to learn to operate in dynamic environments characterized by occasional but significant changes. It is advantageous for such agents to be able to selectively retain knowledge that remains appropriate while revising knowledge made obsolete by a change in environmental conditions. Furthermore, it may be advantageous for agents to recognize when they revisit previously experienced environmental conditions, and to revert to the knowledge state learned under those conditions. Many current function approximation techniques, while powerful in their generality, do not allow for such retention because they do not explicitly relate domain knowledge to value estimation. We describe a technique, called hierarchical judgement composition, that does specify domain knowledge in the form of predictions about future events, and associates it with the intermediate representations used by the mechanism for generating state abstractions. Preliminary experimental results in the domain of turn-based strategy game playing show promise with respect to the desired characteristics.