SMART (Stochastic Model Acquisition with ReinforcemenT) learning agents: A preliminary report
We present a framework for building agents that learn using SMART, a system that combines stochastic model acquisition with reinforcement learning to enable an agent to model its environment through experience and subsequently form action selection policies using the acquired model. We extend an existing algorithm for automatic creation of stochastic STRIPS operators [9] as a preliminary method of environment modelling. We then define the process of generating future states from these operators and an initial state, and finally show how the agent can use the generated states to form a policy with a standard reinforcement learning algorithm. The potential of SMART is illustrated using the well-known predator-prey scenario. Results of applying SMART to this environment and directions for future work are discussed.
Neural networks for computer virus recognition
We have developed a neural network for generic detection of a particular class of computer viruses: the so-called boot sector viruses, which infect the boot sector of a floppy disk or a hard drive. This is an important and relatively tractable subproblem of generic virus detection. Only about 5% of all known viruses are boot sector viruses, yet they account for nearly 90% of all virus incidents. We have successfully deployed our neural network as a commercial product, distributing it to millions of PC users worldwide as part of the IBM AntiVirus software package. We faced several challenges in taking our neural network from a research idea to a commercial product. These included designing an appropriate input representation scheme; dealing with the scarcity of available training data; finding an appropriate trade-off point between false positives and false negatives to conform to user expectations; and making the software conform to the strict constraints on memory and speed of computation needed to run on PCs. The article discusses our methods for handling these challenges.
Convergence and divergence in standard and averaging reinforcement learning
Although tabular reinforcement learning (RL) methods have been proved to converge to an optimal policy, the combination of particular conventional reinforcement learning techniques with function approximators can lead to divergence. In this paper we show why off-policy RL methods combined with linear function approximators can lead to divergence. Furthermore, we analyze two different types of updates: standard and averaging RL updates. Although averaging RL will not diverge, we show that it can converge to wrong value functions. In our experiments we compare standard to averaging value iteration (VI) with CMACs, and the results show that for small values of the discount factor averaging VI works better, whereas for large values of the discount factor standard VI performs better, although it does not always converge.
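The divergence claim can be illustrated with a well-known two-state textbook example. This is a minimal sketch and not the construction used in the paper; the function name `run_off_policy_td`, the feature values, the step size, and the discount factors are all assumptions chosen for illustration. A single weight `w` gives V(s1) = w and V(s2) = 2w, every reward is 0 (so the true values are 0), and only the transition s1 -> s2 is ever updated, which mimics an off-policy sampling distribution. Each standard TD(0) update then multiplies `w` by 1 + alpha * (2 * gamma - 1), so in this toy setting the weight grows without bound whenever gamma > 0.5.

```python
def run_off_policy_td(gamma, alpha=0.1, steps=200, w0=1.0):
    """Repeatedly apply the TD(0) update for the single transition s1 -> s2.

    Hypothetical toy setup (not taken from the paper):
      phi(s1) = 1 and phi(s2) = 2, so V(s1) = w and V(s2) = 2 * w.
      The reward is always 0, so the true value of both states is 0.
      Only s1 -> s2 is sampled, i.e. s2 is never updated itself.
    Returns the list of weights after each update, starting with w0.
    """
    phi_s1, phi_s2 = 1.0, 2.0
    reward = 0.0
    w = w0
    history = [w]
    for _ in range(steps):
        # TD error for s1 -> s2 under the linear value estimate.
        td_error = reward + gamma * phi_s2 * w - phi_s1 * w
        # Semi-gradient update: the gradient of V(s1) w.r.t. w is phi(s1).
        w += alpha * td_error * phi_s1
        history.append(w)
    return history


if __name__ == "__main__":
    for gamma in (0.4, 0.9):
        final_w = run_off_policy_td(gamma)[-1]
        print(f"gamma={gamma}: final weight {final_w:.4g}")
```

With a small discount factor the weight shrinks toward the correct value 0, while with a large one it moves away from it at every step. The averaging updates analyzed in the paper are not reproduced here.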