
    Neural networks for computer virus recognition

    We have developed a neural network for generic detection of a particular class of computer viruses: the so-called boot sector viruses, which infect the boot sector of a floppy disk or a hard drive. This is an important and relatively tractable subproblem of generic virus detection. Only about 5% of all known viruses are boot sector viruses, yet they account for nearly 90% of all virus incidents. We have successfully deployed our neural network as a commercial product, distributing it to millions of PC users worldwide as part of the IBM AntiVirus software package. We faced several challenges in taking our neural network from a research idea to a commercial product. These included designing an appropriate input representation scheme; dealing with the scarcity of available training data; finding a trade-off between false positives and false negatives that meets user expectations; and making the software satisfy the strict constraints on memory and computation speed needed to run on PCs. The article discusses our methods for handling these challenges.

    Reinforcement Learning with Echo State Networks


    Convergence and divergence in standard and averaging reinforcement learning

    Although tabular reinforcement learning (RL) methods have been proved to converge to an optimal policy, the combination of particular conventional reinforcement learning techniques with function approximators can lead to divergence. In this paper we show why off-policy RL methods combined with linear function approximators can lead to divergence. Furthermore, we analyze two different types of updates: standard and averaging RL updates. Although averaging RL will not diverge, we show that it can converge to wrong value functions. In our experiments we compare standard to averaging value iteration (VI) with CMACs. The results show that averaging VI works better for small values of the discount factor, whereas standard VI performs better for large values of the discount factor, although it does not always converge.
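    The divergence mentioned in this abstract can be illustrated with a well-known two-state example. The sketch below is not taken from the paper: the features (1 and 2), the zero reward, the step size, and the function name `off_policy_td_weights` are all assumptions chosen for illustration. Only the transition from the first state to the second is ever updated, which is what makes the update distribution off-policy.

```python
def off_policy_td_weights(gamma, alpha=0.1, w0=1.0, steps=50):
    """Track a single weight w under repeated TD(0) updates of one transition.

    Values are linear in one feature: V(s) = w * phi(s).
    All constants here are hypothetical and chosen only for illustration.
    """
    phi_s1, phi_s2 = 1.0, 2.0  # assumed features of the two states
    reward = 0.0               # assumed reward on the transition s1 -> s2
    w = w0
    history = [w]
    for _ in range(steps):
        # TD error for the transition s1 -> s2 under the current weight
        td_error = reward + gamma * w * phi_s2 - w * phi_s1
        # Gradient step on V(s1) only; s2 is never updated (off-policy)
        w += alpha * td_error * phi_s1
        history.append(w)
    return history


if __name__ == "__main__":
    for gamma in (0.4, 0.9):
        final_w = off_policy_td_weights(gamma)[-1]
        print(f"gamma={gamma}: final |w| = {abs(final_w):.4f}")
```

    Each update multiplies w by 1 + alpha * (2 * gamma - 1), so under these assumptions the weight grows without bound whenever gamma exceeds 0.5 and shrinks toward zero otherwise.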