10 research outputs found

    Bootstrapping a DQN Replay Memory with Synthetic Experiences

    An important component of many Deep Reinforcement Learning algorithms is the Experience Replay, which serves as a storage mechanism, or memory, for past experiences. These experiences are used for training and help the agent converge stably towards an optimal trajectory through the problem space. The classic Experience Replay, however, uses only the experiences the agent actually made, although the stored samples hold additional knowledge about the problem that can be extracted. We present an algorithm that creates synthetic experiences in a non-deterministic discrete environment to assist the learner. The Interpolated Experience Replay is evaluated on the FrozenLake environment, and we show that it can help the agent learn faster and even better than the classic version.
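
    A minimal sketch of the replay-memory idea described above, assuming a uniform sampling buffer that holds real and synthetic transitions side by side; class and method names are illustrative, not the authors' implementation:

    # Replay memory holding both real and synthetic transitions (illustrative sketch).
    import random
    from collections import deque

    class MixedReplayMemory:
        def __init__(self, capacity=10000):
            self.buffer = deque(maxlen=capacity)

        def store_real(self, state, action, reward, next_state, done):
            # Transition actually experienced by the agent.
            self.buffer.append((state, action, reward, next_state, done, False))

        def store_synthetic(self, state, action, reward, next_state, done):
            # Transition created from knowledge extracted from stored samples.
            self.buffer.append((state, action, reward, next_state, done, True))

        def sample(self, batch_size=32):
            # Uniform sampling over real and synthetic experiences alike.
            return random.sample(self.buffer, min(batch_size, len(self.buffer)))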

    Averaging rewards as a first approach towards interpolated experience replay

    Reinforcement learning, and especially deep reinforcement learning, is a research area that is receiving more and more attention. The mathematical method of interpolation is used to obtain information about data points in a region where only neighboring samples are known, and thus seems like a natural extension of the experience replay, which is a major component of a variety of deep reinforcement learning methods. Interpolated experiences stored in the experience replay could speed up learning in the early phase and reduce the overall amount of exploration needed. A first approach that averages rewards in a setting with an unstable transition function and very little exploration is implemented and shows promising results that encourage further investigation.
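
    A small sketch of the reward-averaging idea, under the assumption that rewards are simply averaged per discrete state-action pair; the paper's exact procedure may differ:

    # Interpolate a reward as the mean of all rewards observed for a state-action pair.
    from collections import defaultdict

    observed_rewards = defaultdict(list)   # (state, action) -> list of rewards

    def record(state, action, reward):
        observed_rewards[(state, action)].append(reward)

    def interpolated_reward(state, action):
        rewards = observed_rewards[(state, action)]
        return sum(rewards) / len(rewards) if rewards else 0.0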

    Synthetic experiences for accelerating DQN performance in discrete non-deterministic environments

    State-of-the-art Deep Reinforcement Learning algorithms such as DQN and DDPG use a replay buffer concept called Experience Replay. By default it contains only the experiences that have been gathered over the runtime. We propose a method called Interpolated Experience Replay that uses the stored (real) transitions to create synthetic ones that assist the learner. In this first approach, we limit ourselves to discrete and non-deterministic environments and use a simple equally weighted average of the reward in combination with the observed follow-up states. We demonstrate a significantly improved overall mean performance in comparison to a DQN with vanilla Experience Replay on the discrete and non-deterministic FrozenLake8x8-v0 environment.
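
    A hedged sketch of the described construction, assuming one synthetic transition is formed per observed follow-up state using the equally weighted reward average; helper names are illustrative:

    # Build synthetic transitions for a discrete, non-deterministic environment.
    from collections import defaultdict

    seen = defaultdict(list)   # (state, action) -> list of (reward, next_state, done)

    def add_real_transition(state, action, reward, next_state, done):
        seen[(state, action)].append((reward, next_state, done))

    def synthetic_transitions(state, action):
        samples = seen[(state, action)]
        if not samples:
            return []
        # Equally weighted average of all rewards observed for this pair.
        avg_reward = sum(r for r, _, _ in samples) / len(samples)
        # One synthetic transition per distinct observed follow-up state.
        followups = {(s, d) for _, s, d in samples}
        return [(state, action, avg_reward, s, d) for s, d in followups]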

    D2.2 Uncertainty-aware sensor fusion in sensor networks

    Assigning uncertainties to measurement values is a common process for single sensors, but the procedure grows in complexity for sensor networks. In such networks, measured values are often processed further, and the uncertainty of the resulting virtual values must be evaluated. A simple example is the fusion of homogeneous values, where faulty or drifting sensors can corrupt the virtual value. We introduce a method from the field of key comparison into the domain of sensor fusion. The method is evaluated in three different scenarios within an agent framework.
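
    An illustrative sketch of fusing homogeneous measurements into a virtual value with an associated uncertainty, using a standard inverse-variance weighted mean; the key-comparison-based method of the deliverable may differ in detail:

    # Fuse homogeneous sensor readings into one virtual value plus uncertainty.
    import math

    def fuse(values, uncertainties):
        # values: measured values; uncertainties: standard uncertainties u_i.
        weights = [1.0 / (u ** 2) for u in uncertainties]
        total = sum(weights)
        virtual_value = sum(w * v for w, v in zip(weights, values)) / total
        virtual_uncertainty = math.sqrt(1.0 / total)
        return virtual_value, virtual_uncertainty

    # Example: three temperature sensors, one drifting; its larger uncertainty
    # limits how much it can harm the fused (virtual) value.
    print(fuse([20.1, 20.2, 21.5], [0.1, 0.1, 0.5]))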