16 research outputs found

    Variational Bayesian Filtering


    Informed Sampling of Prioritized Experience Replay

    Experience replay plays an essential role as an information-generating mechanism in reinforcement learning systems that use neural networks as function approximators. It enables artificial learning agents to store their past experiences in a sliding-window buffer, effectively recycling them during the continual re-training of a neural network. This intermediate caching step gives an agent the opportunity to optimize the order in which experiences are sampled from the buffer. Doing so may improve on the default approach, i.e., stochastic prioritization based on temporal-difference error (TD-error), which focuses on experiences that carry more temporal-difference surprise for the approximator. Informed prioritization is proposed: a method that relies on fast online confidence estimates of the approximator's predictions to exploit the benefits of TD-error prioritization only when the approximator's prediction confidence about the selected experiences increases. The presented informed-stochastic prioritization method of replay buffer sampling, implemented as part of the standard Deep Q-learning algorithm, outperformed vanilla stochastic prioritization based on TD-error in 41 out of 54 trialed Atari games.
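
The baseline this abstract compares against, stochastic prioritization based on TD-error over a sliding-window buffer, can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the class name `PrioritizedReplayBuffer`, the exponent `alpha`, and the offset `eps` are assumptions borrowed from the common proportional-prioritization formulation, and the confidence-based "informed" gating described in the abstract is not reproduced because the listing gives no details of it.

```python
import random
from collections import deque


class PrioritizedReplayBuffer:
    """Sliding-window buffer with proportional TD-error prioritization.

    Hypothetical sketch: names and default values are assumptions,
    not taken from the paper.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        # deque(maxlen=...) drops the oldest entry once capacity is reached.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.alpha = alpha  # 0 gives uniform sampling, 1 fully proportional
        self.eps = eps      # keeps zero-error transitions sampleable

    def __len__(self):
        return len(self.transitions)

    def _priority(self, td_error):
        return (abs(td_error) + self.eps) ** self.alpha

    def add(self, transition, td_error):
        self.transitions.append(transition)
        self.priorities.append(self._priority(td_error))

    def sample(self, batch_size, rng=random):
        # Draw indices (with replacement) proportionally to stored priority.
        indices = rng.choices(range(len(self.transitions)),
                              weights=self.priorities, k=batch_size)
        return indices, [self.transitions[i] for i in indices]

    def update(self, indices, td_errors):
        # Refresh priorities after re-training on a sampled batch.
        # Indices are only valid if nothing was added since sample().
        for i, err in zip(indices, td_errors):
            self.priorities[i] = self._priority(err)
```

In a training loop one would call `sample`, compute fresh TD-errors for the returned transitions, and pass them back through `update` before adding new experience. A production version would typically use a sum-tree for faster sampling and importance-sampling weights to correct the induced bias; both are omitted here for brevity.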

    Design of LVCSR Decoder for Czech Language

    In this paper we present a Czech speaker-independent large vocabulary continuous speech recognition (LVCSR) system based on lexical trees and a bigram language model. The lexical trees use triphones for both in-word and cross-word context. Generating the cross-word context dynamically saves a significant amount of memory. A telephone speech and text corpus was used to evaluate the system's accuracy and speed, and to compare our recognizer with the standard HTK recognizer. The comparison results are shown.

    Sensorless control of PMSM using an adaptively tuned SCKF
