2,471 research outputs found

    Exploring Restart Distributions

    Get PDF
    We consider the generic approach of using an experience memory to help exploration by adapting a restart distribution. That is, given the capacity to reset the state with those corresponding to the agent's past observations, we help exploration by promoting faster state-space coverage via restarting the agent from a more diverse set of initial states, as well as allowing it to restart in states associated with significant past experiences. This approach is compatible with both on-policy and off-policy methods. However, a caveat is that altering the distribution of initial states could change the optimal policies when searching within a restricted class of policies. To reduce this unsought learning bias, we evaluate our approach in deep reinforcement learning which benefits from the high representational capacity of deep neural networks. We instantiate three variants of our approach, each inspired by an idea in the context of experience replay. Using these variants, we show that performance gains can be achieved, especially in hard exploration problems.Comment: RLDM 201

    CSL model checking of Deterministic and Stochastic Petri Nets

    Get PDF
    Deterministic and Stochastic Petri Nets (DSPNs) are a widely used high-level formalism for modeling discrete-event systems where events may occur either without consuming time, after a deterministic time, or after an exponentially distributed time. The underlying process dened by DSPNs, under certain restrictions, corresponds to a class of Markov Regenerative Stochastic Processes (MRGP). In this paper, we investigate the use of CSL (Continuous Stochastic Logic) to express probabilistic properties, such a time-bounded until and time-bounded next, at the DSPN level. The verication of such properties requires the solution of the steady-state and transient probabilities of the underlying MRGP. We also address a number of semantic issues regarding the application of CSL on MRGP and provide numerical model checking algorithms for this logic. A prototype model checker, based on SPNica, is also described

    Efficient size estimation and impossibility of termination in uniform dense population protocols

    Full text link
    We study uniform population protocols: networks of anonymous agents whose pairwise interactions are chosen at random, where each agent uses an identical transition algorithm that does not depend on the population size nn. Many existing polylog(n)(n) time protocols for leader election and majority computation are nonuniform: to operate correctly, they require all agents to be initialized with an approximate estimate of nn (specifically, the exact value logn\lfloor \log n \rfloor). Our first main result is a uniform protocol for calculating log(n)±O(1)\log(n) \pm O(1) with high probability in O(log2n)O(\log^2 n) time and O(log4n)O(\log^4 n) states (O(loglogn)O(\log \log n) bits of memory). The protocol is converging but not terminating: it does not signal when the estimate is close to the true value of logn\log n. If it could be made terminating, this would allow composition with protocols, such as those for leader election or majority, that require a size estimate initially, to make them uniform (though with a small probability of failure). We do show how our main protocol can be indirectly composed with others in a simple and elegant way, based on the leaderless phase clock, demonstrating that those protocols can in fact be made uniform. However, our second main result implies that the protocol cannot be made terminating, a consequence of a much stronger result: a uniform protocol for any task requiring more than constant time cannot be terminating even with probability bounded above 0, if infinitely many initial configurations are dense: any state present initially occupies Ω(n)\Omega(n) agents. (In particular, no leader is allowed.) Crucially, the result holds no matter the memory or time permitted. Finally, we show that with an initial leader, our size-estimation protocol can be made terminating with high probability, with the same asymptotic time and space bounds.Comment: Using leaderless phase cloc

    Fault-tolerant computer study

    Get PDF
    A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed

    Application of the sequence planner control framework to an intelligent automation system with a focus on error handling

    Get PDF
    Future automation systems are likely to include devices with a varying degree of autonomy, as well as advanced algorithms for perception and control. Human operators will be expected to work side by side with both collaborative robots performing assembly tasks and roaming robots that handle material transport. To maintain the flexibility provided by human operators when introducing such robots, these autonomous robots need to be intelligently coordinated, i.e., they need to be supported by an intelligent automation system. One challenge in developing intelligent automation systems is handling the large amount of possible error situations that can arise due to the volatile and sometimes unpredictable nature of the environment. Sequence Planner is a control framework that supports the development of intelligent automation systems. This paper describes Sequence Planner and tests its ability to handle errors that arise during execution of an intelligent automation system. An automation system, developed using Sequence Planner, is subjected to a number of scenarios where errors occur. The error scenarios and experimental results are presented along with a discussion of the experience gained in trying to achieve robust intelligent automation
    corecore