2,471 research outputs found
Exploring Restart Distributions
We consider the generic approach of using an experience memory to help
exploration by adapting a restart distribution. That is, given the capacity to
reset the state with those corresponding to the agent's past observations, we
help exploration by promoting faster state-space coverage via restarting the
agent from a more diverse set of initial states, as well as allowing it to
restart in states associated with significant past experiences. This approach
is compatible with both on-policy and off-policy methods. However, a caveat is
that altering the distribution of initial states could change the optimal
policies when searching within a restricted class of policies. To reduce this
unsought learning bias, we evaluate our approach in deep reinforcement learning
which benefits from the high representational capacity of deep neural networks.
We instantiate three variants of our approach, each inspired by an idea in the
context of experience replay. Using these variants, we show that performance
gains can be achieved, especially in hard exploration problems.Comment: RLDM 201
CSL model checking of Deterministic and Stochastic Petri Nets
Deterministic and Stochastic Petri Nets (DSPNs) are a widely used high-level formalism for modeling discrete-event systems where events may occur either without consuming time, after a deterministic time, or after an exponentially distributed time. The underlying process dened by DSPNs, under certain restrictions, corresponds to a class of Markov Regenerative Stochastic Processes (MRGP). In this paper, we investigate the use of CSL (Continuous Stochastic Logic) to express probabilistic properties, such a time-bounded until and time-bounded next, at the DSPN level. The verication of such properties requires the solution of the steady-state and transient probabilities of the underlying MRGP. We also address a number of semantic issues regarding the application of CSL on MRGP and provide numerical model checking algorithms for this logic. A prototype model checker, based on SPNica, is also described
Efficient size estimation and impossibility of termination in uniform dense population protocols
We study uniform population protocols: networks of anonymous agents whose
pairwise interactions are chosen at random, where each agent uses an identical
transition algorithm that does not depend on the population size . Many
existing polylog time protocols for leader election and majority
computation are nonuniform: to operate correctly, they require all agents to be
initialized with an approximate estimate of (specifically, the exact value
). Our first main result is a uniform protocol for
calculating with high probability in time and
states ( bits of memory). The protocol is
converging but not terminating: it does not signal when the estimate is close
to the true value of . If it could be made terminating, this would
allow composition with protocols, such as those for leader election or
majority, that require a size estimate initially, to make them uniform (though
with a small probability of failure). We do show how our main protocol can be
indirectly composed with others in a simple and elegant way, based on the
leaderless phase clock, demonstrating that those protocols can in fact be made
uniform. However, our second main result implies that the protocol cannot be
made terminating, a consequence of a much stronger result: a uniform protocol
for any task requiring more than constant time cannot be terminating even with
probability bounded above 0, if infinitely many initial configurations are
dense: any state present initially occupies agents. (In particular,
no leader is allowed.) Crucially, the result holds no matter the memory or time
permitted. Finally, we show that with an initial leader, our size-estimation
protocol can be made terminating with high probability, with the same
asymptotic time and space bounds.Comment: Using leaderless phase cloc
Fault-tolerant computer study
A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed
Application of the sequence planner control framework to an intelligent automation system with a focus on error handling
Future automation systems are likely to include devices with a varying degree of autonomy, as well as advanced algorithms for perception and control. Human operators will be expected to work side by side with both collaborative robots performing assembly tasks and roaming robots that handle material transport. To maintain the flexibility provided by human operators when introducing such robots, these autonomous robots need to be intelligently coordinated, i.e., they need to be supported by an intelligent automation system. One challenge in developing intelligent automation systems is handling the large amount of possible error situations that can arise due to the volatile and sometimes unpredictable nature of the environment. Sequence Planner is a control framework that supports the development of intelligent automation systems. This paper describes Sequence Planner and tests its ability to handle errors that arise during execution of an intelligent automation system. An automation system, developed using Sequence Planner, is subjected to a number of scenarios where errors occur. The error scenarios and experimental results are presented along with a discussion of the experience gained in trying to achieve robust intelligent automation
- …