4,977 research outputs found
Feature Reinforcement Learning: Part I: Unstructured MDPs
General-purpose, intelligent, learning agents cycle through sequences of
observations, actions, and rewards that are complex, uncertain, unknown, and
non-Markovian. On the other hand, reinforcement learning is well-developed for
small finite state Markov decision processes (MDPs). Up to now, extracting the
right state representations out of bare observations, that is, reducing the
general agent setup to the MDP framework, is an art that involves significant
effort by designers. The primary goal of this work is to automate the reduction
process and thereby significantly expand the scope of many existing
reinforcement learning algorithms and the agents that employ them. Before we
can think of mechanizing this search for suitable MDPs, we need a formal
objective criterion. The main contribution of this article is to develop such a
criterion. I also integrate the various parts into one learning algorithm.
Extensions to more realistic dynamic Bayesian networks are developed in Part
II. The role of POMDPs is also considered there.Comment: 24 LaTeX pages, 5 diagram
New developments in the theory of Groebner bases and applications to formal verification
We present foundational work on standard bases over rings and on Boolean
Groebner bases in the framework of Boolean functions. The research was
motivated by our collaboration with electrical engineers and computer
scientists on problems arising from formal verification of digital circuits. In
fact, algebraic modelling of formal verification problems is developed on the
word-level as well as on the bit-level. The word-level model leads to Groebner
basis in the polynomial ring over Z/2n while the bit-level model leads to
Boolean Groebner bases. In addition to the theoretical foundations of both
approaches, the algorithms have been implemented. Using these implementations
we show that special data structures and the exploitation of symmetries make
Groebner bases competitive to state-of-the-art tools from formal verification
but having the advantage of being systematic and more flexible.Comment: 44 pages, 8 figures, submitted to the Special Issue of the Journal of
Pure and Applied Algebr
Solving Set Constraint Satisfaction Problems using ROBDDs
In this paper we present a new approach to modeling finite set domain
constraint problems using Reduced Ordered Binary Decision Diagrams (ROBDDs). We
show that it is possible to construct an efficient set domain propagator which
compactly represents many set domains and set constraints using ROBDDs. We
demonstrate that the ROBDD-based approach provides unprecedented flexibility in
modeling constraint satisfaction problems, leading to performance improvements.
We also show that the ROBDD-based modeling approach can be extended to the
modeling of integer and multiset constraint problems in a straightforward
manner. Since domain propagation is not always practical, we also show how to
incorporate less strict consistency notions into the ROBDD framework, such as
set bounds, cardinality bounds and lexicographic bounds consistency. Finally,
we present experimental results that demonstrate the ROBDD-based solver
performs better than various more conventional constraint solvers on several
standard set constraint problems
Simulating chemistry efficiently on fault-tolerant quantum computers
Quantum computers can in principle simulate quantum physics exponentially
faster than their classical counterparts, but some technical hurdles remain.
Here we consider methods to make proposed chemical simulation algorithms
computationally fast on fault-tolerant quantum computers in the circuit model.
Fault tolerance constrains the choice of available gates, so that arbitrary
gates required for a simulation algorithm must be constructed from sequences of
fundamental operations. We examine techniques for constructing arbitrary gates
which perform substantially faster than circuits based on the conventional
Solovay-Kitaev algorithm [C.M. Dawson and M.A. Nielsen, \emph{Quantum Inf.
Comput.}, \textbf{6}:81, 2006]. For a given approximation error ,
arbitrary single-qubit gates can be produced fault-tolerantly and using a
limited set of gates in time which is or ; with sufficient parallel preparation of ancillas, constant average
depth is possible using a method we call programmable ancilla rotations.
Moreover, we construct and analyze efficient implementations of first- and
second-quantized simulation algorithms using the fault-tolerant arbitrary gates
and other techniques, such as implementing various subroutines in constant
time. A specific example we analyze is the ground-state energy calculation for
Lithium hydride.Comment: 33 pages, 18 figure
Machine learning and its applications in reliability analysis systems
In this thesis, we are interested in exploring some aspects of Machine Learning (ML) and its application in the Reliability Analysis systems (RAs). We begin by investigating some ML paradigms and their- techniques, go on to discuss the possible applications of ML in improving RAs performance, and lastly give guidelines of the architecture of learning RAs. Our survey of ML covers both levels of Neural Network learning and Symbolic learning. In symbolic process learning, five types of learning and their applications are discussed: rote learning, learning from instruction, learning from analogy, learning from examples, and learning from observation and discovery. The Reliability Analysis systems (RAs) presented in this thesis are mainly designed for maintaining plant safety supported by two functions: risk analysis function, i.e., failure mode effect analysis (FMEA) ; and diagnosis function, i.e., real-time fault location (RTFL). Three approaches have been discussed in creating the RAs. According to the result of our survey, we suggest currently the best design of RAs is to embed model-based RAs, i.e., MORA (as software) in a neural network based computer system (as hardware). However, there are still some improvement which can be made through the applications of Machine Learning. By implanting the 'learning element', the MORA will become learning MORA (La MORA) system, a learning Reliability Analysis system with the power of automatic knowledge acquisition and inconsistency checking, and more. To conclude our thesis, we propose an architecture of La MORA
Selected Topics in Network Optimization: Aligning Binary Decision Diagrams for a Facility Location Problem and a Search Method for Dynamic Shortest Path Interdiction
This work deals with three different combinatorial optimization problems: minimizing the total size of a pair of binary decision diagrams (BDDs) under a certain structural property, a variant of the facility location problem, and a dynamic version of the Shortest-Path Interdiction (DSPI) problem. However, these problems all have the following core idea in common: They all stem from representing an optimization problem as a decision diagram. We begin from cases in which such a diagram representation of reasonable size might exist, but finding a small diagram is difficult to achieve. The first problem develops a heuristic for enforcing a structural property for a collection of BDDs, which allows them to be merged into a single one efficiently. In the second problem, we consider a specific combinatorial problem that allows for a natural representation by a pair of BDDs. We use the previous result and ideas developed earlier in the literature to reformulate this problem as a linear program over a single BDD. This approach enables us to obtain sensitivity information, while often enjoying runtimes comparable to a mixed integer program solved with a commercial solver, after we pay the computational overhead of building the diagram (e.g., when re-solving the problem using different costs, but the same graph topology). In the last part, we examine DSPI, for which building the full decision diagram is generally impractical. We formalize the concept of a game tree for the DSPI and design a heuristic based on the idea of building only selected parts of this exponentially-sized decision diagram (which is not binary any more). We use a Monte Carlo Tree Search framework to establish policies that are near optimal. To mitigate the size of the game tree, we leverage previously derived bounds for the DSPI and employ an alpha–beta pruning technique for minimax optimization. We highlight the practicality of these ideas in a series of numerical experiments
- …