An extensive English language bibliography on graph theory and its applications
Bibliography on graph theory and its application
A Review of Off-Policy Evaluation in Reinforcement Learning
Reinforcement learning (RL) is one of the most vibrant research frontiers in
machine learning and has been recently applied to solve a number of challenging
problems. In this paper, we primarily focus on off-policy evaluation (OPE), one
of the most fundamental topics in RL. In recent years, a number of OPE methods
have been developed in the statistics and computer science literature. We
provide a discussion on the efficiency bound of OPE, some of the existing
state-of-the-art OPE methods, their statistical properties and some other
related research directions that are currently actively explored.
Comment: Still under revisio
Why Functionalism Is a Form of ‘Token-Dualism’
We present a novel reductive theory of type-identity physicalism (called Flat Physicalism), which is inspired by the foundations of statistical mechanics as a general theory of natural kinds. We show that all the claims mounted against type-identity physicalism in the literature don’t apply to Flat Physicalism, and moreover that this reductive theory solves many of the problems faced by the various non-reductive approaches including functionalism. In particular, we show that Flat Physicalism can account for the (alleged) appearance of multiple realizability in the special sciences, and that it gives a novel account of the genuine autonomy of the kinds and laws in the special sciences. We further show that the thesis of genuine multiple realization, which is compatible with all forms of non-reductive approaches including functionalism, implies what we call token-dualism; namely the idea that in every token (that partakes in this multiple realization) there are non-physical facts, which may either be non-physical properties or some non-physical substance. In other words, we prove that non-reductive kinds necessarily assume non-reductive tokens, i.e., token-dualism. Finally, we show that all forms of non-reductive approaches including functionalism imply a literally multi-leveled structure of reality.
Formal Methods for Autonomous Systems
Formal methods refer to rigorous, mathematical approaches to system
development and have played a key role in establishing the correctness of
safety-critical systems. The main building blocks of formal methods are models
and specifications, which are analogous to behaviors and requirements in system
design and give us the means to verify and synthesize system behaviors with
formal guarantees.
This monograph provides a survey of the current state of the art on
applications of formal methods in the autonomous systems domain. We consider
correct-by-construction synthesis under various formulations, including closed
systems, reactive, and probabilistic settings. Beyond synthesizing systems in
known environments, we address the concept of uncertainty and bound the
behavior of systems that employ learning using formal methods. Further, we
examine the synthesis of systems with monitoring, a mitigation technique for
ensuring that once a system deviates from expected behavior, it knows a way of
returning to normalcy. We also show how to overcome some limitations of formal
methods themselves with learning. We conclude with future directions for formal
methods in reinforcement learning, uncertainty, privacy, explainability of
formal methods, and regulation and certification.
Constructive formal methods and protocol standardization
This research is part of the NWO project "Improving the Quality of Protocol Standards". In this project we have cooperated with industrial standardization committees that are developing protocol standards. Thus we have contributed to these international standards, and we have generated relevant research questions in the field of formal methods. The first part of this thesis is related to the ISO/IEEE 1073.2 standard, which addresses medical device communication. The protocols in this standard were developed from a couple of MSC scenarios that describe typical intended behavior. Upon synthesizing a protocol from such scenarios, interference between these scenarios may be introduced, which leads to undesired behaviors. This is called the realizability problem. To address the realizability problem, we have introduced a formal framework that is based on partial orders. In this way the problem that causes the interference can be clearly pointed out. We have provided a complete characterization of realizability criteria that can be used to determine whether interference problems are to be expected. Moreover, we have provided a new constructive approach to solve the undesired interference in practical situations. These techniques have been used to improve the protocol standard under consideration. The second part of this thesis is related to the IEEE 1394.1-2004 standard, which addresses High Performance Serial Bus Bridges. This is an extension of the IEEE 1394-1995 standard, also known as FireWire. The development of the distributed spanning tree algorithm turned out to be a serious problem. To address this problem, we have first developed and proposed a much simpler algorithm. We have also studied the algorithm proposed by the developers of the standard, namely by formally reconstructing a version of it, starting from the specification. 
Such a constructive approach to verification and analysis uses mathematical techniques, or formal methods, to reveal the essential mechanisms that play a role in the algorithm. We have shown the need for different levels of abstraction, and we have illustrated that the algorithm is in fact distributed at two levels. These techniques are usually applied manually, but we have also developed an approach to automate parts of it using state-of-the-art theorem provers.
Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation
Offline reinforcement learning (RL) concerns pursuing an optimal policy for
sequential decision-making from a pre-collected dataset, without further
interaction with the environment. Recent theoretical progress has focused on
developing sample-efficient offline RL algorithms with various relaxed
assumptions on data coverage and function approximators, especially to handle
the case with excessively large state-action spaces. Among them, the framework
based on the linear-programming (LP) reformulation of Markov decision processes
has shown promise: it enables sample-efficient offline RL with function
approximation, under only partial data coverage and realizability assumptions
on the function classes, with favorable computational tractability. In this
work, we revisit the LP framework for offline RL, and advance the existing
results in several aspects, relaxing certain assumptions and achieving optimal
statistical rates in terms of sample size. Our key enabler is to introduce
proper constraints in the reformulation, instead of using any regularization as
in the literature, sometimes also with careful choices of the function classes
and initial state distributions. We hope our insights further advocate the
study of the LP framework, as well as the induced primal-dual minimax
optimization, in offline RL.
Comment: 30 page