406 research outputs found
Verifiable and Compositional Reinforcement Learning Systems
We propose a novel framework for verifiable and compositional reinforcement
learning (RL) in which a collection of RL sub-systems, each of which learns to
accomplish a separate sub-task, is composed to achieve an overall task. The
framework consists of a high-level model, represented as a parametric Markov
decision process (pMDP), which is used to plan and to analyze compositions of
sub-systems, and of the collection of low-level sub-systems themselves. By
defining interfaces between the sub-systems, the framework enables automatic
decomposition of task specifications (e.g., reach a target set of states with
probability at least 0.95) into individual sub-task specifications (i.e.,
achieve the sub-system's exit conditions with at least some minimum
probability, given that its entry conditions are met). This in turn allows for
the independent training and testing of the sub-systems; if they each learn a
policy satisfying the appropriate sub-task specification, then their
composition is guaranteed to satisfy the overall task specification.
Conversely, if the sub-task specifications cannot all be satisfied by the
learned policies, we present a method, formulated as the problem of finding an
optimal set of parameters in the pMDP, to automatically update the sub-task
specifications to account for the observed shortcomings. The result is an
iterative procedure for defining sub-task specifications, and for training the
sub-systems to meet them. As an additional benefit, this procedure allows for
particularly challenging or important components of an overall task to be
determined automatically, and focused on, during training. Experimental results
demonstrate the presented framework's novel capabilities.
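The compositional guarantee described in this abstract can be illustrated with a toy calculation: if each sub-system independently meets its sub-task specification with some minimum probability, a purely sequential composition meets the overall reachability specification whenever the product of those bounds clears the overall threshold. The function below is an illustrative sketch of this check under that sequential-chain assumption, not the paper's pMDP-based planner:

```python
# Sketch: lower-bound the success probability of a sequential composition
# of sub-systems, assuming each sub-system's entry condition is exactly
# the previous sub-system's exit condition (illustrative assumption).

def composition_satisfies(sub_task_probs, overall_threshold):
    """sub_task_probs: minimum probability with which each sub-system
    achieves its exit condition, given its entry condition holds.
    Returns True if the product bound meets the overall specification."""
    bound = 1.0
    for p in sub_task_probs:
        bound *= p  # chain rule over the sequential composition
    return bound >= overall_threshold

# Three sub-systems verified at 0.99 each: bound 0.9703 >= 0.95.
print(composition_satisfies([0.99, 0.99, 0.99], 0.95))  # True
# Two sub-systems at 0.9 each: bound 0.81 < 0.95.
print(composition_satisfies([0.9, 0.9], 0.95))  # False
```

If the learned policies fall short of these bounds, the paper's iterative procedure redistributes the per-sub-task thresholds by optimizing over the pMDP's parameters rather than fixing them by hand.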
SOTER: A Runtime Assurance Framework for Programming Safe Robotics Systems
The recent drive towards achieving greater autonomy and intelligence in
robotics has led to high levels of complexity. Autonomous robots increasingly
depend on third-party off-the-shelf components and complex machine-learning
techniques. This trend makes it challenging to provide strong design-time
certification of correct operation.
To address these challenges, we present SOTER, a robotics programming
framework with two key components: (1) a programming language for implementing
and testing high-level reactive robotics software and (2) an integrated runtime
assurance (RTA) system that helps enable the use of uncertified components,
while still providing safety guarantees. SOTER provides language primitives to
declaratively construct an RTA module consisting of an advanced,
high-performance controller (uncertified), a safe, lower-performance controller
(certified), and the desired safety specification. The framework provides a
formal guarantee that a well-formed RTA module always satisfies the safety
specification without completely sacrificing performance: higher-performance
uncertified components are used whenever it is safe to do so. SOTER allows the
complex
robotics software stack to be constructed as a composition of RTA modules,
where each uncertified component is protected by an RTA module.
To demonstrate the efficacy of our framework, we consider a real-world
case-study of building a safe drone surveillance system. Our experiments both
in simulation and on actual drones show that the SOTER-enabled RTA ensures the
safety of the system, including when untrusted third-party components have bugs
or deviate from the desired behavior.
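The runtime-assurance pattern this abstract describes can be sketched in a few lines: a monitor lets the uncertified, high-performance controller act only while its proposed action is predicted to keep the system inside the safety specification, and otherwise hands control to the certified safe controller. All names below are illustrative, not SOTER's actual language primitives:

```python
# Sketch of the RTA switching pattern (hypothetical names, not SOTER's API):
# run the uncertified controller while its action is predicted safe,
# otherwise fall back to the certified controller.

def rta_step(state, advanced, safe, is_safe_next):
    """Return the advanced controller's action if its predicted outcome
    satisfies the safety specification, else the safe controller's action."""
    action = advanced(state)
    if is_safe_next(state, action):
        return action          # uncertified component, but verified-safe step
    return safe(state)         # certified fallback

# Toy 1-D example: the safety specification is |x| <= 10.
advanced = lambda x: 3                    # aggressive step toward the boundary
safe = lambda x: -1 if x > 0 else 1       # certified: retreat toward origin
ok = lambda x, a: abs(x + a) <= 10        # one-step safety predicate

print(rta_step(5, advanced, safe, ok))    # 3  (5 + 3 = 8 stays safe)
print(rta_step(9, advanced, safe, ok))    # -1 (9 + 3 = 12 unsafe; fallback)
```

The formal guarantee in the paper is that a well-formed module of this shape never violates the declared safety specification, regardless of how the uncertified controller behaves.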
Maximum Causal Entropy Specification Inference from Demonstrations
In many settings (e.g., robotics) demonstrations provide a natural way to
specify tasks; however, most methods for learning from demonstrations either do
not provide guarantees that the artifacts learned for the tasks, such as
rewards or policies, can be safely composed and/or do not explicitly capture
history dependencies. Motivated by this deficit, recent works have proposed
learning Boolean task specifications, a class of Boolean non-Markovian rewards
which admit well-defined composition and explicitly handle historical
dependencies. This work continues this line of research by adapting maximum
causal entropy inverse reinforcement learning to estimate the posterior
probability of a specification given a multi-set of demonstrations. The key
algorithmic insight is to leverage the extensive literature and tooling on
reduced ordered binary decision diagrams to efficiently encode a time unrolled
Markov Decision Process. This enables transforming a naive exponential-time
algorithm into a polynomial-time algorithm.
Comment: Computer Aided Verification, 202
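As a rough illustration of the inference problem (not the paper's BDD-based maximum-causal-entropy algorithm), one can score candidate Boolean specifications by how many demonstrations satisfy them and normalise the scores into a posterior under a uniform prior; `beta` below is a hypothetical sharpness parameter, and specifications are modelled simply as predicates over traces:

```python
# Toy sketch of posterior(spec | demonstrations) over candidate Boolean
# specifications, each a predicate on a full trace (hence non-Markovian).
# The actual algorithm weights traces via maximum causal entropy over a
# BDD-encoded, time-unrolled MDP; this version only shows the objective.
from math import exp

def posterior(specs, demos, beta=2.0):
    """specs: dict mapping name -> predicate over a trace (list of states).
    Returns a soft posterior proportional to exp(beta * #demos satisfied)."""
    scores = {name: exp(beta * sum(spec(t) for t in demos))
              for name, spec in specs.items()}
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}

demos = [[0, 1, 2], [0, 2], [0, 1]]
specs = {
    "eventually_2": lambda t: 2 in t,                   # 2 of 3 demos
    "always_nonneg": lambda t: all(s >= 0 for s in t),  # 3 of 3 demos
}
post = posterior(specs, demos)
print(max(post, key=post.get))  # always_nonneg
```

The naive version above re-evaluates each predicate per trace; the paper's contribution is doing the analogous computation over all traces of the unrolled MDP at once via reduced ordered binary decision diagrams.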
Towards a Service-Oriented Architecture for Production Planning and Control: A Comprehensive Review and Novel Approach
The trends of shorter product lifecycles, customized products, and volatile market environments require manufacturers to reconfigure their production increasingly frequently to maintain competitiveness and customer satisfaction. More frequent reconfigurations, however, are linked to increased effort in production planning and control (PPC). This poses a challenge for manufacturers, especially in view of demographic change and the shortage of qualified labour, since many tasks in PPC are performed manually by domain experts. Following the paradigm of software-defined manufacturing, this paper aims to enable a higher degree of automation and interoperability in PPC by applying the concepts of service-oriented architecture. As a result, production planners are empowered to orchestrate tasks in PPC without consideration of underlying implementation details. First, it is investigated how tasks in PPC can be represented as services with the aim of encapsulation and reusability. Second, a software architecture based on asset administration shells is presented that allows connection to production data sources and enables the integration and usage of such PPC services. In this context, an approach for mapping asset administration shells to OpenAPI Specifications is proposed for interoperable and semantic integration of existing services and legacy systems. Lastly, challenges and potential solutions for data integration are discussed, considering the present heterogeneity of data sources in manufacturing.
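As a minimal sketch of the encapsulation idea, a PPC task such as lot sizing might be exposed as a single service operation described by an OpenAPI-style fragment; the function and field names below are hypothetical and do not reproduce the paper's actual mapping from asset administration shells to OpenAPI:

```python
# Hypothetical sketch: wrap one PPC task as a reusable service and emit a
# minimal OpenAPI-style description for it, so a production planner can
# invoke the task without knowing its implementation details.

def ppc_service_to_openapi(name, inputs, outputs):
    """Describe a PPC task (e.g. lot sizing) as one POST operation.
    inputs: dict of parameter name -> JSON schema type; outputs: summary."""
    return {
        "paths": {
            f"/{name}": {
                "post": {
                    "operationId": name,
                    "requestBody": {"content": {"application/json": {
                        "schema": {
                            "type": "object",
                            "properties": {k: {"type": v}
                                           for k, v in inputs.items()},
                        }}}},
                    "responses": {"200": {"description": ", ".join(outputs)}},
                }
            }
        }
    }

spec = ppc_service_to_openapi(
    "lot_sizing",
    inputs={"demand": "array", "setup_cost": "number"},
    outputs=["lot sizes per period"],
)
print(spec["paths"]["/lot_sizing"]["post"]["operationId"])  # lot_sizing
```

In the architecture described above, such descriptions would be derived from asset administration shell submodels rather than written by hand, which is what enables the semantic integration of legacy systems.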