8 research outputs found
Scalable Verification of Markov Decision Processes
Markov decision processes (MDP) are useful to model concurrent process
optimisation problems, but verifying them with numerical methods is often
intractable. Existing approximative approaches do not scale well and are
limited to memoryless schedulers. Here we present the basis of scalable
verification for MDPSs, using an O(1) memory representation of
history-dependent schedulers. We thus facilitate scalable learning techniques
and the use of massively parallel verification.Comment: V4: FMDS version, 12 pages, 4 figure
Smart Sampling for Lightweight Verification of Markov Decision Processes
Markov decision processes (MDP) are useful to model optimisation problems in
concurrent systems. To verify MDPs with efficient Monte Carlo techniques
requires that their nondeterminism be resolved by a scheduler. Recent work has
introduced the elements of lightweight techniques to sample directly from
scheduler space, but finding optimal schedulers by simple sampling may be
inefficient. Here we describe "smart" sampling algorithms that can make
substantial improvements in performance.Comment: IEEE conference style, 11 pages, 5 algorithms, 11 figures, 1 tabl
Certified Reinforcement Learning with Logic Guidance
This paper proposes the first model-free Reinforcement Learning (RL)
framework to synthesise policies for unknown, and continuous-state Markov
Decision Processes (MDPs), such that a given linear temporal property is
satisfied. We convert the given property into a Limit Deterministic Buchi
Automaton (LDBA), namely a finite-state machine expressing the property.
Exploiting the structure of the LDBA, we shape a synchronous reward function
on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces
that probabilistically satisfy the linear temporal property. This probability
(certificate) is also calculated in parallel with policy learning when the
state space of the MDP is finite: as such, the RL algorithm produces a policy
that is certified with respect to the property. Under the assumption of finite
state space, theoretical guarantees are provided on the convergence of the RL
algorithm to an optimal policy, maximising the above probability. We also show
that our method produces ''best available'' control policies when the logical
property cannot be satisfied. In the general case of a continuous state space,
we propose a neural network architecture for RL and we empirically show that
the algorithm finds satisfying policies, if there exist such policies. The
performance of the proposed framework is evaluated via a set of numerical
examples and benchmarks, where we observe an improvement of one order of
magnitude in the number of iterations required for the policy synthesis,
compared to existing approaches whenever available.Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
Approximate planning and verification for large Markov decision processes
International audienc
Computer Aided Verification
This open access two-volume set LNCS 11561 and 11562 constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented together with 13 tool papers and 2 case studies, were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections: Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems, runtime techniques; dynamical, hybrid, and reactive systems; Part II: logics, decision procedures; and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency
Tools and techniques for analysing the impact of information security
PhD ThesisThe discipline of information security is employed by organisations to protect the confidentiality,
integrity and availability of information, often communicated in the form of
information security policies. A policy expresses rules, constraints and procedures to guard
against adversarial threats and reduce risk by instigating desired and secure behaviour of
those people interacting with information legitimately. To keep aligned with a dynamic threat
landscape, evolving business requirements, regulation updates, and new technologies a policy
must undergo periodic review and change. Chief Information Security Officers (CISOs) are
the main decision makers on information security policies within an organisation. Making
informed policy modifications involves analysing and therefore predicting the impact of those
changes on the success rate of business processes often expressed as workflows. Security
brings an added burden to completing a workflow. Adding a new security constraint may
reduce success rate or even eliminate it if a workflow is always forced to terminate early. This
can increase the chances of employees bypassing or violating a security policy. Removing an
existing security constraint may increase success rate but may may also increase the risk to
security. A lack of suitably aimed impact analysis tools and methodologies for CISOs means
impact analysis is currently a somewhat manual and ambiguous procedure. Analysis can
be overwhelming, time consuming, error prone, and yield unclear results, especially when
workflows are complex, have a large workforce, and diverse security requirements. This
thesis considers the provision of tools and more formal techniques specific to CISOs to help
them analyse the impact modifying a security policy has on the success rate of a workflow.
More precisely, these tools and techniques have been designed to efficiently compare the
impact between two versions of a security policy applied to the same workflow, one before,
the other after a policy modification.
This work focuses on two specific types of security impact analysis. The first is quantitative
in nature, providing a measure of success rate for a security constrained workflow
which must be executed by employees who may be absent at runtime. This work considers
quantifying workflow resiliency which indicates a workflow’s expected success rate assuming
the availability of employees to be probabilistic. New aspects of quantitative resiliency are introduced in the form of workflow metrics, and risk management techniques to manage
workflows that must work with a resiliency below acceptable levels. Defining these risk
management techniques has led to exploring the reduction of resiliency computation time and
analysing resiliency in workflows with choice. The second area of focus is more qualitative,
in terms of facilitating analysis of how people are likely to behave in response to security
and how that behaviour can impact the success rate of a workflow at a task level. Large
amounts of information from disparate sources exists on human behavioural factors in a
security setting which can be aligned with security standards and structured within a single
ontology to form a knowledge base. Consultations with two CISOs have been conducted,
whose responses have driven the implementation of two new tools, one graphical, the other
Web-oriented allowing CISOs and human factors experts to record and incorporate their
knowledge directly within an ontology. The ontology can be used by CISOs to assess the
potential impact of changes made to a security policy and help devise behavioural controls
to manage that impact. The two consulted CISOs have also carried out an evaluation of the
Web-oriented tool.
vii