41 research outputs found

    Distinguishing Hidden Markov Chains

    Hidden Markov Chains (HMCs) are commonly used mathematical models of probabilistic systems. They are employed in various fields such as speech recognition, signal processing, and biological sequence analysis. We consider the problem of distinguishing two given HMCs based on an observation sequence that one of the HMCs generates. More precisely, given two HMCs and an observation sequence, a distinguishing algorithm is expected to identify the HMC that generates the observation sequence. Two HMCs are called distinguishable if for every ε > 0 there is a distinguishing algorithm whose error probability is less than ε. We show that one can decide in polynomial time whether two HMCs are distinguishable. Further, we present and analyze two distinguishing algorithms for distinguishable HMCs. The first algorithm makes a decision after processing a fixed number of observations, and it exhibits two-sided error. The second algorithm processes an unbounded number of observations, but the algorithm has only one-sided error. The error probability, for both algorithms, decays exponentially with the number of processed observations. We also provide an algorithm for distinguishing multiple HMCs. Finally, we discuss an application in stochastic runtime verification. Comment: This is the full version of a LICS'16 paper.
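
    The abstract does not spell out its two algorithms, so the following is only a minimal sketch of the general idea of distinguishing by likelihood. It assumes state-emitting HMCs given as (initial distribution, transition matrix, emission matrix), which may differ from the paper's definition, and it simply picks the model with the higher forward-algorithm likelihood after a fixed number of observations. The names log_likelihood and distinguish are hypothetical.

```python
import math

def log_likelihood(init, trans, emit, observations):
    """Scaled forward algorithm: returns log P(observations | model)."""
    if not observations:
        return 0.0
    n = len(init)
    # alpha[s] is proportional to P(current state = s, observations so far).
    alpha = [init[s] * emit[s][observations[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][observations[t]]
                for s in range(n)
            ]
        total = sum(alpha)
        if total == 0.0:
            return float("-inf")  # the model cannot generate this sequence
        log_prob += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return log_prob

def distinguish(model_a, model_b, observations):
    """Return "A" or "B": the model under which the observations are more likely."""
    la = log_likelihood(*model_a, observations)
    lb = log_likelihood(*model_b, observations)
    return "A" if la >= lb else "B"

# Two toy single-state models over the alphabet {0, 1} (illustrative values only).
MODEL_A = ([1.0], [[1.0]], [[0.9, 0.1]])
MODEL_B = ([1.0], [[1.0]], [[0.1, 0.9]])
```

    For example, distinguish(MODEL_A, MODEL_B, [0, 0, 0, 1]) returns "A", since three 0s and one 1 are far more likely under the first toy model. A decision of this kind can be wrong in either direction, which corresponds to the two-sided error the abstract mentions for its fixed-length algorithm.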

    Predictive Runtime Verification of Stochastic Systems

    Runtime Verification (RV) is the formal analysis of the execution of a system against some properties at runtime. RV is particularly useful for stochastic systems that have a non-zero probability of failure at runtime. The standard RV assumes constructing a monitor that checks only the currently observed execution of the system against the given properties. This dissertation proposes a framework for predictive RV, where the monitor instead checks the current execution with its finite extensions against some property. The extensions are generated using a prediction model, that is built based on execution samples randomly generated from the system. The thesis statement is that predictive RV for stochastic systems is feasible, effective, and useful. The feasibility is demonstrated by providing a framework, called Prevent, that builds a predictive monitor by using trained prediction models to finitely extend an execution path, and computing the probabilities of the extensions that satisfy or violate the given property. The prediction model is trained using statistical learning techniques from independent and identically distributed samples of system executions. The prediction is the result of a quantitative bounded reachability analysis on the product of the prediction model and the automaton specifying the property. The analysis results are computed offline and stored in a lookup table. At runtime the monitor obtains the state of the system on the prediction model based on the observed execution, directly or by approximation, and uses the lookup table to retrieve the computed probability that the system at the current state will satisfy or violate the given property within some finite number of steps. The effectiveness of Prevent is shown by applying abstraction when constructing the prediction model. The abstraction is on the observation space based on extracting the symmetry relation between symbols that have similar probabilities to satisfy a property. 
The abstraction may introduce nondeterminism in the final model, which is handled by using a hidden state variable when building the prediction model. We also demonstrate that, under the convergence conditions of the learning algorithms, the prediction results from the abstract models are the same as the concrete models. Finally, the usefulness of Prevent is indicated in real-world applications by showing how it can be applied for predicting rare properties, properties with very low but non-zero probability of satisfaction. More specifically, we adjust the training algorithm that uses the samples generated by importance sampling to generate the prediction models for rare properties without increasing the number of samples and without having a negative impact on the prediction accuracy

    Techniques for automated parameter estimation in computational models of probabilistic systems

    The main contribution of this dissertation is the design of two new algorithms for automatically synthesizing values of numerical parameters of computational models of complex stochastic systems such that the resultant model meets user-specified behavioral specifications. These algorithms are designed to operate on probabilistic systems – systems that, in general, behave differently under identical conditions. The algorithms work using an approach that combines formal verification and mathematical optimization to explore a model's parameter space. The problem of determining whether a model instantiated with a given set of parameter values satisfies the desired specification is first defined using formal verification terminology, and then reformulated in terms of statistical hypothesis testing. Parameter space exploration involves determining the outcome of the hypothesis testing query for each parameter point and is guided using simulated annealing. The first algorithm uses the sequential probability ratio test (SPRT) to solve the hypothesis testing problems, whereas the second algorithm uses an approach based on Bayesian statistical model checking (BSMC). The SPRT-based parameter synthesis algorithm was used to validate that a given model of glucose-insulin metabolism has the capability of representing diabetic behavior by synthesizing values of three parameters that ensure that the glucose-insulin subsystem spends at least 20 minutes in a diabetic scenario. The BSMC-based algorithm was used to discover the values of parameters in a physiological model of the acute inflammatory response that guarantee a set of desired clinical outcomes. These two applications demonstrate how our algorithms use formal verification, statistical hypothesis testing and mathematical optimization to automatically synthesize parameters of complex probabilistic models in order to meet user-specified behavioral properties.
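
    To illustrate the sequential probability ratio test named above, here is a minimal sketch of Wald's SPRT on Bernoulli samples. It assumes each sample records whether one simulated run satisfied the specification, and the function name sprt, the hypotheses p0 and p1, and the error bounds alpha and beta are illustrative choices, not the dissertation's settings.

```python
import math

def sprt(samples, p0, p1, alpha=0.01, beta=0.01):
    """Wald's SPRT for H0: p = p0 against H1: p = p1, with 0 < p0, p1 < 1.

    samples: iterable of truthy/falsy outcomes (run satisfied the property or not).
    alpha, beta: target probabilities of wrongly rejecting H0 and H1, respectively.
    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1 - beta) / alpha)   # at or above this: accept H1
    lower = math.log(beta / (1 - alpha))   # at or below this: accept H0
    llr = 0.0                              # running log-likelihood ratio, H1 over H0
    used = 0
    for x in samples:
        used += 1
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", used
        if llr <= lower:
            return "H0", used
    return "undecided", used
```

    The test stops as soon as the accumulated evidence crosses either threshold, so clear-cut parameter points need only a handful of simulations; this is what makes a sequential test attractive when the query has to be repeated at many points of a parameter space.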

    Parameter Invariant Monitoring for Signal Temporal Logic

    Signal Temporal Logic (STL) is a prominent specification formalism for real-time systems, and monitoring these specifications is important, especially when the behavior of a system can change over time (for reasons such as learning). There are three main challenges in this area: (1) full observation of the system state is not possible due to noise or nuisance parameters, (2) the whole execution is not available during monitoring, and (3) the computational complexity of monitoring continuous-time signals is very high. Although each of these challenges has been addressed by different works, to the best of our knowledge, no one has addressed them all together. In this paper, we show how to extend any parameter invariant test procedure for single points in time to a parameter invariant test procedure for efficiently monitoring continuous-time executions of a system against STL properties. We also show how to extend the probabilistic error guarantee of the input test procedure to a probabilistic error guarantee for the constructed test procedure.

    Automated parameter estimation for biological models using Bayesian statistical model checking

    Background: Probabilistic models have gained widespread acceptance in the systems biology community as a useful way to represent complex biological systems. Such models are developed using existing knowledge of the structure and dynamics of the system, experimental observations, and inferences drawn from statistical analysis of empirical data. A key bottleneck in building such models is that some system variables cannot be measured experimentally. These variables are incorporated into the model as numerical parameters. Determining values of these parameters that justify existing experiments and provide reliable predictions when model simulations are performed is a key research problem. Domain experts usually estimate the values of these parameters by fitting the model to experimental data. Model fitting is usually expressed as an optimization problem that requires minimizing a cost-function which measures some notion of distance between the model and the data. This optimization problem is often solved by combining local and global search methods that tend to perform well for the specific application domain. When some prior information about parameters is available, methods such as Bayesian inference are commonly used for parameter learning. Choosing the appropriate parameter search technique requires detailed domain knowledge and insight into the underlying system. Results: Using an agent-based model of the dynamics of acute inflammation, we demonstrate a novel parameter estimation algorithm by discovering the amount and schedule of doses of bacterial lipopolysaccharide that guarantee a set of observed clinical outcomes with high probability. We synthesized values of twenty-eight unknown parameters such that the parameterized model instantiated with these parameter values satisfies four specifications describing the dynamic behavior of the model. 
Conclusions: We have developed a new algorithmic technique for discovering parameters in complex stochastic models of biological systems given behavioral specifications written in a formal mathematical logic. Our algorithm uses Bayesian model checking, sequential hypothesis testing, and stochastic optimization to automatically synthesize parameters of probabilistic biological models

    Probably Safe or Live

    This paper presents a formal characterisation of safety and liveness properties à la Alpern and Schneider for fully probabilistic systems. As for the classical setting, it is established that any (probabilistic tree) property is equivalent to a conjunction of a safety and liveness property. A simple algorithm is provided to obtain such property decomposition for flat probabilistic CTL (PCTL). A safe fragment of PCTL is identified that provides a sound and complete characterisation of safety properties. For liveness properties, we provide two PCTL fragments, a sound and a complete one. We show that safety properties only have finite counterexamples, whereas liveness properties have none. We compare our characterisation for qualitative properties with the one for branching time properties by Manolios and Trefler, and present sound and complete PCTL fragments for characterising the notions of strong safety and absolute liveness coined by Sistla.

    A Survey of Challenges for Runtime Verification from Advanced Application Domains (Beyond Software)

    Runtime verification is an area of formal methods that studies the dynamic analysis of execution traces against formal specifications. Typically, the two main activities in runtime verification efforts are the process of creating monitors from specifications, and the algorithms for the evaluation of traces against the generated monitors. Other activities involve the instrumentation of the system to generate the trace and the communication between the system under analysis and the monitor. Most of the applications in runtime verification have been focused on the dynamic analysis of software, even though there are many more potential applications to other computational devices and target systems. In this paper we present a collection of challenges for runtime verification extracted from concrete application domains, focusing on the difficulties that must be overcome to tackle these specific challenges. The computational models that characterize these domains require devising new techniques beyond the current state of the art in runtime verification.

    Formal methods paradigms for estimation and machine learning in dynamical systems

    Formal methods are widely used in engineering to determine whether a system exhibits a certain property (verification) or to design controllers that are guaranteed to drive the system to achieve a certain property (synthesis). Most existing techniques require a large amount of accurate information about the system in order to be successful. The methods presented in this work can operate with significantly less prior information. In the domain of formal synthesis for robotics, the assumptions of perfect sensing and perfect knowledge of system dynamics are unrealistic. To address this issue, we present control algorithms that use active estimation and reinforcement learning to mitigate the effects of uncertainty. In the domain of cyber-physical system analysis, we relax the assumption that the system model is known and identify system properties automatically from execution data. First, we address the problem of planning the path of a robot under temporal logic constraints (e.g. "avoid obstacles and periodically visit a recharging station") while simultaneously minimizing the uncertainty about the state of an unknown feature of the environment (e.g. locations of fires after a natural disaster). We present synthesis algorithms and evaluate them via simulation and experiments with aerial robots. Second, we develop a new specification language for tasks that require gathering information about and interacting with a partially observable environment, e.g. "Maintain localization error below a certain level while also avoiding obstacles." Third, we consider learning temporal logic properties of a dynamical system from a finite set of system outputs. For example, given maritime surveillance data we wish to find the specification that corresponds only to those vessels that are deemed law-abiding. Algorithms for performing off-line supervised and unsupervised learning and on-line supervised learning are presented.
Finally, we consider the case in which we want to steer a system with unknown dynamics to satisfy a given temporal logic specification. We present a novel reinforcement learning paradigm to solve this problem. Our procedure gives "partial credit" for executions that almost satisfy the specification, which can lead to faster convergence rates and produce better solutions when the specification is not satisfiable.

    Enhancing the information content of geophysical data for nuclear site characterisation

    Our knowledge and understanding of the heterogeneous structure and processes occurring in the Earth’s subsurface is limited and uncertain. This is true even for the upper 100 m of the subsurface, yet many processes that occur within it (e.g. migration of solutes, landslides, crop water uptake, etc.) are important to human activities. Geophysical methods such as electrical resistivity tomography (ERT) greatly improve our ability to observe the subsurface due to their higher sampling frequency (especially with autonomous time-lapse systems), larger spatial coverage and less invasive operation, in addition to being more cost-effective than traditional point-based sampling. However, the process of using geophysical data for inference is prone to uncertainty. There is a need to better understand the uncertainties embedded in geophysical data and how they propagate when the data are subsequently used, for example, for hydrological or site management interpretations and decisions. This understanding is critical to maximize the extraction of information from geophysical data. To this end, in this thesis, I examine various aspects of uncertainty in ERT and develop new methods to better use geophysical data quantitatively. The core of the thesis is based on two literature reviews and three papers. In the first review, I provide a comprehensive overview of the use of geophysical data for nuclear site characterization, especially in the context of site clean-up and leak detection. In the second review, I survey the various sources of uncertainties in ERT studies and the existing work to better quantify or reduce them. I propose that the various steps in the general workflow of an ERT study can be viewed as a pipeline for information and uncertainty propagation and suggest that some areas have been understudied. One of these areas is measurement errors.
In paper 1, I compare various methods to estimate and model ERT measurement errors using two long-term ERT monitoring datasets. I also develop a new error model that considers the fact that each electrode is used to make multiple measurements. In paper 2, I discuss the development and implementation of a new method for geoelectrical leak detection. While existing methods rely on obtaining resistivity images through inversion of ERT data first, the approach described here estimates leak parameters directly from raw ERT data. This is achieved by constructing hydrological models from prior site information, coupling them with an ERT forward model, and then updating the leak (and other hydrological) parameters through data assimilation. The approach shows promising results and is applied to data from a controlled injection experiment in Yorkshire, UK. The approach complements ERT imaging and provides a new way to utilize ERT data to inform site characterisation. In addition to leak detection, ERT is also commonly used for monitoring soil moisture in the vadose zone, and increasingly so in a quantitative manner. Though both the petrophysical relationships (i.e., choices of appropriate model and parameterization) and the derived moisture content are known to be subject to uncertainty, they are commonly treated as exact and error-free. In paper 3, I examine the impact of uncertain petrophysical relationships on the moisture content estimates derived from electrical geophysics. Data from a collection of core samples show that the variability in such relationships can be large, that it can in turn lead to high uncertainty in moisture content estimates, and that it appears to be the dominating source of uncertainty in many cases. In the closing chapters, I discuss and synthesize the findings in the thesis within the larger context of enhancing the information content of geophysical data, and provide an outlook on further research in this topic.