59 research outputs found

    Evaluating Symbolic AI as a Tool to Understand Cell Signalling

    Get PDF
    The diverse and highly complex nature of modern phosphoproteomics research produces a high volume of data. Chemical phosphoproteomics in particular is amenable to a variety of analytical approaches. In this thesis we evaluate novel Symbolic AI-based algorithms as potential tools in the analysis of cell signalling. Initially we developed a first-order, deductive, logic-based model. This allowed us to identify previously unreported inhibitor-kinase relationships which could offer novel therapeutic targets for further investigation. Following this we made use of the probabilistic reasoning of ProbLog to augment the aforementioned Prolog-based model with an intuitively calculated degree of belief. This allowed us to rank previous associations while also further increasing our confidence in already established predictions. Finally we applied our methodology to a Saccharomyces cerevisiae gene-perturbation phosphoproteomics dataset. In this context we were able to confirm the majority of ground truths, i.e. that the gene deletions had taken place as intended. For the remaining deletions, again using a purely symbolic approach, we were able to provide predictions on the rewiring of kinase-based signalling networks following the deletion of kinase-encoding genes. The explainable, human-readable and white-box nature of this approach was highlighted; however, its brittleness in the face of missing, inconsistent or conflicting background knowledge was also examined.
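
    As a minimal sketch of the kind of ProbLog reasoning described above (the predicates, probabilities and inhibitor-kinase facts below are hypothetical toys, not results from the thesis), probabilistic facts carry a degree of belief and queries return probabilities:

        from problog.program import PrologString
        from problog import get_evaluatable

        # Hypothetical toy knowledge base: each fact holds with a degree of belief.
        model = PrologString("""
        0.9::inhibits(drug_a, kinase_k1).
        0.4::inhibits(drug_a, kinase_k2).
        0.8::phosphorylates(kinase_k1, substrate_s).

        % A substrate's signal is affected if some inhibited kinase targets it.
        affected(Drug, Sub) :- inhibits(Drug, K), phosphorylates(K, Sub).

        query(affected(drug_a, substrate_s)).
        """)

        # Evaluation maps each queried atom to its probability.
        for atom, prob in get_evaluatable().create_from(model).evaluate().items():
            print(atom, prob)  # affected(drug_a,substrate_s): 0.9 * 0.8 = 0.72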

    Reasoning on Feature Models: Compilation-Based vs. Direct Approaches

    Full text link
    Analyzing a Feature Model (FM) and reasoning on the corresponding configuration space is a central task in Software Product Line (SPL) engineering. Problems such as deciding the satisfiability of the FM and eliminating inconsistent parts of the FM have been well resolved by translating the FM into a conjunctive normal form (CNF) formula and then feeding the CNF to a SAT solver. However, this approach has limits for other important reasoning tasks, such as counting or enumerating configurations. Two mainstream approaches have been investigated in this direction: (i) direct approaches, using tools based on the CNF representation of the FM at hand, or (ii) compilation-based approaches, where the CNF representation of the FM is first translated into another representation for which the reasoning queries are easier to address. Our contribution is twofold. First, we evaluate how the two approaches compare on common reasoning operations on FMs, namely counting configurations, pointing out one or several configurations, sampling configurations, and finding optimal configurations with regard to a utility function. Our experimental results show that the compilation-based approach is efficient enough to be competitive with the direct approaches and that the cost of translation (i.e., the compilation time) can be amortised when sufficiently many complex reasoning operations are addressed on large configuration spaces. Second, we provide a Java-based automated reasoner that supports these operations for both approaches, thus eliminating the burden of selecting the appropriate tool and approach depending on the operation one wants to perform.
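
    For a flavour of the direct approach (the three-feature model and naive enumerator below are hypothetical illustrations, not the tools evaluated in the paper), the configurations of a CNF-encoded feature model can be counted by brute force:

        from itertools import product

        # Hypothetical feature model: Car with mandatory Engine and optional GPS,
        # plus the cross-tree constraint "GPS requires Engine".
        # Variables: 1 = Car, 2 = Engine, 3 = GPS; clauses are signed integers.
        cnf = [
            [1],               # the root feature Car is always selected
            [-1, 2], [-2, 1],  # Engine is mandatory: Car <-> Engine
            [-3, 1],           # GPS is a child feature of Car
            [-3, 2],           # GPS requires Engine
        ]
        n_vars = 3

        def satisfies(assignment, clauses):
            # assignment maps variable -> bool; a clause needs one true literal
            return all(any((lit > 0) == assignment[abs(lit)] for lit in clause)
                       for clause in clauses)

        count = sum(satisfies(dict(enumerate(bits, start=1)), cnf)
                    for bits in product([False, True], repeat=n_vars))
        print(count)  # 2: {Car, Engine} and {Car, Engine, GPS}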

    Generalising weighted model counting

    Get PDF
    Given a formula in propositional or (finite-domain) first-order logic and some non-negative weights, weighted model counting (WMC) is a function problem that asks to compute the sum of the weights of the models of the formula. Originally used as a flexible way of performing probabilistic inference on graphical models, WMC has found many applications across artificial intelligence (AI), machine learning, and other domains. Areas of AI that rely on WMC include explainable AI, neural-symbolic AI, probabilistic programming, and statistical relational AI. WMC also has applications in bioinformatics, data mining, natural language processing, prognostics, and robotics. In this work, we are interested in revisiting the foundations of WMC and considering generalisations of some of the key definitions in the interest of conceptual clarity and practical efficiency. We begin by developing a measure-theoretic perspective on WMC, which suggests a new and more general way of defining the weights of an instance. This new representation can be as succinct as standard WMC but can also expand as needed to represent less-structured probability distributions. We demonstrate the performance benefits of the new format by developing a novel WMC encoding for Bayesian networks. We then show how existing WMC encodings for Bayesian networks can be transformed into this more general format and what conditions ensure that the transformation is correct (i.e., preserves the answer). Combining the strengths of the more flexible representation with the tricks used in existing encodings yields further efficiency improvements in Bayesian network probabilistic inference. Next, we turn our attention to the first-order setting. Here, we argue that the capabilities of practical model counting algorithms are severely limited by their inability to perform arbitrary recursive computations. To enable arbitrary recursion, we relax the restrictions that typically accompany domain recursion and generalise circuits (used to express a solution to a model counting problem) to graphs that are allowed to have cycles. These improvements enable us to find efficient solutions to counting fundamental structures such as injections and bijections that were previously unsolvable by any available algorithm. The second strand of this work is concerned with synthetic data generation. Testing algorithms across a wide range of problem instances is crucial to ensure the validity of any claim about one algorithm’s superiority over another. However, benchmarks are often limited and fail to reveal differences among the algorithms. First, we show how random instances of probabilistic logic programs (that typically use WMC algorithms for inference) can be generated using constraint programming. We also introduce a new constraint to control the independence structure of the underlying probability distribution and provide a combinatorial argument for the correctness of the constraint model. This model allows us to, for the first time, experimentally investigate inference algorithms on more than just a handful of instances. Second, we introduce a random model for WMC instances with a parameter that influences primal treewidth—the parameter most commonly used to characterise the difficulty of an instance. We show that the easy-hard-easy pattern with respect to clause density is different for algorithms based on dynamic programming and algebraic decision diagrams than for all other solvers. We also demonstrate that all WMC algorithms scale exponentially with respect to primal treewidth, although at differing rates.
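
    As a worked toy instance of the standard WMC definition (the formula and weights are invented for illustration), the weighted model count sums, over all models of the formula, the product of the weights of the literals each model makes true:

        from itertools import product

        # Hypothetical instance: formula (x or y), with literal weights
        # given as (weight of v, weight of not v) for each variable v.
        weights = {"x": (0.3, 0.7), "y": (0.6, 0.4)}

        def formula(x, y):
            return x or y

        wmc = 0.0
        for x, y in product([True, False], repeat=2):
            if formula(x, y):
                wmc += (weights["x"][0] if x else weights["x"][1]) * \
                       (weights["y"][0] if y else weights["y"][1])
        print(wmc)  # 1 - 0.7 * 0.4 = 0.72, since only (not x, not y) is excluded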

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Circuit Testing Based on Fuzzy Sampling with BDD Bases

    Get PDF
    Fuzzy testing of integrated circuits is an established technique. Current approaches generate an approximately uniform random sample from a translation of the circuit to Boolean logic. These approaches have serious scalability issues, which become more pressing with the ever-increasing size of circuits. We propose using a base of binary decision diagrams to sample the translations as a soft computing approach. Uniformity is guaranteed by design and scalability is greatly improved. We test our approach against five other state-of-the-art tools and find that our tool outperforms all of them, in both performance and scalability.
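
    A rough sketch of the counting idea behind uniform BDD sampling (the hand-rolled diagram below is a toy, not the paper's tool): once model counts are available at each node, a single weighted top-down walk yields a uniformly random satisfying assignment by construction:

        import random

        # Hand-rolled toy BDD for f(x1, x2, x3) = x1 or (x2 and x3), with
        # variable order x1 < x2 < x3. A node is (level, lo_child, hi_child);
        # terminals are the Python booleans True / False.
        n = 3
        node_x3 = (3, False, True)
        node_x2 = (2, False, node_x3)
        root = (1, node_x2, True)

        def count(node, level):
            # Number of models over variables level..n below this node.
            if isinstance(node, bool):
                return (1 << (n - level + 1)) if node else 0
            var, lo, hi = node
            skipped = var - level  # variables jumped over are unconstrained
            return (1 << skipped) * (count(lo, var + 1) + count(hi, var + 1))

        def sample():
            # Draw one model uniformly: branch in proportion to model counts.
            assignment, node, level = {}, root, 1
            while level <= n:
                if isinstance(node, bool) or node[0] > level:
                    assignment[level] = random.random() < 0.5  # free variable
                    level += 1
                    continue
                _, lo, hi = node
                c_lo, c_hi = count(lo, level + 1), count(hi, level + 1)
                go_hi = random.random() < c_hi / (c_lo + c_hi)
                assignment[level] = go_hi
                node = hi if go_hi else lo
                level += 1
            return assignment

        print(count(root, 1))  # 5 models
        print(sample())        # e.g. {1: True, 2: False, 3: True}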

    Generating Random Instances of Weighted Model Counting: An Empirical Analysis with Varying Primal Treewidth

    Get PDF

    IASCAR: Incremental Answer Set Counting by Anytime Refinement

    Full text link
    Answer set programming (ASP) is a popular declarative programming paradigm with various applications. Programs can easily have many answer sets that cannot be enumerated in practice, but counting still allows quantifying solution spaces. If one counts under assumptions on literals, one obtains a tool to comprehend parts of the solution space, so-called answer set navigation. However, navigating through parts of the solution space requires counting many times, which is expensive in theory. Knowledge compilation compiles instances into representations on which counting works in polynomial time. However, these techniques exist only for CNF formulas, and compiling ASP programs into CNF formulas can introduce an exponential overhead. This paper introduces a technique to iteratively count answer sets under assumptions on knowledge compilations of CNFs that encode supported models. Our anytime technique uses the inclusion-exclusion principle to improve bounds by systematically over- and undercounting. In a preliminary empirical analysis, we demonstrate promising results: after compiling the input (offline phase), our approach quickly (re)counts. Under consideration in Theory and Practice of Logic Programming (TPLP).
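
    To see the inclusion-exclusion mechanism in isolation (the model sets below are a hypothetical stand-in for compiled counts, not the paper's benchmarks), truncating the sum after each term yields alternating upper and lower bounds that tighten systematically:

        from itertools import combinations

        # Hypothetical model sets for three assumption literals: models[l] is
        # the set of model ids satisfying literal l.
        models = {
            "a": {1, 2, 3, 5},
            "b": {2, 3, 6},
            "c": {3, 5, 6, 7},
        }

        exact = len(models["a"] | models["b"] | models["c"])

        # Bonferroni bounds: truncating inclusion-exclusion after k terms
        # over-counts for odd k and under-counts for even k.
        running = 0
        for k in range(1, len(models) + 1):
            term = sum(len(set.intersection(*(models[l] for l in subset)))
                       for subset in combinations(models, k))
            running += term if k % 2 == 1 else -term
            kind = "upper" if k % 2 == 1 else "lower"
            print(f"after {k} term(s): {running} ({kind} bound; exact = {exact})")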

    Proof Complexity of Propositional Model Counting

    Get PDF
    Recently, the proof system MICE for the model counting problem #SAT was introduced by Fichte, Hecher and Roland (SAT'22). As demonstrated by Fichte et al., the system MICE can be used for proof logging for state-of-the-art #SAT solvers. We perform a proof-complexity study of MICE. For this we first simplify the rules of MICE and obtain a calculus MICE' that is polynomially equivalent to MICE. Our main result establishes an exponential lower bound for the number of proof steps in MICE' (and hence also in MICE) for a specific family of CNFs.

    Certified Knowledge Compilation with Application to Verified Model Counting

    Get PDF
    Computing many useful properties of Boolean formulas, such as their weighted or unweighted model count, is intractable on general representations. It can become tractable when formulas are expressed in a special form, such as decision-decomposable negation normal form (dec-DNNF). Knowledge compilation is the process of converting a formula into such a form. Unfortunately, existing knowledge compilers provide no guarantee that their output correctly represents the original formula, and therefore they cannot validate a model count or any other computed value. We present Partitioned-Operation Graphs (POGs), a form that can encode all of the representations used by existing knowledge compilers. We have designed CPOG, a framework that can express proofs of equivalence between a POG and a Boolean formula in conjunctive normal form (CNF). We have developed a program that generates POG representations from the dec-DNNF graphs produced by the state-of-the-art knowledge compiler D4, as well as checkable CPOG proofs certifying that the output POGs are equivalent to the input CNF formulas. Our toolchain for generating and verifying POGs scales to all but the largest graphs produced by D4 for formulas from a recent model counting competition. Additionally, we have developed a formally verified CPOG checker and model counter for POGs in the Lean 4 proof assistant. In doing so, we proved the soundness of our proof framework. These programs comprise the first formally verified toolchain for weighted and unweighted model counting.
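
    As a minimal sketch of why such forms are tractable (the toy circuit below is hypothetical, not output of D4 or CPOG), counting the models of a decomposable, deterministic circuit needs only products at AND nodes and sums at decision nodes:

        from math import prod

        # Toy dec-DNNF for f = (x1 and x2) or (not x1 and x3): a decision on x1.
        # Nodes: ("lit", +v/-v), ("and", children), ("dec", children), where the
        # children of a decision node fix the decision variable to opposite values.
        circuit = ("dec", [
            ("and", [("lit", 1), ("lit", 2)]),    # x1 = true branch
            ("and", [("lit", -1), ("lit", 3)]),   # x1 = false branch
        ])
        n = 3  # total number of variables

        def count(node):
            """Return (model count over the node's own variables, variable set)."""
            kind, arg = node
            if kind == "lit":
                return 1, {abs(arg)}
            counts, varsets = zip(*(count(c) for c in arg))
            scope = set().union(*varsets)
            if kind == "and":  # decomposable: child scopes are disjoint -> multiply
                return prod(counts), scope
            # deterministic decision: scale each child to the joint scope, then add
            return sum(c << (len(scope) - len(vs))
                       for c, vs in zip(counts, varsets)), scope

        c, scope = count(circuit)
        print(c << (n - len(scope)))  # 4 models of f over x1..x3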

    A Quantitative Flavour of Robust Reachability

    Full text link
    Many software analysis techniques attempt to determine whether bugs are reachable, but for security purposes this is only part of the story, as it does not indicate whether the bugs found could easily be triggered by an attacker. The recently introduced notion of robust reachability aims at filling this gap by distinguishing the inputs controlled by the attacker from those that are not. Yet, this qualitative notion may be too strong in practice, leaving aside bugs which are mostly but not fully replicable. We aim here at proposing a quantitative version of robust reachability, more flexible and still amenable to automation. We propose quantitative robustness, a metric expressing how easily an attacker can trigger a bug while taking into account that they can only influence part of the program input, together with a dedicated quantitative symbolic execution technique (QRSE). Interestingly, QRSE relies on a variant of model counting (namely, functional E-MAJSAT) unseen so far in formal verification, but which has been studied in AI domains such as Bayesian networks, knowledge representation and probabilistic planning. Yet, the existing solving methods from these fields turn out to be unsatisfactory for formal verification purposes, leading us to propose a novel parametric method. These results have been implemented and evaluated on two security-relevant case studies, allowing us to demonstrate the feasibility and relevance of our ideas.
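
    To make the underlying counting problem concrete (the 4-bit bug condition below is a hypothetical toy; QRSE itself avoids such brute force), functional E-MAJSAT asks for the attacker-controlled input that maximises how many uncontrolled inputs trigger the bug:

        BITS = 4

        # Hypothetical toy bug condition: `a` is the attacker-controlled input,
        # `x` the uncontrolled one; the bug fires when they share a set bit.
        def bug(a, x):
            return (a & x) != 0

        # Brute-force functional E-MAJSAT: maximise over `a` the number of
        # uncontrolled `x` reaching the bug (real solvers avoid enumeration).
        best_a, best_count = max(
            ((a, sum(bug(a, x) for x in range(1 << BITS)))
             for a in range(1 << BITS)),
            key=lambda pair: pair[1],
        )
        # Quantitative robustness: probability, over uniform uncontrolled input,
        # of triggering the bug under the best attacker choice.
        print(best_a, best_count / (1 << BITS))  # a = 0b1111 triggers 15/16 inputs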