Automated verification of termination certificates
In order to increase user confidence, many automated theorem provers provide
certificates that can be independently verified. In this paper, we report on
our progress in developing a standalone tool for checking the correctness of
certificates for the termination of term rewrite systems, and formally proving
its correctness in the proof assistant Coq. To this end, we use the extraction
mechanism of Coq and the library on rewriting theory and termination called
CoLoR.
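The checking task this abstract describes can be illustrated with a small sketch. One common certificate format assigns each function symbol a polynomial interpretation, and the checker verifies that every rewrite rule strictly decreases under it. The encoding below is purely illustrative and unrelated to the actual Coq/CoLoR implementation; term and certificate representations are hypothetical.

```python
# Terms are tuples: ('var', name) for variables, (symbol, arg1, ...) otherwise.
# A certificate maps each symbol to a linear interpretation (constant, coeffs).

def interpret(term, interp):
    """Return the linear polynomial of a term as {var_name: coeff};
    the '' key holds the constant part."""
    if term[0] == 'var':
        return {term[1]: 1}
    const, coeffs = interp[term[0]]
    poly = {'': const}
    for c, arg in zip(coeffs, term[1:]):
        for mono, k in interpret(arg, interp).items():
            poly[mono] = poly.get(mono, 0) + c * k
    return poly

def rule_decreases(lhs, rhs, interp):
    """Check [lhs] > [rhs] for all non-negative variable assignments:
    every coefficient of lhs - rhs must be >= 0 and the constant part > 0."""
    l, r = interpret(lhs, interp), interpret(rhs, interp)
    diff = {m: l.get(m, 0) - r.get(m, 0) for m in set(l) | set(r)}
    return diff.get('', 0) > 0 and all(v >= 0 for v in diff.values())

# Example: rule f(x) -> g(x) with [f](x) = 2x + 2 and [g](x) = x + 1.
interp = {'f': (2, [2]), 'g': (1, [1])}
print(rule_decreases(('f', ('var', 'x')), ('g', ('var', 'x')), interp))  # True
```

A self-looping rule such as f(x) -> f(x) fails the check, since the difference polynomial is identically zero.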
On Differentiable Interpreters
Neural networks have transformed the fields of Machine Learning and Artificial Intelligence with the ability to model complex features and behaviours from raw data. They quickly became instrumental models, achieving numerous state-of-the-art performances across many tasks and domains. Yet the successes of these models often rely on large amounts of data. When data is scarce, resourceful ways of using background knowledge often help. However, though different types of background knowledge can be used to bias the model, it is not clear how one can use algorithmic knowledge to that end. In this thesis, we present differentiable interpreters as an effective framework for utilising algorithmic background knowledge as architectural inductive biases of neural networks. By continuously approximating discrete elements of traditional program interpreters, we create differentiable interpreters that, due to the continuous nature of their execution, are amenable to optimisation with gradient descent methods. This enables us to write code mixed with parametric functions, where the code strongly biases the behaviour of the model while enabling the training of parameters and/or input representations from data. We investigate two such differentiable interpreters and their use cases in this thesis. First, we present a detailed construction of ∂4, a differentiable interpreter for the programming language FORTH. We demonstrate the ability of ∂4 to strongly bias neural models with incomplete programs of variable complexity while learning missing pieces of the program with parametrised neural networks. Such models can learn to solve tasks and strongly generalise to out-of-distribution data from small datasets. Second, we present greedy Neural Theorem Provers (gNTPs), a significant improvement of the differentiable Datalog interpreter NTP.
gNTPs ameliorate the large computational cost of recursive differentiable interpretation, achieving drastic time and memory speedups while introducing soft reasoning over logic knowledge and natural language.
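The core trick behind differentiable interpretation can be sketched in a few lines: a discrete choice of operation is replaced by a softmax-weighted mixture of all candidate operations, so the whole execution becomes differentiable in the choice parameters. This toy is not the ∂4 or gNTP implementation; the machine with a single scalar register and three candidate operations is invented for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def soft_step(state, logits, ops):
    """Instead of executing one chosen op, execute all of them and
    blend the resulting states by their softmax weights."""
    w = softmax(logits)
    return sum(wi * op(state) for wi, op in zip(w, ops))

# Hypothetical machine: one scalar register, three candidate operations.
ops = [lambda s: s + 1.0, lambda s: s * 2.0, lambda s: s - 1.0]
logits = np.array([0.0, 5.0, 0.0])  # trainable; here heavily favouring "double"
print(soft_step(3.0, logits, ops))  # close to 6.0
```

Because `soft_step` is a smooth function of `logits`, gradient descent can push the weights towards the operation that best fits the training data, which is what lets an interpreter "learn the missing pieces of a program".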
Formal Verification of Security Protocol Implementations: A Survey
Automated formal verification of security protocols has mostly focused on analyzing high-level abstract models which, however, are significantly different from real protocol implementations written in programming languages. Recently, some researchers have started investigating techniques that bring automated formal proofs closer to real implementations. This paper surveys these attempts, focusing on approaches that target the application code that implements protocol logic, rather than the libraries that implement cryptography. In these approaches, the libraries are assumed to correctly implement some models. The aim is to derive formal proofs that, under this assumption, give assurance about the application code that implements the protocol logic. The two main approaches of model extraction and code generation are presented, along with the main techniques adopted for each approach.
Combining Representation Learning with Logic for Language Processing
The current state-of-the-art in many natural language processing and
automated knowledge base completion tasks is held by representation learning
methods which learn distributed vector representations of symbols via
gradient-based optimization. They require little or no hand-crafted features,
thus avoiding the need for most preprocessing steps and task-specific
assumptions. However, in many cases representation learning requires a large
amount of annotated training data to generalize well to unseen data. Such
labeled training data is provided by human annotators who often use formal
logic as the language for specifying annotations. This thesis investigates
different combinations of representation learning methods with logic for
reducing the need for annotated training data, and for improving
generalization.
Comment: PhD Thesis, University College London, Submitted and accepted in 201
Logic-based Technologies for Intelligent Systems: State of the Art and Perspectives
Together with the disruptive development of modern sub-symbolic approaches to artificial intelligence (AI), symbolic approaches to classical AI are regaining momentum, as more and more researchers exploit their potential to make AI more comprehensible, explainable, and therefore trustworthy. Since logic-based approaches lie at the core of symbolic AI, summarizing their state of the art is of paramount importance now more than ever, in order to identify trends, benefits, key features, gaps, and limitations of the techniques proposed so far, as well as to identify promising research perspectives. Along this line, this paper provides an overview of logic-based approaches and technologies by sketching their evolution and pointing out their main application areas. Future perspectives for the exploitation of logic-based technologies are discussed as well, in order to identify those research fields that deserve more attention, considering the areas that already exploit logic-based approaches as well as those that are more likely to adopt logic-based approaches in the future.
Automatic construction and updating of knowledge base from log data
Large software systems can be very complex, and they are becoming even more complex under the recently popular Microservice Architecture, due to increasing interactions among more components in bigger systems. Plain model-based and data-driven diagnosis approaches can be used for fault detection, but they are usually opaque and demand massive computing power. On the other hand, knowledge-based methods have been shown to be not only effective but also explainable and human-friendly for various tasks such as Fault Analysis, yet they depend on having a knowledge base. The construction and maintenance of knowledge bases is a non-trivial problem, often referred to as the knowledge bottleneck.
Software system logs are the primary and most readily available data, sometimes the only available data, that record system runtime information, which is critical for software system Operation and Maintenance (O&M). I proposed the TREAT framework, which automates the construction and updating of a knowledge base from a continual stream of logs. The framework aims to reflect, as faithfully as possible, the latest state of the assisted software system and to facilitate downstream tasks, typically fault localisation. To the best of our knowledge, this is the first effort to construct a fully automated, ever-updating knowledge base from logs that aims to reflect the internal changing states of a software system. To evaluate the TREAT framework, I devised a knowledge-based solution involving logic programming and inductive logic programming that uses a TREAT-powered knowledge base for fault localisation, and I conducted empirical experiments with this solution on a real-life 5G network test bed system.
Since evaluating the TREAT framework by fault localisation is indirect and involves many confounding factors, e.g., the specific solution to fault localisation, I explored and came up with a novel method called LP-Measure that can directly assess the quality of a given knowledge base, in particular the robustness and redundancy of a knowledge graph.
Besides, it was observed that although the extracted knowledge is generally of high quality, errors also occur in the knowledge extraction process. I surveyed ways to quantify the uncertainty of the knowledge extraction process and to assign each piece of knowledge a probability of correct extraction. This led to a deep investigation into probability calibration and knowledge graph embeddings, specifically testing and confirming the phenomenon of uncalibrated probabilities in knowledge graph embeddings and examining how to choose specific calibration models from the existing toolbox.
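One of the simplest calibration models in that existing toolbox is Platt scaling: fit a logistic regression on held-out (score, label) pairs, then map raw embedding scores to calibrated probabilities. The sketch below is a generic illustration of the technique, not the thesis's implementation, and the toy scores and labels are invented.

```python
import math

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p(correct | s) = sigmoid(a*s + b) by gradient descent on log-loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        ga = gb = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            ga += (p - y) * s / n
            gb += (p - y) / n
        a -= lr * ga
        b -= lr * gb
    return a, b

def calibrate(s, a, b):
    """Map a raw score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * s + b)))

# Toy held-out data: raw link-prediction scores with correctness labels.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0, 0, 0, 1, 1, 1]
a, b = fit_platt(scores, labels)
print(calibrate(2.0, a, b), calibrate(-2.0, a, b))
```

After fitting, high raw scores map to probabilities near 1 and low raw scores to probabilities near 0, which is exactly the property an uncalibrated embedding model lacks.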
Explainable methods for knowledge graph refinement and exploration via symbolic reasoning
Knowledge Graphs (KGs) have applications in many domains such as Finance, Manufacturing, and Healthcare. While recent efforts have created large KGs, their content is far from complete and sometimes includes invalid statements. Therefore, it is crucial to refine the constructed KGs to enhance their coverage and accuracy via KG completion and KG validation. It is also vital to provide human-comprehensible explanations for such refinements, so that humans have trust in the KG quality. Enabling KG exploration, by search and browsing, is also essential for users to understand the KG value and limitations towards downstream applications. However, the large size of KGs makes KG exploration very challenging. While the type taxonomy of KGs is a useful asset along these lines, it remains insufficient for deep exploration. In this dissertation we tackle the aforementioned challenges of KG refinement and KG exploration by combining logical reasoning over the KG with other techniques such as KG embedding models and text mining. Through such combination, we introduce methods that provide human-understandable output. Concretely, we introduce methods to tackle KG incompleteness by learning exception-aware rules over the existing KG. Learned rules are then used in inferring missing links in the KG accurately. Furthermore, we propose a framework for constructing human-comprehensible explanations for candidate facts from both KG and text. Extracted explanations are used to ensure the validity of KG facts. Finally, to facilitate KG exploration, we introduce a method that combines KG embeddings with rule mining to compute informative entity clusters with explanations.
The dissertation makes the following concrete contributions:
• For KG completion, we present ExRuL, a method that revises Horn rules by adding exception conditions to the rule bodies. The extended rules can infer new facts and thus close gaps in the KG. Experiments with large KGs show that this method considerably reduces errors in the inferred facts and yields user-friendly explanations.
• With RuLES, we present a rule-learning method based on probabilistic representations of missing facts. The approach iteratively extends rules induced from a KG by combining neural KG embeddings with information from text corpora, and it uses new rule-quality metrics during rule generation. Experiments show that RuLES substantially improves the quality of the learned rules and their predictions.
• To support KG validation, we present ExFaKT, a framework for constructing explanations for candidate facts. Using rules, the method transforms a candidate fact into a set of statements that are easier to find and to validate or refute. The output of ExFaKT is a set of semantic evidences for the candidate fact, extracted from text corpora and the KG. Experiments show that these transformations considerably improve the yield and quality of the discovered explanations, which support both manual KG validation by curators and automatic validation.
• To support KG exploration, we present ExCut, a method that generates informative entity clusters with explanations, using KG embeddings and automatically induced rules. A cluster explanation is a combination of relations between the entities that identifies the cluster. ExCut simultaneously improves cluster quality and cluster explainability by iteratively interleaving the learning of embeddings and rules. Experiments show that ExCut computes high-quality clusters and that the cluster explanations are informative for users.
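The flavour of an exception-aware Horn rule can be shown with a tiny sketch over a triple set: infer livesIn(X, Y) from bornIn(X, Y) unless an exception condition holds for X. The rule, predicates, and triples below are hypothetical illustrations, not the ExRuL implementation or data.

```python
# A toy KG as a set of (subject, predicate, object) triples.
kg = {
    ('alice', 'bornIn', 'paris'),
    ('bob', 'bornIn', 'rome'),
    ('bob', 'emigrated', 'true'),
}

def apply_rule(kg):
    """Rule: livesIn(X, Y) <- bornIn(X, Y), not emigrated(X).
    The negated atom is the learned exception that blocks bad inferences."""
    inferred = set()
    for s, p, o in kg:
        if p == 'bornIn' and (s, 'emigrated', 'true') not in kg:
            inferred.add((s, 'livesIn', o))
    return inferred

print(apply_rule(kg))  # {('alice', 'livesIn', 'paris')}
```

Without the exception atom, the rule would also wrongly infer livesIn(bob, rome); adding the exception is what "considerably reduces errors in the inferred facts".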
NLProlog: Reasoning with Weak Unification for Question Answering in Natural Language
Rule-based models are attractive for various tasks because they inherently
lead to interpretable and explainable decisions and can easily incorporate
prior knowledge. However, such systems are difficult to apply to problems
involving natural language, due to its linguistic variability. In contrast,
neural models can cope very well with ambiguity by learning distributed
representations of words and their composition from data, but lead to models
that are difficult to interpret. In this paper, we describe a model combining
neural networks with logic programming in a novel manner for solving multi-hop
reasoning tasks over natural language. Specifically, we propose to use a Prolog
prover which we extend to utilize a similarity function over pretrained
sentence encoders. We fine-tune the representations for the similarity function
via backpropagation. This leads to a system that can apply rule-based reasoning
to natural language, and induce domain-specific rules from training data. We
evaluate the proposed system on two different question answering tasks, showing
that it outperforms two baselines -- BIDAF (Seo et al., 2016a) and FAST QA
(Weissenborn et al., 2017b) -- on a subset of the WikiHop corpus and achieves
competitive results on the MedHop data set (Welbl et al., 2017).
Comment: ACL 201
Acquiring Word-Meaning Mappings for Natural Language Interfaces
This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted
Examples), that acquires a semantic lexicon from a corpus of sentences paired
with semantic representations. The lexicon learned consists of phrases paired
with meaning representations. WOLFIE is part of an integrated system that
learns to transform sentences into representations such as logical database
queries. Experimental results are presented demonstrating WOLFIE's ability to
learn useful lexicons for a database interface in four different natural
languages. The usefulness of the lexicons learned by WOLFIE is compared to
those acquired by a similar system, with results favorable to WOLFIE. A second
set of experiments demonstrates WOLFIE's ability to scale to larger and more
difficult, albeit artificially generated, corpora. In natural language
acquisition, it is difficult to gather the annotated data needed for supervised
learning; however, unannotated data is fairly plentiful. Active learning
methods attempt to select for annotation and training only the most informative
examples, and therefore are potentially very useful in natural language
applications. However, most results to date for active learning have only
considered standard classification tasks. To reduce annotation effort while
maintaining accuracy, we apply active learning to semantic lexicons. We show
that active learning can significantly reduce the number of annotated examples
required to achieve a given level of performance.
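A standard instance of the active-learning strategy the abstract alludes to is uncertainty sampling: from an unlabeled pool, request annotations for the examples the current model is least confident about. The sketch below is a generic illustration; the toy confidence model is invented, not WOLFIE's.

```python
def select_for_annotation(pool, predict_proba, budget=2):
    """Pick the `budget` examples whose top predicted probability is lowest,
    i.e. the ones the current model is least sure about."""
    scored = [(max(predict_proba(x)), x) for x in pool]
    scored.sort(key=lambda t: t[0])  # least confident first
    return [x for _, x in scored[:budget]]

# Toy model: confident about short sentences, unsure about long ones.
def toy_proba(sentence):
    conf = max(0.5, 1.0 - 0.1 * len(sentence.split()))
    return [conf, 1.0 - conf]

pool = ['a b', 'a b c d e', 'a', 'a b c d e f g']
print(select_for_annotation(pool, toy_proba))  # picks the two longest sentences
```

Annotating only these selected examples, rather than the whole pool, is what reduces the number of labels required to reach a given level of performance.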