
    FastLAS: scalable inductive logic programming incorporating domain-specific optimisation criteria

    Inductive Logic Programming (ILP) systems aim to find a set of logical rules, called a hypothesis, that explain a set of examples. In cases where many such hypotheses exist, ILP systems often bias towards shorter solutions, leading to highly general rules being learned. In some application domains, like security and access control policies, this bias may not be desirable: when data is sparse, more specific rules that guarantee tighter security should be preferred. This paper presents a new general notion of a scoring function over hypotheses that allows a user to express domain-specific optimisation criteria. This is incorporated into a new ILP system, called FastLAS, that takes as input a learning task and a customised scoring function, and computes an optimal solution with respect to the given scoring function. We evaluate the accuracy of FastLAS over real-world datasets for access control policies and show that varying the scoring function allows a user to target domain-specific performance metrics. We also compare FastLAS to state-of-the-art ILP systems, using the standard ILP bias for shorter solutions, and demonstrate that FastLAS is significantly faster and more scalable.
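
    The core idea, that the usual shortest-hypothesis bias is just one choice of scoring function, can be illustrated with a minimal Python sketch. The Rule type, the two scoring functions, and the access-control rules below are hypothetical illustrations, not FastLAS's actual API or rule syntax.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    head: str
    body: tuple  # body literals as strings (hypothetical representation)

def length_score(hypothesis):
    """Standard ILP bias: prefer shorter hypotheses (fewer literals)."""
    return sum(len(rule.body) + 1 for rule in hypothesis)

def specificity_score(hypothesis):
    """Security-style bias: penalise rules with few body conditions,
    since those generalise aggressively; lower score = more specific."""
    return sum(1.0 / (1 + len(rule.body)) for rule in hypothesis)

def optimal_hypothesis(candidates, scoring_fn):
    """An optimal solution is any candidate minimising the given score."""
    return min(candidates, key=scoring_fn)

# Two hypothetical access-control hypotheses:
h1 = [Rule("allow(X)", ("employee(X)",))]
h2 = [Rule("allow(X)", ("employee(X)", "daytime", "on_site(X)"))]
print(optimal_hypothesis([h1, h2], length_score))       # picks shorter h1
print(optimal_hypothesis([h1, h2], specificity_score))  # picks tighter h2
```

    Swapping the scoring function flips which hypothesis is optimal, which is exactly the kind of domain-specific control the abstract describes.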

    Learning logic rules from text using statistical methods for natural language processing

    The field of Natural Language Processing (NLP) examines how computers can be made to do beneficial tasks by understanding natural language. The foundations of NLP are diverse and include scientific fields such as electrical and electronic engineering, linguistics, and artificial intelligence. Some popular NLP applications are information extraction, machine translation, text summarization, and question answering. This dissertation proposes a new methodology using Answer Set Programming (ASP) as our main formalism to predict Interpretable Semantic Textual Similarity (iSTS) with a rule-based approach, focusing on hard-coded rules for our system, Inspire. We next propose an intelligent rule learning methodology using Inductive Logic Programming (ILP) and modify the ILP tool eXtended Hybrid Abductive Inductive Learning (XHAIL) in order to test whether we are able to learn the ASP-based rules that were hard-coded earlier for the chunking subtask of the Inspire system. Chunking is the identification of short phrases, such as noun phrases, which mainly relies on Part-of-Speech (POS) tags. We evaluate our results using real datasets obtained from the SemEval2016 Task-2 iSTS competition, a real application that can be evaluated objectively using the test sets provided by experts.
    The Inspire system participated in the SemEval2016 Task-2 iSTS competition in the subtasks of predicting chunk similarity alignments for gold chunks and system-generated chunks on three different datasets. The Inspire system extended the basic ideas from the SemEval2015 iSTS Task participant NeRoSim by realising the rules in logic programming and obtaining the result with an Answer Set solver. To prepare the input for the logic program, the PunktTokenizer, Word2Vec, and WordNet APIs of NLTK, and the Part-of-Speech (POS) and Named-Entity-Recognition (NER) taggers from Stanford CoreNLP were used. For the chunking subtask, a joint POS-tagger and dependency parser were used, based on which an Answer Set program determined chunks. The Inspire system ranked third place overall and first place on one of the competition datasets in the gold chunk subtask.
    For the above-mentioned system, we decided to automate the sentence chunking process by learning the ASP rules using a statistical logical method that combines rule-based and statistical artificial intelligence methods, namely ILP. ILP has been applied to a variety of NLP problems, including parsing, information extraction, and question answering. XHAIL, the ILP tool we used, aims at generating a hypothesis, which is a logic program, from given background knowledge and examples of structured knowledge based on information provided by the POS tags. One of the main challenges was to extend the XHAIL algorithm for ILP, which is based on ASP. With respect to processing natural language, ILP can cater for the constant change in how language is used on a daily basis. At the same time, ILP does not require huge amounts of training examples, as other statistical methods do, and produces interpretable results, that is, a set of rules which can be analysed and tweaked if necessary. As contributions, XHAIL was extended with (i) a pruning mechanism within the hypothesis generalisation algorithm which enables learning from larger datasets, (ii) better usage of modern solver technology using recently developed optimisation methods, and (iii) a time budget that permits the usage of suboptimal results.
    These improvements were evaluated on the subtask of sentence chunking using the same three datasets obtained from the SemEval2016 Task-2 competition. Results show that these improvements allow learning on bigger datasets, with results of similar quality to state-of-the-art systems on the same task. Moreover, the hypotheses obtained from individual datasets were compared to each other to gain insights into the structure of each dataset. Using ILP to extend our Inspire system not only automates the process of chunking the sentences but also provides us with interpretable models that are useful for gaining a deeper understanding of the data being used and how it can be manipulated, a feature that is absent in popular Machine Learning methods.
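
    To make the chunking subtask concrete, here is a minimal Python sketch of the kind of POS-tag-driven rule the Inspire system encoded in ASP and XHAIL later learned. The tag set and grouping logic are simplified illustrations, not the actual learned program.

```python
def chunk_noun_phrases(tagged_tokens):
    """Group maximal runs of determiner/adjective/noun tags into chunks,
    mirroring a simple hand-written chunking rule over POS tags."""
    np_tags = {"DT", "JJ", "NN", "NNS", "NNP"}  # simplified tag set
    chunks, current = [], []
    for word, tag in tagged_tokens:
        if tag in np_tags:
            current.append(word)
        elif current:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

print(chunk_noun_phrases([("the", "DT"), ("old", "JJ"), ("house", "NN"),
                          ("collapsed", "VBD")]))  # -> ['the old house']
```

    The appeal of learning such rules with ILP, rather than hand-coding them, is that the learned program stays exactly this inspectable.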

    Multi-objective optimisation of safety-critical hierarchical systems

    Achieving high reliability, particularly in safety-critical systems, is an important and often mandatory requirement. At the same time, costs should be kept as low as possible. Finding an optimum balance between maximising a system's reliability and minimising its cost is a hard combinatorial problem. As the size and complexity of a system increases, so does the scale of the problem faced by the designers. To address these difficulties, meta-heuristics such as Genetic Algorithms and Tabu Search algorithms have been applied in the past for automatically determining the optimal allocation of redundancies in a system as a mechanism for optimising the reliability and cost characteristics of that system. In all cases, simple reliability block diagrams with restrictive assumptions, such as failure independence and limited 2-state failure modes, were used for evaluating the reliability of the candidate designs produced by the various algorithms. This thesis argues that a departure from this restrictive evaluation model is possible by using a new model-based reliability evaluation technique called Hierarchically Performed Hazard Origin and Propagation Studies (HiP-HOPS). HiP-HOPS can overcome the limitations imposed by reliability block diagrams by providing automatic analysis of complex engineering models with multiple failure modes. The thesis demonstrates that, used as the fitness-evaluating component of a multi-objective Genetic Algorithm, HiP-HOPS can solve the problem of redundancy allocation effectively and with relative efficiency. Furthermore, the ability of HiP-HOPS to model and automatically analyse complex engineering models with multiple failure modes allows the Genetic Algorithm to optimise systems using more flexible strategies, not just series-parallel. The results of this thesis show the feasibility of the approach and point to a number of directions for future work to consider.
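
    The selection step of such a multi-objective Genetic Algorithm can be sketched briefly: a candidate design survives if no other design dominates it on both objectives. In the thesis the reliability objective is evaluated by HiP-HOPS; the sketch below uses plain (reliability, cost) tuples as hypothetical stand-ins for that analysis.

```python
def dominates(a, b):
    """a, b are (reliability, cost) tuples; higher reliability and lower
    cost are better. a dominates b if it is no worse on both objectives
    and strictly better on at least one."""
    return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])

def pareto_front(designs):
    """Keep only the non-dominated candidate designs."""
    return [d for d in designs
            if not any(dominates(other, d) for other in designs if other != d)]

# Hypothetical candidate designs as (reliability, cost):
print(pareto_front([(0.99, 120.0), (0.95, 80.0), (0.95, 100.0)]))
# -> [(0.99, 120.0), (0.95, 80.0)]; the (0.95, 100.0) design is dominated
```

    The Genetic Algorithm then breeds new redundancy allocations from this front, with HiP-HOPS supplying the reliability value for each candidate.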

    Search space expansion for efficient incremental inductive logic programming from streamed data

    In the past decade, several systems for learning Answer Set Programs (ASP) have been proposed, including the recent FastLAS system. Compared to other state-of-the-art approaches to learning ASP, FastLAS is more scalable: rather than computing the hypothesis space in full, it computes a much smaller subset relative to a given set of examples that is nonetheless guaranteed to contain an optimal solution to the task (called an OPT-sufficient subset). On the other hand, like many other Inductive Logic Programming (ILP) systems, FastLAS is designed to be run on a fixed learning task, meaning that if new examples are discovered after learning, the whole process must be run again. In many real applications, data arrives in a stream, and rerunning an ILP system from scratch each time new examples arrive is inefficient. In this paper, we address this problem by presenting IncrementalLAS, a system that uses a new technique, called hypothesis space expansion, to enable a FastLAS-like OPT-sufficient subset to be expanded each time new examples are discovered. We prove that this preserves FastLAS's guarantee of finding an optimal solution to the full task (including the new examples), while removing the need to repeat previous computations. Through our evaluation, we demonstrate that running IncrementalLAS on tasks updated with sequences of new examples is significantly faster than re-running FastLAS from scratch on each updated task.
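
    The streamed-learning loop this enables can be sketched as follows. The expand_subset and solve functions are hypothetical stand-ins for IncrementalLAS's internals; the sketch only illustrates how subset expansion replaces recomputation from scratch as batches of examples arrive.

```python
def learn_from_stream(example_batches, expand_subset, solve):
    """Incrementally maintain an OPT-sufficient subset of the hypothesis
    space across a stream of example batches (illustrative sketch)."""
    subset, examples = set(), []
    hypothesis = None
    for batch in example_batches:
        examples.extend(batch)
        # Expansion reuses prior work: only the rules needed to stay
        # OPT-sufficient for the new examples are added to the subset.
        subset = expand_subset(subset, batch)
        # By the OPT-sufficiency guarantee, solving over the expanded
        # subset yields an optimal solution for the full example set.
        hypothesis = solve(subset, examples)
    return hypothesis
```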

    Internet Traffic Engineering: An Artificial Intelligence Approach

    Master's dissertation in Computer Science, presented to the Faculdade de Ciências da Universidade do Porto

    On Differentiable Interpreters

    Neural networks have transformed the fields of Machine Learning and Artificial Intelligence with the ability to model complex features and behaviours from raw data. They quickly became instrumental models, achieving numerous state-of-the-art performances across many tasks and domains. Yet the successes of these models often rely on large amounts of data. When data is scarce, resourceful ways of using background knowledge often help. However, though different types of background knowledge can be used to bias the model, it is not clear how algorithmic knowledge can be used to that end. In this thesis, we present differentiable interpreters as an effective framework for utilising algorithmic background knowledge as architectural inductive biases of neural networks. By continuously approximating discrete elements of traditional program interpreters, we create differentiable interpreters that, due to the continuous nature of their execution, are amenable to optimisation with gradient descent methods. This enables us to write code mixed with parametric functions, where the code strongly biases the behaviour of the model while enabling the training of parameters and/or input representations from data. We investigate two such differentiable interpreters and their use cases in this thesis. First, we present a detailed construction of ∂4, a differentiable interpreter for the programming language FORTH. We demonstrate the ability of ∂4 to strongly bias neural models with incomplete programs of variable complexity while learning the missing pieces of the program with parametrised neural networks. Such models can learn to solve tasks and strongly generalise to out-of-distribution data from small datasets. Second, we present greedy Neural Theorem Provers (gNTPs), a significant improvement of the differentiable Datalog interpreter NTP. gNTPs ameliorate the large computational cost of recursive differentiable interpretation, achieving drastic time and memory speedups while introducing soft reasoning over logic knowledge and natural language.
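
    The central trick, relaxing a discrete choice into a differentiable mixture, can be shown in a few lines. The toy soft_step below is an illustrative sketch only; it does not reproduce the ∂4 or gNTP machinery.

```python
import numpy as np

def soft_step(state, logits, ops):
    """Instead of picking one operation discretely, execute all candidate
    ops and blend their results by softmax weights; the output is then
    differentiable in `logits`, so the choice itself can be learned."""
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return sum(w * op(state) for w, op in zip(weights, ops))

ops = [lambda s: s + 1.0, lambda s: s * 2.0]       # candidate instructions
print(soft_step(3.0, np.array([2.0, 0.0]), ops))   # ~4.24, mostly "add one"
```

    Training pushes the logits towards one operation, recovering near-discrete program behaviour while keeping gradients available throughout.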