19 research outputs found

    Towards a Relational Tsetlin Machine in Natural Language Processing

    Get PDF
    As Artificial Intelligence (AI) becomes an integral part of everyday life, both at the personal and at the societal level, there is a concerted effort to have AI models explain their decisions. Explainable Artificial Intelligence (XAI) aims to increase user trust in AI systems, as well as to prevent misuse and the perpetuation of bias. One of the ways that humans interact with AI is through natural language text. Since language understanding at the human level requires logical structures, integrating logic programming into natural language processing can be advantageous at the computational level. Due to the multifaceted nature of language, AI language systems have to consider multiple aspects of each single piece of text. Introducing compositionality via relational modelling can capture such complex information as an aggregation of simpler parts. Like other branches of AI, Natural Language Processing (NLP) also benefits from XAI, where practitioners and end users can confirm that important information or context is not being lost in translation.

    Tsetlin Machines (TMs) use learning automata to provide interpretable decisions for classification problems. A TM learns clauses, or sub-patterns, constructed from the features available to it. In a simple classification task, multiple such clauses vote to indicate which class a sample belongs to. The TM's pattern-recognition approach has proved successful in a variety of image classification and NLP tasks. However, there has been no targeted research into using TMs as a tool for XAI in language analysis. In this thesis, using terminology from XAI, we establish that the clauses learnt by a TM, taken collectively, encompass the global description of the task to be solved, while the subset of clauses that decide on a single test sample forms a local description of that sample. We then establish that a TM-based system can produce human-interpretable decisions in dialogue-related semantic tasks, including entity identification and semantic relation identification. By comparing with available expert annotations, we document that the global descriptions match to a large degree on such tasks. The local descriptions allow us to observe how the model shifts its focus between different aspects of the text as required. We also show that the TM can build a logical structure for reasoning based on the relationships present in natural language text.

    While a standard, or vanilla, TM depends on propositional input and creates propositional clauses to encode its decisions, we present a modified version termed the Relational Tsetlin Machine (RTM). The RTM works on relations and their included entities, such that the resultant learning is in terms of the roles played by entity types. In contrast to the vanilla TM, which uses constants, or words from the vocabulary, the RTM uses variables, which allows for the creation of generic Horn Clauses that effectively capture logical interactions in the text.

    The third contribution of this thesis is a framework that utilizes the TM to overcome the challenges of data changes. Most AI applications require large amounts of data in order to perform well, but data can have different characteristics when it comes from different sources or different time instances. This is true for natural language data as well, since language changes both historically and geographically, and even across spoken, written, and online usage. We show that a TM-based system can identify differing characteristics in data, isolate the samples that do not conform with the majority, and provide ways to mitigate the effect such samples have on model performance.
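
    The clause-voting mechanism described above can be pictured with a short sketch. The following is a minimal Python illustration, not code from the thesis: the feature names, clauses, and the toy question-detection task are invented, and a real TM learns its clauses from data rather than taking them as given. Collectively, the clause list plays the role of the global description, while the clauses that fire on one sample form its local description.

    def clause_fires(clause, sample):
        # A clause is a conjunction of literals: (feature, expected_value) pairs.
        return all(sample.get(feat, 0) == val for feat, val in clause["literals"])

    def classify(clauses, sample):
        votes = 0
        local_description = []                   # clauses that decide on this one sample
        for clause in clauses:
            if clause_fires(clause, sample):
                votes += clause["polarity"]      # +1 votes for the class, -1 against
                local_description.append(clause)
        return (1 if votes >= 0 else 0), local_description

    # Hypothetical clauses; taken together they act as the global description of the task.
    clauses = [
        {"literals": [("contains_question_mark", 1)], "polarity": +1},
        {"literals": [("starts_with_wh_word", 1), ("contains_period", 0)], "polarity": +1},
        {"literals": [("contains_exclamation", 1)], "polarity": -1},
    ]

    sample = {"contains_question_mark": 1, "starts_with_wh_word": 1, "contains_period": 0}
    label, fired = classify(clauses, sample)
    print(label, [c["literals"] for c in fired])   # the fired clauses form the local description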

    Efficient Data Fusion using the Tsetlin Machine

    Full text link
    We propose a novel way of assessing and fusing noisy dynamic data using a Tsetlin Machine. Our approach consists of monitoring how the explanations, in the form of the logical clauses that a TM learns, change with possible noise in dynamic data. In this way the TM can recognize noise by lowering the weights of previously learned clauses, or reflect it in the form of new clauses. We also perform a comprehensive experimental study using notably different datasets, which demonstrates the high performance of the proposed approach.
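
    As a rough sketch of the monitoring idea, and using invented numbers (a real TM would learn and update these clause weights during training), one can compare clause weights before and after a new data batch and flag large drops, or newly created clauses, as signs of changed or noisy data.

    # Hypothetical clause weights before and after training on a new data batch.
    weights_before = {"clause_1": 12, "clause_2": 9, "clause_3": 15}
    weights_after = {"clause_1": 11, "clause_2": 2, "clause_3": 14, "clause_4": 6}

    DROP_THRESHOLD = 5
    weakened = [c for c in weights_before
                if weights_before[c] - weights_after.get(c, 0) >= DROP_THRESHOLD]
    new_clauses = [c for c in weights_after if c not in weights_before]

    print("clauses weakened by the new batch:", weakened)        # possible noise signal
    print("new clauses reflecting the change:", new_clauses)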

    A Logic-Based Explainable Framework for Relation Classification of Human Rights Violations

    Get PDF
    Using a Relational Tsetlin Machine (RTM) for the analysis of semi-structured data allows the inherent relational structures present in natural language text to be used for an explainable classification of the data. Horn Clauses, simple yet powerful logical tools that can build an abstract view of the world, are derived from the trained model via a finite Herbrand model. We use this approach to analyze human rights violation data. We show concretely how natural language can be transformed into a relational structure, and further use the Relational Tsetlin Machine not only to classify incidents as serious or non-serious violations but also to explore the patterns learned by the RTM in order to arrive at those decisions. Furthermore, the distilled Horn Clauses show a precise understanding of the concepts involved without the drawback of textual ambiguity.
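
    To make the relational transformation concrete, here is an illustrative sketch, not the paper's code: extracted triples become relational facts, and constants are lifted to typed variables, which is the shape a generic Horn-Clause-style pattern takes in the RTM. The predicate names, entity types, and example triples are hypothetical.

    from itertools import count

    def to_relational_facts(triples):
        # triples: (predicate, subject_entity, object_entity) tuples extracted
        # from text by an upstream annotator.
        return [{"pred": p, "args": (a, b)} for p, a, b in triples]

    def lift_to_variables(facts, entity_types):
        # Replace concrete entities (constants) with typed variables, producing a
        # generic, Horn-Clause-like body such as detained(X1:authority, X2:person).
        var_names, fresh, lifted = {}, count(1), []
        for fact in facts:
            args = []
            for arg in fact["args"]:
                if arg not in var_names:
                    var_names[arg] = f"X{next(fresh)}:{entity_types.get(arg, 'entity')}"
                args.append(var_names[arg])
            lifted.append(f"{fact['pred']}({', '.join(args)})")
        return " AND ".join(lifted)

    facts = to_relational_facts([("detained", "police", "journalist"),
                                 ("reported", "journalist", "protest")])
    types = {"police": "authority", "journalist": "person", "protest": "event"}
    print(lift_to_variables(facts, types))
    # detained(X1:authority, X2:person) AND reported(X2:person, X3:event)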

    Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines

    Get PDF
    Tsetlin machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is non-trivial. In this paper, we propose a hypervector (HV) based method for expressing arbitrarily large sets of concepts associated with any input data. Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM. We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting HV-powered TM can achieve significantly higher accuracy and faster learning on well-known benchmarks. Our results thus open up a new research direction for TMs, namely how to expand and exploit the benefits of operating in hyperspace, including new booleanization strategies, optimization of TM inference and learning, as well as new TM applications.
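
    The following sketch shows the general hypervector recipe under simplifying assumptions (random bipolar vectors, bundling by summation, thresholding back to Booleans); the dimensionality and symbols are arbitrary, and the paper's actual encoding scheme may differ in detail.

    import numpy as np

    rng = np.random.default_rng(0)
    DIM = 10_000                                  # size of the hyperdimensional space

    def random_hv():
        return rng.choice([-1, 1], size=DIM)      # random bipolar hypervector

    symbol_hvs = {}                               # one HV per symbol (word, pixel code, atom, ...)

    def encode(symbols):
        # Bundle (sum) the hypervectors of all symbols present in the sample,
        # then threshold back to Booleans so a TM can consume the result.
        for s in symbols:
            symbol_hvs.setdefault(s, random_hv())
        bundled = np.sum([symbol_hvs[s] for s in symbols], axis=0)
        return (bundled > 0).astype(np.uint8)

    boolean_input = encode(["price", "fell", "sharply"])
    print(boolean_input.shape, boolean_input[:10])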

    Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling

    Get PDF
    Using logical clauses to represent patterns, Tsetlin Machines (TMs) have recently obtained competitive performance in terms of accuracy, memory footprint, energy, and learning speed on several benchmarks. Each TM clause votes for or against a particular class, with classification resolved using a majority vote. While the evaluation of clauses is fast, being based on binary operators, the voting makes it necessary to synchronize the clause evaluation, impeding parallelization. In this paper, we propose a novel scheme for desynchronizing the evaluation of clauses, eliminating the voting bottleneck. In brief, every clause runs in its own thread for massive native parallelism. For each training example, we keep track of the class votes obtained from the clauses in local voting tallies. The local voting tallies allow us to detach the processing of each clause from the rest of the clauses, supporting decentralized learning. This means that the TM will, most of the time, operate on outdated voting tallies. We evaluate the proposed parallelization across diverse learning tasks, and it turns out that our decentralized TM learning algorithm copes well with working on outdated data, resulting in no significant loss in learning accuracy. Furthermore, we show that the proposed approach provides up to 50 times faster learning. Finally, learning time is almost constant for reasonable clause amounts (employing from 20 to 7,000 clauses on a Tesla V100 GPU); for sufficiently large clause counts, computation time increases approximately proportionally. Our parallel and asynchronous architecture thus allows processing of massive datasets and operating with more clauses for higher accuracy. (Comment: Accepted to ICML 2021.)
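
    A toy sketch of the desynchronisation idea follows; it is an illustrative reconstruction, not the paper's GPU implementation. Each clause is evaluated in its own thread and adds its vote to a shared tally without waiting for the other clauses, so any individual thread may observe a slightly outdated tally. The clauses and features are invented.

    import threading

    class VoteTally:
        def __init__(self):
            self.value = 0
            self._lock = threading.Lock()
        def add(self, v):
            with self._lock:
                self.value += v
        def snapshot(self):
            return self.value          # may already be outdated by in-flight clause threads

    def clause_worker(clause, sample, tally):
        fires = all(sample[feat] == val for feat, val in clause["literals"])
        if fires:
            tally.add(clause["polarity"])
        # In the real architecture each clause would also update its own state
        # based on the (possibly stale) tally it observes here.
        _ = tally.snapshot()

    clauses = [{"literals": [("x0", 1)], "polarity": +1},
               {"literals": [("x1", 0)], "polarity": -1},
               {"literals": [("x0", 1), ("x1", 1)], "polarity": +1}]
    sample = {"x0": 1, "x1": 1}

    tally = VoteTally()
    threads = [threading.Thread(target=clause_worker, args=(c, sample, tally)) for c in clauses]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("class votes:", tally.value)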

    Building Concise Logical Patterns by Constraining Tsetlin Machine Clause Size

    Get PDF
    The Tsetlin machine (TM) is a logic-based machine learning approach with the crucial advantages of being transparent and hardware-friendly. While TMs match or surpass deep learning accuracy for an increasing number of applications, large clause pools tend to produce clauses with many literals (long clauses), which makes them less interpretable. Further, longer clauses increase the switching activity of the clause logic in hardware, consuming more power. This paper introduces a novel variant of TM learning – Clause Size Constrained TMs (CSC-TMs) – where one can set a soft constraint on the clause size. As soon as a clause includes more literals than the constraint allows, it starts expelling literals. Accordingly, oversized clauses only appear transiently. To evaluate CSC-TM, we conduct classification, clustering, and regression experiments on tabular data, natural language text, images, and board games. Our results show that CSC-TM maintains accuracy with up to 80 times fewer literals. Indeed, the accuracy increases with shorter clauses for TREC, IMDb, and BBC Sports. After the accuracy peaks, it drops gracefully as the clause size approaches a single literal. We finally analyze CSC-TM power consumption and derive new convergence properties.
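
    The literal-expelling behaviour can be pictured with a small sketch using invented data structures: a clause is represented as a mapping from literal to a strength score standing in for its internal state. The paper describes a soft constraint under which oversized clauses appear only transiently; this sketch simply trims the weakest literals once a hard budget is exceeded.

    def enforce_clause_size(clause, max_literals):
        # clause: mapping from literal to a strength score (a stand-in for the
        # literal's internal automaton state). Trim until the budget is met.
        while len(clause) > max_literals:
            weakest = min(clause, key=clause.get)   # expel the least confident literal
            del clause[weakest]
        return clause

    clause = {"word:goal": 9, "word:match": 7, "NOT word:election": 3, "word:score": 5}
    print(enforce_clause_size(clause, max_literals=2))   # keeps the two strongest literals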

    Brain Tumor Segmentation from Multimodal MR Images Using Rough Sets

    Full text link