8 research outputs found
Efficient Data Fusion using the Tsetlin Machine
We propose a novel way of assessing and fusing noisy dynamic data using a Tsetlin Machine (TM). Our approach consists of monitoring how the explanations a TM learns, in the form of logical clauses, change when the dynamic data is potentially noisy. In this way, the TM can recognize noise either by lowering the weights of previously learned clauses or by reflecting it in the form of new clauses. We also perform a comprehensive experimental study on notably different datasets that demonstrates the high performance of the proposed approach.
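The clause-monitoring idea can be illustrated with a minimal, self-contained Python sketch. The clauses, the batches, and the reward/penalty rule below are toy assumptions for illustration, not the paper's actual TM update scheme; the point is only that a clause whose support disappears in a new batch has its weight lowered, which signals possible noise.

def clause_matches(clause, sample):
    # clause: dict mapping feature index -> required value (1 = literal, 0 = negated literal)
    return all(sample[i] == v for i, v in clause.items())

def update_weights(clauses, weights, batch, reward=1, penalty=1):
    # Reward clauses still supported by the new batch; penalize those that are not.
    for k, clause in enumerate(clauses):
        hits = sum(clause_matches(clause, x) for x in batch)
        if hits > len(batch) // 2:
            weights[k] += reward                       # clause still supported by the data
        else:
            weights[k] = max(0, weights[k] - penalty)  # possible noise or drift
    return weights

# Two previously learned clauses over three binary features, with initial weights.
clauses = [{0: 1, 2: 0}, {1: 1}]
weights = [5, 5]

clean_batch = [[1, 1, 0], [1, 1, 0]]   # consistent with both clauses
noisy_batch = [[0, 1, 1], [0, 1, 1]]   # contradicts the first clause

weights = update_weights(clauses, weights, clean_batch)
weights = update_weights(clauses, weights, noisy_batch)
print(weights)  # [5, 7]: the first clause's weight drops, flagging noise in the new data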
Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling
Using logical clauses to represent patterns, Tsetlin Machines (TMs) have
recently obtained competitive performance in terms of accuracy, memory
footprint, energy, and learning speed on several benchmarks. Each TM clause
votes for or against a particular class, with classification resolved using a
majority vote. While the evaluation of clauses is fast, being based on binary
operators, the voting makes it necessary to synchronize the clause evaluation,
impeding parallelization. In this paper, we propose a novel scheme for
desynchronizing the evaluation of clauses, eliminating the voting bottleneck.
In brief, every clause runs in its own thread for massive native parallelism.
For each training example, we keep track of the class votes obtained from the
clauses in local voting tallies. The local voting tallies allow us to detach
the processing of each clause from the rest of the clauses, supporting
decentralized learning. This means that the TM most of the time will operate on
outdated voting tallies. We evaluated the proposed parallelization across
diverse learning tasks, finding that our decentralized TM learning algorithm
copes well with working on outdated data, resulting in no significant
loss in learning accuracy. Furthermore, we show that the proposed approach
provides up to 50 times faster learning. Finally, learning time is almost
constant for reasonable clause amounts (employing from 20 to 7,000 clauses on a
Tesla V100 GPU). For sufficiently large clause numbers, computation time
increases approximately proportionally. Our parallel and asynchronous
architecture thus allows processing of massive datasets and operating with more
clauses for higher accuracy.
Comment: Accepted to ICML 2021
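The role of the local voting tallies can be sketched with plain Python threads; this is a toy illustration under assumed clause representations, not the paper's GPU implementation. Each clause thread adds its signed vote to a shared per-class tally without coordinating with the other clauses, and the classifier simply reads the tallies.

import threading

def evaluate_clause(clause, sample):
    # A clause is a set of (feature_index, required_value) literals.
    return all(sample[i] == v for i, v in clause)

def clause_worker(clause, polarity, target_class, sample, tallies):
    if evaluate_clause(clause, sample):
        tallies[target_class] += polarity   # +1 votes for the class, -1 against

# Three clauses voting on two classes for a single binary sample.
sample = [1, 0, 1]
clauses = [
    (frozenset({(0, 1), (2, 1)}), +1, 0),   # votes for class 0
    (frozenset({(1, 1)}),         +1, 1),   # does not fire on this sample
    (frozenset({(0, 1)}),         -1, 1),   # votes against class 1
]

tallies = [0, 0]
threads = [threading.Thread(target=clause_worker, args=(c, p, cls, sample, tallies))
           for c, p, cls in clauses]
for t in threads:
    t.start()
# For clarity we join before reading; the paper's observation is that learning
# still works even when the tallies being read are slightly outdated.
for t in threads:
    t.join()

print("class votes:", tallies)                                      # [1, -1]
print("predicted class:", max(range(2), key=lambda c: tallies[c]))  # 0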
Using Tsetlin Machine to discover interpretable rules in natural language processing applications
Tsetlin Machines (TM) use finite state machines for learning and propositional logic to represent patterns. The resulting pattern recognition approach captures information in the form of conjunctive clauses, thus facilitating human interpretation. In this work, we propose a TM-based approach to three common natural language processing (NLP) tasks, namely sentiment analysis, semantic relation categorization, and entity identification in multi-turn dialogues. By performing frequent itemset mining on the TM-produced patterns, we show that we can obtain both a global and a local interpretation of the learning, one that mimics existing rule sets or lexicons. Further, we establish, via comparison with several widely used machine learning techniques, that our TM-based approach does not compromise accuracy in the quest for interpretability. Finally, we introduce the idea of a relational TM, which uses a logic-based framework to further extend interpretability.
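The frequent-itemset step described above can be sketched as follows; the clauses here are hypothetical sentiment-analysis literals invented for illustration, whereas in the paper they would be extracted from a trained TM. Each clause is treated as a transaction of literals, and literal combinations that recur across many clauses give a global summary of what the model has learned.

from collections import Counter
from itertools import combinations

# Hypothetical conjunctive clauses from a sentiment TM; literals are words or negated words.
clauses = [
    {"excellent", "plot", "NOT boring"},
    {"excellent", "acting"},
    {"excellent", "plot"},
    {"terrible", "NOT excellent"},
]

def frequent_itemsets(transactions, min_support=2, max_size=2):
    # Count every itemset of up to max_size literals and keep the frequent ones.
    counts = Counter()
    for t in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(t), size):
                counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

for itemset, support in sorted(frequent_itemsets(clauses).items(), key=lambda kv: -kv[1]):
    print(support, itemset)   # e.g. 3 ('excellent',), 2 ('excellent', 'plot'), ...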