Tsetlin machines (TMs) have been successful in several application domains,
operating with high efficiency on Boolean representations of the input data.
However, Booleanizing complex data structures such as sequences, graphs,
images, signal spectra, chemical compounds, and natural language is not
trivial. In this paper, we propose a hypervector (HV)-based method for
expressing arbitrarily large sets of concepts associated with any input data.
Using a hyperdimensional space to build vectors drastically expands the
capacity and flexibility of the TM. We demonstrate how images, chemical
compounds, and natural language text are encoded according to the proposed
method, and how the resulting HV-powered TM can achieve significantly higher
accuracy and faster learning on well-known benchmarks. Our results open up a
new research direction for TMs, namely how to expand and exploit the benefits
of operating in hyperspace, including new Booleanization strategies,
optimization of TM inference and learning, as well as new TM applications.
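
To make the encoding idea concrete, below is a minimal sketch of one standard hyperdimensional-computing recipe: each concept is assigned a fixed random binary hypervector, and a set of concepts is bundled into a single Boolean vector by majority vote. The dimensionality, symbol names, and bundling rule are illustrative assumptions, not the paper's exact construction; the point is that the result is a 0/1 vector of the kind a TM can consume directly.

```python
import numpy as np

D = 8192  # hypervector dimensionality (illustrative choice)
rng = np.random.default_rng(0)

# Assumption: each concept gets a fixed random binary hypervector,
# as in standard hyperdimensional computing; the paper's actual
# encoding may differ in detail.
codebook = {}

def concept_hv(symbol):
    """Return (and cache) a random binary HV for a concept."""
    if symbol not in codebook:
        codebook[symbol] = rng.integers(0, 2, size=D, dtype=np.uint8)
    return codebook[symbol]

def bundle(symbols):
    """Bundle a set of concepts into one Boolean HV by majority vote."""
    stack = np.stack([concept_hv(s) for s in symbols])
    return (stack.sum(axis=0) * 2 >= len(symbols)).astype(np.uint8)

# Example: Booleanize a set of concepts; the resulting 0/1 vector
# can serve as input literals for a Tsetlin machine.
x = bundle(["benzene", "ring", "aromatic"])
print(x.shape, x[:16])
```

In this recipe the bundled vector stays Boolean throughout, so it can feed the TM's clause literals without any further quantization step, which is what lets the hyperdimensional representation slot into the existing TM inference and learning machinery.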