Self-Organizing Maps with Variable Input Length for Motif Discovery and Word Segmentation

Abstract

Time Series Motif Discovery (TSMD) is defined as searching for patterns that are previously unknown and appear with a given frequency in time series. Another problem strongly related with TSMD is Word Segmentation. This problem has received much attention from the community that studies early language acquisition in babies and toddlers. The development of biologically plausible models for word segmentation could greatly advance this field. Therefore, in this article, we propose the Variable Input Length Map (VILMAP) for Motif Discovery and Word Segmentation. The model is based on the Self-Organizing Maps and can identify Motifs with different lengths in time series. In our experiments, we show that VILMAP presents good results in finding Motifs in a standard Motif discovery dataset and can avoid catastrophic forgetting when trained with datasets with increasing values of input size. We also show that VILMAP achieves results similar or superior to other methods in the literature developed for the task of word segmentation

    Similar works

    Full text

    thumbnail-image

    Available Versions

    Last time updated on 10/08/2021