
    Extraction for Frequent Sequential Patterns with Minimum Variable-Wildcard Regions

    Abstract A new methodology is proposed for extracting all frequent sequential patterns with minimum variable-length wildcard regions from sequence databases, with the aim of extracting motif candidates from amino acid sequences. The scope database defined by a k-length pattern consists not only of the projected database, which includes the start position of a scan, but also of the range of the scan and the occurrences that serve as evidence for the pattern. The scope database makes it possible to avoid constructing a variable-length wildcard region that is larger than needed to explain the occurrences serving as evidence for each (k+1)-length pattern. The scope database is also used to eliminate redundancy from the set of solutions. A prototype was evaluated on a dataset that includes the Leucine Zipper motif. The method showed a high capability to extract non-redundant sequential patterns with minimum variable-wildcard regions.
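
The idea of a minimum variable-length wildcard region can be illustrated with a small Python sketch. This is not the scope-database algorithm from the paper, whose details are not given in the abstract; it is a naive search under stated assumptions. The function names `find_occurrences` and `minimal_gap_ranges`, the `max_gap` bound, and the toy sequences are all hypothetical. For a fixed pattern of symbols, the sketch collects every occurrence whose gaps do not exceed `max_gap`, then reports, for each gap, the tightest range of wildcard lengths that still explains all of those occurrences.

```python
from typing import List, Tuple


def find_occurrences(seq: str, pattern: str, max_gap: int) -> List[Tuple[int, ...]]:
    """Return position tuples where the symbols of `pattern` occur in order,
    with at most `max_gap` wildcard characters between consecutive symbols."""
    results: List[Tuple[int, ...]] = []

    def extend(positions: List[int]) -> None:
        k = len(positions)
        if k == len(pattern):
            results.append(tuple(positions))
            return
        start = positions[-1] + 1
        stop = min(len(seq), start + max_gap + 1)
        for i in range(start, stop):
            if seq[i] == pattern[k]:
                extend(positions + [i])

    for i, ch in enumerate(seq):
        if ch == pattern[0]:
            extend([i])
    return results


def minimal_gap_ranges(
    sequences: List[str], pattern: str, max_gap: int
) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (support, ranges): the number of sequences containing the pattern,
    and for each gap the smallest (min, max) wildcard length covering all
    occurrences found. Naive illustration only (hypothetical helper)."""
    n_gaps = len(pattern) - 1
    lows = [max_gap] * n_gaps
    highs = [0] * n_gaps
    support = 0
    for seq in sequences:
        occurrences = find_occurrences(seq, pattern, max_gap)
        if occurrences:
            support += 1
        for occ in occurrences:
            for g in range(n_gaps):
                gap = occ[g + 1] - occ[g] - 1
                lows[g] = min(lows[g], gap)
                highs[g] = max(highs[g], gap)
    if support == 0:
        return 0, []
    return support, list(zip(lows, highs))


# Toy sequences (made up for illustration, not data from the paper).
toy_sequences = ["ALKKKKKKLA", "LAAAAALQQ", "AAAA"]
support, ranges = minimal_gap_ranges(toy_sequences, "LL", max_gap=7)
print(support, ranges)
```

In this toy run the search allows up to seven wildcards between the two `L` symbols, but the reported range is only as wide as the observed occurrences require, which is the sense in which the wildcard region is kept minimal.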