Polynomial Learnability of Semilinear Sets

Abstract

We characterize learnability and non-learnability of subsets of N^m called 'semilinear sets', with respect to the distribution-free learning model of Valiant. In formal language terms, semilinear sets are exactly the class of 'letter-counts' (or Parikh-images) of regular sets. We show that the class of semilinear sets of dimensions 1 and 2 is learnable, when the integers are encoded in unary. We complement this result with negative results of several different sorts, relying on hardness assumptions of varying degrees - from P ≠ NP and RP ≠ NP to the hardness of learning DNF. We show that the minimal consistent concept problem is NP-complete for this class, verifying the non-triviality of our learnability result. We also show that with respect to the binary encoding of integers, the corresponding 'prediction' problem is already as hard as that of DNF, for a class of subsets of N^m much simpler than semilinear sets. The present work represents an interesting class of countably infinite concepts for which the questions of learnability have been nearly completely characterized. In doing so, we demonstrate how various proof techniques developed by Pitt and Valiant [14], Blumer et al. [3], and Pitt and Warmuth [16] can be fruitfully applied in the context of formal languages.
