A classifier for discrete-valued variable classification problems is presented. The system utilizes an information-theoretic algorithm for constructing informative rules from example data. These rules are then used to construct a neural network to perform parallel inference and posterior probability estimation. The network can be "grown" incrementally, so that new data can be incorporated without repeating the training on previous data. It is shown that this technique performs comparably with other techniques such as back-propagation while having unique advantages in incremental learning capability, training efficiency, knowledge representation, and hardware implementation suitability.
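
As an illustration of what scoring a rule for informativeness can look like, the sketch below computes the J-measure, one standard information-theoretic measure of the information content of a rule "if Y = y then X = x". The abstract does not name the measure it uses, so the choice of the J-measure, the function name `j_measure`, and the probabilities in the usage lines are all assumptions made for illustration, not the system's actual implementation.

```python
import math


def j_measure(p_y, p_x, p_x_given_y):
    """Average information (in bits) of the rule 'if Y=y then X=x'.

    Illustrative sketch only. Assumes 0 < p_x < 1 so that both
    log ratios are defined.
      p_y          probability that the rule's condition Y=y holds
      p_x          prior probability of the conclusion X=x
      p_x_given_y  probability of X=x given that Y=y holds
    """
    def term(posterior, prior):
        # Convention: 0 * log(0 / q) is taken to be 0.
        if posterior == 0.0:
            return 0.0
        return posterior * math.log2(posterior / prior)

    # Cross-entropy between posterior and prior over {X=x, X!=x},
    # weighted by how often the rule's condition applies.
    return p_y * (term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x))


# Hypothetical candidate rules: (name, p(y), p(x), p(x|y)).
candidates = [
    ("uninformative", 0.5, 0.5, 0.5),
    ("deterministic", 0.5, 0.5, 1.0),
    ("rare but strong", 0.05, 0.5, 1.0),
]

# Rank rules from most to least informative.
ranked = sorted(candidates, key=lambda r: j_measure(*r[1:]), reverse=True)
```

Under this measure a rule scores highly only if its condition applies often and it shifts the probability of the conclusion well away from the prior, which is why the deterministic rule that fires half the time ranks above the equally strong rule that rarely fires.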