Structure Here, Bias There: Hierarchical Generalization by Jointly Learning Syntactic Transformations

Abstract

When learning syntactic transformations, children consistently induce structure-dependent generalizations, even though the primary linguistic data may be consistent with both linear and hierarchical rules. What is the source of this inductive bias? In this paper, we use computational models to investigate the hypothesis that evidence for the structure-sensitivity of one syntactic transformation can bias the acquisition of another transformation in favor of a hierarchical rule. We train sequence-to-sequence models based on artificial neural networks to learn multiple syntactic transformations simultaneously in a fragment of English; we hold out the cases that disambiguate linear and hierarchical rules for one of those transformations, and then test for hierarchical generalization to these held-out sentence types. Consistent with our hypothesis, we find that multitask learning induces a hierarchical bias for certain combinations of tasks, and that this bias is stronger for transformations that share computational building blocks. At the same time, the bias is in general insufficient to lead the learner to categorically acquire the hierarchical generalization for the target transformation.
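To make the experimental setup concrete, the sketch below (in Python, not the authors' code) shows one way a multitask training set could be assembled: examples that disambiguate linear and hierarchical rules are withheld from the target transformation (here, question formation) but retained for an auxiliary transformation (here, tense reinflection), and the withheld examples become the generalization test set. The task prefixes, the `is_disambiguating` criterion, and the toy sentences are all hypothetical illustrations, not material from the paper.

```python
def is_disambiguating(example):
    """Treat an example as disambiguating if its subject contains a relative
    clause, so that linear and hierarchical rules yield different outputs."""
    return example["subject_has_relative_clause"]

def build_multitask_splits(question_data, tense_data):
    """Withhold disambiguating cases from the target task (question formation)
    while keeping all cases for the auxiliary task (tense reinflection)."""
    train, test = [], []
    for ex in question_data:                               # target transformation
        item = ("QUEST: " + ex["src"], ex["tgt"])
        (test if is_disambiguating(ex) else train).append(item)
    for ex in tense_data:                                  # auxiliary transformation
        train.append(("TENSE: " + ex["src"], ex["tgt"]))
    return train, test

# Toy usage with hypothetical sentences:
question_data = [
    {"src": "my walrus does giggle .",
     "tgt": "does my walrus giggle ?",
     "subject_has_relative_clause": False},
    {"src": "my walrus who does read does giggle .",
     "tgt": "does my walrus who does read giggle ?",       # hierarchical rule's output
     "subject_has_relative_clause": True},
]
tense_data = [
    {"src": "my walrus who did read did giggle .",
     "tgt": "my walrus who does read does giggle .",
     "subject_has_relative_clause": True},
]

train, test = build_multitask_splits(question_data, tense_data)
print(len(train), "training pairs;", len(test), "held-out disambiguating pairs")
```

A sequence-to-sequence model trained on `train` can then be evaluated on `test`: producing the hierarchical output for the held-out sentence types would indicate that the auxiliary task induced a hierarchical bias.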
