2,504 research outputs found

Modelling the acquisition of syntactic categories

Authors: Gobet F, Pine J M
Publication venue: Informa UK Limited
Publication date: 01/01/1997

This research represents an attempt to model the child’s acquisition of syntactic categories. A computational model, based on the EPAM theory of perception and learning, is developed. The basic assumptions are that (1) syntactic categories are actively constructed by the child using distributional learning abilities; and (2) cognitive constraints in learning rate and memory capacity limit these learning abilities. We present simulations of the syntax acquisition of a single subject, where the model learns to build up multi-word utterances by scanning a sample of the speech addressed to the subject by his mother.

CiteSeerX

Brunel University Research Archive

Simulating the referential properties of Dutch, German and English Root Infinitives in MOSAIC

Authors: Freudenthal D, Gobet F, Pine JM
Publication venue: Taylor and Francis
Publication date: 01/01/2008

Children learning many languages go through an Optional Infinitive stage in which they produce non-finite verb forms in contexts in which a finite verb form is required (e.g. ‘That go there’ instead of ‘That goes there’). MOSAIC (Model of Syntax Acquisition in Children) is a computational model of language learning that successfully simulates the developmental patterning of the Optional Infinitive (OI) phenomenon in English, Dutch, German and Spanish (Freudenthal, Pine, Aguado-Orea & Gobet, 2007). In the present study, MOSAIC is applied to the simulation of certain subtle but theoretically important phenomena in the cross-linguistic patterning of the OI phenomenon that are typically assumed to require a more complex formal analysis. MOSAIC is shown to successfully simulate 1) The Modal Reference Effect: the finding that Dutch and German children tend to use Root Infinitives in modal contexts, 2) The Eventivity constraint: the finding that Dutch and German Root Infinitives refer predominantly to actions rather than static situations, and 3) The absence or reduced size of these effects in English. These results provide strong support for input-driven explanations of the Modal Reference Effect as well as MOSAIC’s mechanism for producing Root Infinitives, and the wider claim that it is possible to explain key aspects of children’s early multi-word speech in terms of the interaction between a resource-limited distributional learning mechanism and the surface properties of the language to which children are exposed.

Brunel University Research Archive

Modelling vocabulary acquisition: an explanation of the link between the phonological loop and long-term memory

Authors: Gobet F, Jones G, Pine JM
Publication venue: Society for the Study of Artificial Intelligence and Simulation of Behaviour
Publication date: 01/01/2005

Nottingham Trent Institutional Repository (IRep)

On the resolution of ambiguities in the extraction of syntactic categories through chunking

Authors: Freudenthal D, Gobet F, Pine JM
Publication venue: Elsevier BV
Publication date: 01/01/2005

In recent years, several authors have investigated how co-occurrence statistics in natural language can act as a cue that children may use to extract syntactic categories for the language they are learning. While some authors have reported encouraging results, it is difficult to evaluate the quality of the syntactic categories derived. It is argued in this paper that traditional measures of accuracy are inherently flawed. A valid evaluation metric needs to consider the well-formedness of utterances generated through a production end. This paper attempts to evaluate the quality of the categories derived from co-occurrence statistics through the use of MOSAIC, a computational model of syntax acquisition that has already been used to simulate several phenomena in child language. It is shown that derived syntactic categories that may appear to be of high quality quickly give rise to errors that are not typical of child speech. A solution to this problem is suggested in the form of a chunking mechanism that serves to differentiate between alternative grammatical functions of identical word forms. Results are evaluated in terms of the error rates in utterances produced by the system as well as the quantitative fit to the phenomenon of subject omission.

CiteSeerX

Brunel University Research Archive

Modeling the optional infinitive stage in MOSAIC: A generalization to Dutch

Authors: Freudenthal D, Gobet F, Pine J M
Publication venue: Informa UK Limited
Publication date: 01/01/2001

This paper presents a model of a stage in children’s language development known as the optional infinitive stage. The model was originally developed for English, where it was shown to provide a good account of several phenomena. The model, which uses a discrimination network, analyzes the distribution of words in the input, and derives word classes from them by linking words that are used in a similar context. While the earlier version of the model is sensitive only to characteristics of phrases that follow target words, the present version also takes preceding input into consideration. Also, the present version uses a probabilistic rather than a deterministic learning mechanism. Generalisation of the model to Dutch is considered a strong test of the model, since Dutch displays the optional infinitive phenomenon, while its syntax differs substantially from that of English. The model was presented with child-directed input from two Dutch mothers, and its output was compared to that of the respective children. Despite the fact that the model was developed for a different language, it captures the optional infinitive phenomenon in Dutch as it does in English, while showing sensitivity to Dutch syntax. These results suggest that a simple distributional analyzer can capture the regularities of different languages despite the apparent differences in their syntax.

Brunel University Research Archive

Meter based omission of function words in MOSAIC

Authors: Freudenthal D, Gobet F, Pine J M
Publication venue: Energy Psychology Press
Publication date: 01/01/2007

MOSAIC (Model of Syntax Acquisition in Children) is augmented with a new mechanism that allows for the omission of unstressed function words based on the prosodic structure of the utterance in which they occur. The mechanism allows MOSAIC to omit elements from multiple locations in a target utterance, which it was previously unable to do. It is shown that, although the new mechanism results in Optional Infinitive errors when run on children’s input, it is insufficient to simulate the high rate of OI errors in children’s speech unless combined with MOSAIC’s edge-first learning mechanism. It is also shown that the addition of the new mechanism does not adversely affect MOSAIC’s fit to the Optional Infinitive phenomenon. The mechanism does, however, make MOSAIC’s output more child-like, both in terms of the range of utterances it can simulate, and the level and type of determiner omission that the model displays.

Brunel University Research Archive

Simulating the temporal reference of Dutch and English Root Infinitives.

Authors: Freudenthal D, Gobet F, Pine J M
Publication venue: Cognitive Science Society
Publication date: 01/01/2004

Hoekstra & Hyams (1998) claim that the overwhelming majority of Dutch children’s Root Infinitives (RIs) are used to refer to modal (not realised) events, whereas in English-speaking children, the temporal reference of RIs is free. Hoekstra & Hyams attribute this difference to qualitative differences in how temporal reference is carried by the Dutch infinitive and the English bare form. Ingram & Thompson (1996) advocate an input-driven account of this difference and suggest that the modal reading of German (and Dutch) RIs is caused by the fact that infinitive forms are predominantly used in modal contexts. This paper investigates whether an input-driven account can explain the differential reading of RIs in Dutch and English. To this end, corpora of English and Dutch Child Directed Speech were fed through MOSAIC, a computational model that has already been used to simulate the basic Optional Infinitive phenomenon. Infinitive forms in the input were tagged for modal or non-modal reference based on the sentential context in which they appeared. The output of the model was compared to the results of corpus studies and recent experimental data which call into question the strict distinction between Dutch and English advocated by Hoekstra & Hyams.

CiteSeerX

eScholarship - University of California

Brunel University Research Archive

Modelling children's negation errors using probabilistic learning in MOSAIC.

Authors: Croker S, Gobet F, Pine J M
Publication venue: Proceedings of the Fifth International Conference on Cognitive Modeling
Publication date: 01/01/2003

Cognitive models of language development have often been used to simulate the pattern of errors in children’s speech. One relatively infrequent error in English involves placing inflection to the right of a negative, rather than to the left. The pattern of negation errors in English is explained by Harris & Wexler (1996) in terms of very early knowledge of inflection on the part of the child. We present data from three children which demonstrates that although negation errors are rare, error types predicted not to occur by Harris & Wexler do occur, as well as error types that are predicted to occur. Data from MOSAIC, a model of language acquisition, is also presented. MOSAIC is able to simulate the pattern of negation errors in children’s speech. The phenomenon is modelled more accurately when a probabilistic learning algorithm is used.

CiteSeerX

Brunel University Research Archive

Simulating the Noun-Verb Asymmetry in the Productivity of Children’s Speech

Authors: Freudenthal D, Gobet F, Pine J M
Publication venue: Energy Psychology Press
Publication date: 01/01/2007

Several authors propose that children may acquire syntactic categories on the basis of co-occurrence statistics of words in the input. This paper assesses the relative merits of two such accounts by assessing the type and amount of productive language that results from computing co-occurrence statistics over conjoint and independent preceding and following contexts. This is achieved through the implementation of these methods in MOSAIC, a computational model of syntax acquisition that produces utterances that can be directly compared to child speech, and has a developmental component (i.e. produces increasingly long utterances). It is shown that the computation of co-occurrence statistics over conjoint contexts or frames results in a pattern of productive speech that more closely resembles that displayed by language-learning children. The simulation of the developmental patterning of children’s productive speech furthermore suggests two refinements to this basic mechanism: inclusion of utterance boundaries, and the weighting of frames for their lexical content.

Brunel University Research Archive

Subject omission in children's language: The case for performance limitations in learning.

Authors: Freudenthal D, Gobet F, Pine J M
Publication venue: Cognitive Science Society
Publication date: 01/01/2002

Several theories have been put forward to explain the phenomenon that children who are learning to speak their native language tend to omit the subject of the sentence. According to the pro-drop hypothesis, children represent the wrong grammar. According to the performance limitations view, children represent the full grammar, but omit subjects due to performance limitations in production. This paper proposes a third explanation and presents a model which simulates the data relevant to subject omission. The model consists of a simple learning mechanism that carries out a distributional analysis of naturalistic input. It does not have any overt representation of grammatical categories, and its performance limitations reside mainly in its learning mechanism. The model clearly simulates the data at hand, without the need to assume large amounts of innate knowledge in the child, and can be considered more parsimonious on these grounds alone. Importantly, it employs a unified and objective measure of processing load, namely the length of the utterance, which interacts with frequency in the input. The standard performance limitations view assumes that processing load is dependent on a phrase’s syntactic role, but does not specify a unifying underlying principle.

eScholarship - University of California

Brunel University Research Archive