438 research outputs found
On the resolution of ambiguities in the extraction of syntactic categories through chunking
In recent years, several authors have investigated how co-occurrence statistics in natural language can act as a cue that children may use to extract syntactic categories for the language they are learning. While some authors have reported encouraging results, it is difficult to evaluate the quality of the syntactic categories derived. It is argued in this paper that traditional measures of accuracy are inherently flawed. A valid evaluation metric needs to consider the wellformedness of utterances generated through a production end. This paper attempts to evaluate the quality of the categories derived from co-occurrence statistics through the use of MOSAIC, a computational model of syntax acquisition that has already been used to simulate several phenomena in child language. It is shown that derived syntactic categories that may appear to be of high quality quickly give rise to errors that are not typical of child speech. A solution to this problem is suggested in the form of a chunking mechanism that serves to differentiate between alternative grammatical functions of identical word forms. Results are evaluated in terms of the error rates in utterances produced by the system as well as the quantitative fit to the phenomenon of subject omission.
A pattern-recognition theory of search in expert problem solving
Understanding how look-ahead search and pattern recognition interact is one of the important research questions in the study of expert problem-solving. This paper examines the implications of the template theory (Gobet & Simon, 1996a), a recent theory of expert memory, on the theory of problem solving in chess. Templates are "chunks" (Chase & Simon, 1973) that have evolved into more complex data structures and that possess slots allowing values to be encoded rapidly. Templates may facilitate search in three ways: (a) by allowing information to be stored into LTM rapidly; (b) by allowing a search in the template space in addition to a search in the move space; and (c) by compensating for loss in the "mind's eye" due to interference and decay. A computer model implementing the main ideas of the theory is presented, and simulations of its search behaviour are discussed. The template theory accounts for the slight skill difference in average depth of search found in chess players, as well as for other empirical data.
Simulating the referential properties of Dutch, German and English Root Infinitives in MOSAIC
Children learning many languages go through an Optional Infinitive stage in which they produce non-finite verb forms in contexts in which a finite verb form is required (e.g. "That go there" instead of "That goes there"). MOSAIC (Model of Syntax Acquisition in Children) is a computational model of language learning that successfully simulates the developmental patterning of the Optional Infinitive (OI) phenomenon in English, Dutch, German and Spanish (Freudenthal, Pine, Aguado-Orea & Gobet, 2007). In the present study, MOSAIC is applied to the simulation of certain subtle but theoretically important phenomena in the cross-linguistic patterning of the OI phenomenon that are typically assumed to require a more complex formal analysis. MOSAIC is shown to successfully simulate 1) The Modal Reference Effect: the finding that Dutch and German children tend to use Root Infinitives in modal contexts, 2) The Eventivity constraint: the finding that Dutch and German Root Infinitives refer predominantly to actions rather than static situations, and 3) The absence or reduced size of these effects in English. These results provide strong support for input-driven explanations of the Modal Reference Effect as well as MOSAIC's mechanism for producing Root Infinitives, and the wider claim that it is possible to explain key aspects of children's early multi-word speech in terms of the interaction between a resource-limited distributional learning mechanism and the surface properties of the language to which children are exposed.
Attention mechanisms in the CHREST cognitive architecture
In this paper, we describe the attention mechanisms in CHREST, a computational architecture of human visual expertise. CHREST organises information acquired by direct experience from the world in the form of chunks. These chunks are searched for, and verified, by a unique set of heuristics, comprising the attention mechanism. We explain how the attention mechanism combines bottom-up and top-down heuristics from internal and external sources of information. We describe some experimental evidence demonstrating the correspondence of CHREST's perceptual mechanisms with those of human subjects. Finally, we discuss how visual attention can play an important role in actions carried out by human experts in domains such as chess.
Meter based omission of function words in MOSAIC
MOSAIC (Model of Syntax Acquisition in Children) is augmented with a new mechanism that allows for the omission of unstressed function words based on the prosodic structure of the utterance in which they occur. The mechanism allows MOSAIC to omit elements from multiple locations in a target utterance, which it was previously unable to do. It is shown that, although the new mechanism results in Optional Infinitive errors when run on children's input, it is insufficient to simulate the high rate of OI errors in children's speech unless combined with MOSAIC's edge-first learning mechanism. It is also shown that the addition of the new mechanism does not adversely affect MOSAIC's fit to the Optional Infinitive phenomenon. The mechanism does, however, make MOSAIC's output more child-like, both in terms of the range of utterances it can simulate, and the level and type of determiner omission that the model displays.
Simulating the temporal reference of Dutch and English Root Infinitives.
Hoekstra & Hyams (1998) claim that the overwhelming majority of Dutch children's Root Infinitives (RIs) are used to refer to modal (not realised) events, whereas in English speaking children, the temporal reference of RIs is free. Hoekstra & Hyams attribute this difference to qualitative differences in how temporal reference is carried by the Dutch infinitive and the English bare form. Ingram & Thompson (1996) advocate an input-driven account of this difference and suggest that the modal reading of German (and Dutch) RIs is caused by the fact that infinitive forms are predominantly used in modal contexts. This paper investigates whether an input-driven account can explain the differential reading of RIs in Dutch and English. To this end, corpora of English and Dutch Child Directed Speech were fed through MOSAIC, a computational model that has already been used to simulate the basic Optional Infinitive phenomenon. Infinitive forms in the input were tagged for modal or non-modal reference based on the sentential context in which they appeared. The output of the model was compared to the results of corpus studies and recent experimental data which call into question the strict distinction between Dutch and English advocated by Hoekstra & Hyams.
Simulating the Noun-Verb Asymmetry in the Productivity of Children's Speech
Several authors propose that children may acquire syntactic categories on the basis of co-occurrence statistics of words in the input. This paper assesses the relative merits of two such accounts by assessing the type and amount of productive language that results from computing co-occurrence statistics over conjoint and independent preceding and following contexts. This is achieved through the implementation of these methods in MOSAIC, a computational model of syntax acquisition that produces utterances that can be directly compared to child speech, and has a developmental component (i.e. produces increasingly long utterances). It is shown that the computation of co-occurrence statistics over conjoint contexts or frames results in a pattern of productive speech that more closely resembles that displayed by language learning children. The simulation of the developmental patterning of children's productive speech furthermore suggests two refinements to this basic mechanism: inclusion of utterance boundaries, and the weighting of frames for their lexical content.
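The contrast between conjoint and independent contexts can be illustrated with a minimal sketch. This is not MOSAIC's implementation: the function names `context_profiles` and `overlap`, the boundary markers `<s>` and `</s>`, the toy corpus, and the use of Jaccard overlap as a similarity measure are all assumptions made for illustration only.

```python
from collections import defaultdict


def context_profiles(utterances, conjoint=True, boundaries=True):
    """Map each word to the set of contexts it occurs in (hypothetical helper).

    conjoint=True  : a context is the frame (preceding word, following word).
    conjoint=False : preceding and following words are recorded independently.
    boundaries=True: utterance boundaries count as context items.
    """
    profiles = defaultdict(set)
    for utterance in utterances:
        tokens = utterance.split()
        if boundaries:
            tokens = ["<s>"] + tokens + ["</s>"]
        # Only positions with both a left and a right neighbour are targets.
        for i in range(1, len(tokens) - 1):
            prev, word, nxt = tokens[i - 1], tokens[i], tokens[i + 1]
            if conjoint:
                profiles[word].add((prev, nxt))
            else:
                profiles[word].add(("prev", prev))
                profiles[word].add(("next", nxt))
    return profiles


def overlap(profiles, a, b):
    """Jaccard overlap of two words' context sets (an assumed measure)."""
    ctx_a = profiles.get(a, set())
    ctx_b = profiles.get(b, set())
    union = ctx_a | ctx_b
    return len(ctx_a & ctx_b) / len(union) if union else 0.0


# Toy corpus, invented for this example.
corpus = ["the dog runs", "the cat runs", "the dog sleeps", "a cat sleeps"]

frames = context_profiles(corpus, conjoint=True)
independent = context_profiles(corpus, conjoint=False)

print(overlap(frames, "dog", "cat"))
print(overlap(independent, "dog", "cat"))
```

On this toy corpus the conjoint measure is the stricter one: "dog" and "cat" share only one of three frames, but three of four independent contexts. That is one way to see why the two methods could yield different amounts of productive substitution.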
Subject omission in children's language; The case for performance limitations in learning.
Several theories have been put forward to explain the phenomenon that children who are learning to speak their native language tend to omit the subject of the sentence. According to the pro-drop hypothesis, children represent the wrong grammar. According to the performance limitations view, children represent the full grammar, but omit subjects due to performance limitations in production. This paper proposes a third explanation and presents a model which simulates the data relevant to subject omission. The model consists of a simple learning mechanism that carries out a distributional analysis of naturalistic input. It does not have any overt representation of grammatical categories, and its performance limitations reside mainly in its learning mechanism. The model clearly simulates the data at hand, without the need to assume large amounts of innate knowledge in the child, and can be considered more parsimonious on these grounds alone. Importantly, it employs a unified and objective measure of processing load, namely the length of the utterance, which interacts with frequency in the input. The standard performance limitations view assumes that processing load is dependent on a phrase's syntactic role, but does not specify a unifying underlying principle.
Unifying cross-linguistic and within-language patterns of finiteness marking in MOSAIC
MOSAIC, a model that has already simulated cross-linguistic differences in the occurrence of the Optional Infinitive phenomenon, is applied to the simulation of the pattern of finiteness marking within Dutch. This within-language pattern, which includes verb placement, low rates of Optional Infinitives in Wh-questions and the correlation between finiteness marking and subject provision, has been taken as evidence for the view that children have correctly set the clause structure and inflectional parameters for their language. MOSAIC, which employs no built-in linguistic knowledge, clearly simulates the pattern of results as a function of its utterance-final bias, the same mechanism that is responsible for its successful simulation of the crosslinguistic data. These results suggest that both the crosslinguistic and within-language pattern of finiteness marking can be understood in terms of the interaction between a simple resource-limited learning mechanism and the distributional statistics of the input to which it is exposed. Thus, these phenomena do not provide any evidence for abstract or innate knowledge on the part of the child.
The role of input size and generativity in simulating language acquisition.
This paper presents an analysis of the role of input size and generativity (ability to produce novel utterances) in simulating developmental data on a phenomenon in first language acquisition. An existing model that has already simulated the basic phenomenon is trained on input sets of varying sizes (13,000 to 40,000 utterances). The ability of the model to produce novel utterances is also manipulated. Both input size and generativity affect the fits for later stages of development. Higher generativity improves fits for later stages, but worsens them for early stages, suggesting generativity is best increased as a function of mean length of utterance (MLU). The effect of training set is variable. Results are discussed in terms of optimal training sets for simulations, and children's developing ability to produce utterances beyond the input they have heard.