A Quantitative Framework for Specifying Underlying Representations in Child Language Acquisition
My research broadly demonstrates how quantitative approaches can be effectively leveraged for developmental research. In this dissertation, I show one quantitatively precise way to identify the nature of developing mental representations in a variety of domains; my approach utilizes the connection between a learner's input, the creation of a potential mental representation from that input, and evaluation with respect to the learner's output. More specifically, the quantitative approach I use leverages both realistic input data and realistic output data as part of the model design and evaluation. Using modeling, we have the opportunity to concretely evaluate representational options that we would not otherwise be able to disambiguate. I demonstrate this quantitative approach with three case studies in language development: (I) the development of adjective ordering preferences, where I find that the representations that adults use to talk to children are different from the ones used to talk to other adults, (II) immature individual syntactic category representations, where I identify precisely which immature category representation young children are likely to be using, and (III) the development of adult productive syntactic category representations, where I identify when adult category knowledge emerges in typically and atypically developing populations.
Learning Functional Prepositions
In first language acquisition, what does it mean for a grammatical category to have been acquired, and what are the mechanisms by which children learn functional categories in general? In the context of prepositions (Ps), if the lexical/functional divide cuts through the P category, as has been suggested in the theoretical literature, then constructivist accounts of language acquisition would predict that children develop adult-like competence with the more abstract units, functional Ps, at a slower rate compared to their acquisition of lexical Ps. Nativists instead assume that the features of functional P are made available by Universal Grammar (UG), and are mapped as quickly as, if not faster than, the semantic features of their lexical counterparts. Conversely, if Ps are either all lexical or all functional, on both accounts of acquisition we should observe few differences in learning.
Three empirical studies of the development of P were conducted via computer analysis of the English and Spanish sub-corpora of the CHILDES database. Study 1 analyzed errors in child usage of Ps, finding almost no errors of commission in either language, but that the English learners lag in their production of functional Ps relative to lexical Ps. That no such delay was found in the Spanish data suggests that the English pattern is not universal. Studies 2 and 3 applied novel measures of phrasal (P head + nominal complement) productivity to the data. Study 2 examined prepositional phrases (PPs) whose head-complement pairs appeared in both child and adult speech, while Study 3 considered PPs produced by children that never occurred in adult speech. In both studies the productivity of functional Ps for English children developed faster than that of lexical Ps. In Spanish there were few differences, suggesting that children had already mastered both orders of Ps early in acquisition. These empirical results suggest that at least in English P is indeed a split category, and that children acquire the syntax of the functional subset very quickly, committing almost no errors. The UG position is thus supported.
Next, the dissertation investigates a 'soft nativist' acquisition strategy that composes the distributional analysis of input, minimal a priori knowledge of the possible co-occurrence of morphosyntactic features associated with functional elements, and linguistic knowledge that is presumably acquired via the experience of pragmatic, communicative situations. The output of the analysis consists in a mapping of morphemes to the feature bundles of nominative pronouns for English and Spanish, plus specific claims about the sort of knowledge required from experience.
The acquisition model is then extended to adpositions, to examine what, if anything, distributional analysis can tell us about the functional sequences of PPs. The results confirm the theoretical position according to which spatiotemporal Ps are lexical in character, rooting their own extended projections, and that functional Ps express an aspectual sequence in the functional superstructure of the PP.
Automatic grammar induction from free text using insights from cognitive grammar
Automatic identification of the grammatical structure of a sentence is useful in many Natural Language
Processing (NLP) applications such as Document Summarisation, Question Answering systems and
Machine Translation. With the availability of syntactic treebanks, supervised parsers have been
developed successfully for many major languages. However, for low-resourced minority languages with
fewer digital resources, this poses more of a challenge. Moreover, there are a number of syntactic
annotation schemes motivated by different linguistic theories and formalisms which are sometimes
language specific and they cannot always be adapted for developing syntactic parsers across different
language families.
This project aims to develop a linguistically motivated approach to the automatic induction of
grammatical structures from raw sentences. Such an approach can be readily adapted to different
languages including low-resourced minority languages. We draw the basic approach to linguistic analysis
from usage-based, functional theories of grammar such as Cognitive Grammar, Computational Paninian
Grammar and insights from psycholinguistic studies. Our approach identifies the grammatical structure of a
sentence by recognising domain-independent, general, cognitive patterns of conceptual organisation
that occur in natural language. It also reflects some of the general psycholinguistic properties of parsing
by humans - such as incrementality, connectedness and expectation.
Our implementation has three components: Schema Definition, Schema Assembly and Schema
Prediction. Schema Definition and Schema Assembly components were implemented algorithmically as
a dictionary and rules. An Artificial Neural Network was trained for Schema Prediction. By using
Part-of-Speech tags to bootstrap the simplest case of token-level schema definitions, a sentence is passed
through all three components incrementally until all the words are exhausted and the entire
sentence is analysed as an instance of one final construction schema. The order in which all intermediate
schemas are assembled to form the final schema can be viewed as the parse of the sentence. Parsers
for English and Welsh (a low-resource minority language) were developed using the same approach with
some changes to the Schema Definition component. We evaluated parser performance by (a)
quantitative evaluation, comparing the parsed chunks against the constituents in a phrase structure
tree; (b) manual evaluation, listing the range of linguistic constructions covered by the parser and
performing error analysis on the parser outputs; (c) counting the number of edits
required for a correct assembly; and (d) qualitative evaluation based on Likert scales in online surveys.
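The incremental loop described in this abstract (define a token-level schema from each Part-of-Speech tag, then assemble adjacent schemas until one final schema remains) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: the schema labels, the contents of `SCHEMA_DEFINITIONS` and `ASSEMBLY_RULES`, and the function name `parse` are all hypothetical, and the Schema Prediction component (a trained neural network according to the abstract) is omitted entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical token-level schema definitions, keyed by POS tag.
# The abstract only says this component is "a dictionary"; the entries are invented.
SCHEMA_DEFINITIONS: Dict[str, str] = {
    "DET": "GROUNDING",
    "ADJ": "PROPERTY",
    "NOUN": "THING",
    "VERB": "PROCESS",
}

# Hypothetical assembly rules: (left schema, right schema) -> assembled schema.
ASSEMBLY_RULES: Dict[Tuple[str, str], str] = {
    ("PROPERTY", "THING"): "THING",
    ("GROUNDING", "THING"): "NOMINAL",
    ("PROCESS", "NOMINAL"): "PREDICATE",
    ("NOMINAL", "PREDICATE"): "CLAUSE",
}

Step = Tuple[str, str, str]


def parse(pos_tags: List[str]) -> Tuple[List[str], List[Step]]:
    """Consume POS tags left to right, assembling schemas greedily.

    Returns the remaining schemas and the ordered list of assembly steps;
    the step order plays the role of the parse.
    """
    stack: List[str] = []
    steps: List[Step] = []
    for tag in pos_tags:
        # Schema Definition: look up the token-level schema for this tag.
        stack.append(SCHEMA_DEFINITIONS[tag])
        # Schema Assembly: combine the two most recent schemas while a rule applies.
        while len(stack) >= 2 and (stack[-2], stack[-1]) in ASSEMBLY_RULES:
            right = stack.pop()
            left = stack.pop()
            combined = ASSEMBLY_RULES[(left, right)]
            steps.append((left, right, combined))
            stack.append(combined)
    return stack, steps


if __name__ == "__main__":
    # Tags for a sentence such as "the old dog chased a cat".
    final, order = parse(["DET", "ADJ", "NOUN", "VERB", "DET", "NOUN"])
    print(final)
    for step in order:
        print(step)
```

A greedy stack is used here purely for brevity; it processes words incrementally, but it has no notion of expectation, which is the role the abstract assigns to the prediction network.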
Constructions emerging: a usage-based model of the acquisition of grammar
This dissertation is concerned with the development of grammar. Starting
from a usage-based perspective, which holds that children use domain-general
learning mechanisms to acquire the grammatical patterns of their mother
tongue, Beekhuizen shows how to operationalize various concepts from
this tradition in a computational model. In order to arrive at a sound set
of assumptions, Beekhuizen compares and criticizes various earlier
usage-based modeling approaches and scrutinizes the concepts of a
usage-based theory of language acquisition from the perspective of a
computational modeler. As the model should be able to produce utterances
on the basis of a meaning to be expressed, as well as to interpret
utterances, the availability of meaning from the situational context is
studied empirically. The resulting model, the Syntagmatic-Paradigmatic
Learner, simulates an increasing ability to understand utterances on the
basis of a grammar of constructions, as well as to produce utterances
on the basis of this grammar. Several developmental effects are
simulated and the internal states of the model are carefully examined.
NWO (grant 322.70.001)
Language Use in Past and Presen