112 research outputs found

    Learning Functional Prepositions

    Full text link
    In first language acquisition, what does it mean for a grammatical category to have been acquired, and what are the mechanisms by which children learn functional categories in general? In the context of prepositions (Ps), if the lexical/functional divide cuts through the P category, as has been suggested in the theoretical literature, then constructivist accounts of language acquisition would predict that children develop adult-like competence with the more abstract units, functional Ps, at a slower rate compared to their acquisition of lexical Ps. Nativists instead assume that the features of functional P are made available by Universal Grammar (UG), and are mapped as quickly, if not faster, than the semantic features of their lexical counterparts. Conversely, if Ps are either all lexical or all functional, on both accounts of acquisition we should observe few differences in learning. Three empirical studies of the development of P were conducted via computer analysis of the English and Spanish sub-corpora of the CHILDES database. Study 1 analyzed errors in child usage of Ps, finding almost no errors in commission in either language, but that the English learners lag in their production of functional Ps relative to lexical Ps. That no such delay was found in the Spanish data suggests that the English pattern is not universal. Studies 2 and 3 applied novel measures of phrasal (P head + nominal complement) productivity to the data. Study 2 examined prepositional phrases (PPs) whose head-complement pairs appeared in both child and adult speech, while Study 3 considered PPs produced by children that never occurred in adult speech. In both studies the productivity of Ps for English children developed faster than that of lexical Ps. In Spanish there were few differences, suggesting that children had already mastered both orders of Ps early in acquisition. These empirical results suggest that at least in English P is indeed a split category, and that children acquire the syntax of the functional subset very quickly, committing almost no errors. The UG position is thus supported. Next, the dissertation investigates a \u27soft nativist\u27 acquisition strategy that composes the distributional analysis of input, minimal a priori knowledge of the possible co-occurrence of morphosyntactic features associated with functional elements, and linguistic knowledge that is presumably acquired via the experience of pragmatic, communicative situations. The output of the analysis consists in a mapping of morphemes to the feature bundles of nominative pronouns for English and Spanish, plus specific claims about the sort of knowledge required from experience. The acquisition model is then extended to adpositions, to examine what, if anything, distributional analysis can tell us about the functional sequences of PPs. The results confirm the theoretical position according to which spatiotemporal Ps are lexical in character, rooting their own extended projections, and that functional Ps express an aspectual sequence in the functional superstructure of the PP

    Wide-coverage parsing for Turkish

    Get PDF
    Wide-coverage parsing is an area that attracts much attention in natural language processing research. This is due to the fact that it is the first step tomany other applications in natural language understanding, such as question answering. Supervised learning using human-labelled data is currently the best performing method. Therefore, there is great demand for annotated data. However, human annotation is very expensive and always, the amount of annotated data is much less than is needed to train well-performing parsers. This is the motivation behind making the best use of data available. Turkish presents a challenge both because syntactically annotated Turkish data is relatively small and Turkish is highly agglutinative, hence unusually sparse at the whole word level. METU-Sabancı Treebank is a dependency treebank of 5620 sentences with surface dependency relations and morphological analyses for words. We show that including even the crudest forms of morphological information extracted from the data boosts the performance of both generative and discriminative parsers, contrary to received opinion concerning English. We induce word-based and morpheme-based CCG grammars from Turkish dependency treebank. We use these grammars to train a state-of-the-art CCG parser that predicts long-distance dependencies in addition to the ones that other parsers are capable of predicting. We also use the correct CCG categories as simple features in a graph-based dependency parser and show that this improves the parsing results. We show that a morpheme-based CCG lexicon for Turkish is able to solve many problems such as conflicts of semantic scope, recovering long-range dependencies, and obtaining smoother statistics from the models. CCG handles linguistic phenomena i.e. local and long-range dependencies more naturally and effectively than other linguistic theories while potentially supporting semantic interpretation in parallel. Using morphological information and a morpheme-cluster based lexicon improve the performance both quantitatively and qualitatively for Turkish. We also provide an improved version of the treebank which will be released by kind permission of METU and Sabancı

    Statistical Knowledge and Learning in Phonology

    Get PDF
    This thesis deals with the theory of the phonetic component of grammar in a formal probabilistic inference framework: (1) it has been recognized since the beginning of generative phonology that some language-specific phonetic implementation is actually context-dependent, and thus it can be said that there are gradient "phonetic processes" in grammar in addition to categorical "phonological processes." However, no explicit theory has been developed to characterize these processes. Meanwhile, (2) it is understood that language acquisition and perception are both really informed guesswork: the result of both types of inference can be reasonably thought to be a less-than-perfect committment, with multiple candidate grammars or parses considered and each associated with some degree of credence. Previous research has used probability theory to formalize these inferences in implemented computational models, especially in phonetics and phonology. In this role, computational models serve to demonstrate the existence of working learning/per- ception/parsing systems assuming a faithful implementation of one particular theory of human language, and are not intended to adjudicate whether that theory is correct. The current thesis (1) develops a theory of the phonetic component of grammar and how it relates to the greater phonological system and (2) uses a formal Bayesian treatment of learning to evaluate this theory of the phonological architecture and for making predictions about how the resulting grammars will be organized. The coarse description of the consequence for linguistic theory is that the processes we think of as "allophonic" are actually language-specific, gradient phonetic processes, assigned to the phonetic component of grammar; strict allophones have no representation in the output of the categorical phonological grammar

    Formal Linguistic Models and Knowledge Processing. A Structuralist Approach to Rule-Based Ontology Learning and Population

    Get PDF
    2013 - 2014The main aim of this research is to propose a structuralist approach for knowledge processing by means of ontology learning and population, achieved starting from unstructured and structured texts. The method suggested includes distributional semantic approaches and NL formalization theories, in order to develop a framework, which relies upon deep linguistic analysis... [edited by author]XIII n.s

    Head-Driven Phrase Structure Grammar

    Get PDF
    Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism)

    Topological Foundations of Cognitive Science

    Get PDF
    A collection of papers presented at the First International Summer Institute in Cognitive Science, University at Buffalo, July 1994, including the following papers: ** Topological Foundations of Cognitive Science, Barry Smith ** The Bounds of Axiomatisation, Graham White ** Rethinking Boundaries, Wojciech Zelaniec ** Sheaf Mereology and Space Cognition, Jean Petitot ** A Mereotopological Definition of 'Point', Carola Eschenbach ** Discreteness, Finiteness, and the Structure of Topological Spaces, Christopher Habel ** Mass Reference and the Geometry of Solids, Almerindo E. Ojeda ** Defining a 'Doughnut' Made Difficult, N .M. Gotts ** A Theory of Spatial Regions with Indeterminate Boundaries, A.G. Cohn and N.M. Gotts ** Mereotopological Construction of Time from Events, Fabio Pianesi and Achille C. Varzi ** Computational Mereology: A Study of Part-of Relations for Multi-media Indexing, Wlodek Zadrozny and Michelle Ki
    corecore