Learning Functional Prepositions
In first language acquisition, what does it mean for a grammatical category to have been acquired, and what are the mechanisms by which children learn functional categories in general? In the context of prepositions (Ps), if the lexical/functional divide cuts through the P category, as has been suggested in the theoretical literature, then constructivist accounts of language acquisition would predict that children develop adult-like competence with the more abstract units, functional Ps, at a slower rate than with lexical Ps. Nativists instead assume that the features of functional P are made available by Universal Grammar (UG), and are mapped as quickly as, if not faster than, the semantic features of their lexical counterparts. If, by contrast, Ps are all lexical or all functional, then on both accounts of acquisition we should observe few differences in learning.
Three empirical studies of the development of P were conducted via computer analysis of the English and Spanish sub-corpora of the CHILDES database. Study 1 analyzed errors in child usage of Ps, finding almost no errors of commission in either language, but that the English learners lag in their production of functional Ps relative to lexical Ps. That no such delay was found in the Spanish data suggests that the English pattern is not universal. Studies 2 and 3 applied novel measures of phrasal (P head + nominal complement) productivity to the data. Study 2 examined prepositional phrases (PPs) whose head-complement pairs appeared in both child and adult speech, while Study 3 considered PPs produced by children that never occurred in adult speech. In both studies the productivity of functional Ps for English children developed faster than that of lexical Ps. In Spanish there were few differences, suggesting that children had already mastered both classes of Ps early in acquisition. These empirical results suggest that, at least in English, P is indeed a split category, and that children acquire the syntax of the functional subset very quickly, committing almost no errors. The UG position is thus supported.
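The phrasal productivity measures of Studies 2 and 3 can be pictured with a minimal sketch: extract (P head, nominal complement) pairs from child and adult utterances and ask what share of the child's PP types is attested in, or absent from, the adult input. The data format, function names, and next-word heuristic below are illustrative assumptions, not the dissertation's actual method.

```python
# Illustrative sketch (not the dissertation's code): phrasal productivity as
# overlap between child and adult PP head-complement pairs.
from collections import Counter

def pp_pairs(utterances, prepositions):
    """Extract (P head, nominal complement) bigrams from tokenized utterances."""
    pairs = Counter()
    for tokens in utterances:
        for i, word in enumerate(tokens[:-1]):
            if word in prepositions:
                pairs[(word, tokens[i + 1])] += 1  # crude: next word = complement
    return pairs

def productivity(child_utts, adult_utts, prepositions):
    """Share of the child's PP types attested in (Study 2) or absent from
    (Study 3) the adult input."""
    child = pp_pairs(child_utts, prepositions)
    adult = pp_pairs(adult_utts, prepositions)
    shared = [p for p in child if p in adult]      # Study 2: attested in input
    novel  = [p for p in child if p not in adult]  # Study 3: child-only PPs
    return len(shared) / len(child), len(novel) / len(child)

# Toy usage with a hypothetical P inventory:
child = [["want", "to", "park"], ["go", "of", "mommy"], ["sit", "on", "chair"]]
adult = [["go", "to", "park"], ["sit", "on", "chair"]]
print(productivity(child, adult, {"to", "of", "on"}))  # (0.67, 0.33)
```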
Next, the dissertation investigates a 'soft nativist' acquisition strategy that combines distributional analysis of the input, minimal a priori knowledge of the possible co-occurrences of morphosyntactic features associated with functional elements, and linguistic knowledge that is presumably acquired via the experience of pragmatic, communicative situations. The output of the analysis consists of a mapping of morphemes to the feature bundles of nominative pronouns for English and Spanish, plus specific claims about the sort of knowledge required from experience.
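The shape of that output can be illustrated with a small invented representation: morphemes keyed to bundles of morphosyntactic features. The feature names and values below are our own illustration, not the dissertation's notation.

```python
# Illustrative only (invented representation): the kind of output the analysis
# yields, pairing nominative pronoun morphemes with feature bundles.
english = {
    "I":   {"person": 1, "number": "sg", "case": "nom"},
    "we":  {"person": 1, "number": "pl", "case": "nom"},
    "she": {"person": 3, "number": "sg", "case": "nom", "gender": "fem"},
}
spanish = {
    "yo":       {"person": 1, "number": "sg", "case": "nom"},
    "nosotros": {"person": 1, "number": "pl", "case": "nom", "gender": "masc"},
}
# Features like person plausibly come from pragmatic/communicative experience;
# their possible co-occurrence is the minimal a priori knowledge assumed.
print(english["I"], spanish["yo"])
```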
The acquisition model is then extended to adpositions, to examine what, if anything, distributional analysis can tell us about the functional sequences of PPs. The results confirm the theoretical position that spatiotemporal Ps are lexical in character, rooting their own extended projections, and that functional Ps express an aspectual sequence in the functional superstructure of the PP.
Wide-coverage parsing for Turkish
Wide-coverage parsing is an area that attracts much attention in natural language processing research, since it is the first step to many other applications in natural language understanding, such as question answering. Supervised learning using human-labelled data is currently the best-performing method, so there is great demand for annotated data. However, human annotation is very expensive, and the amount of annotated data is almost always far less than is needed to train well-performing parsers; this motivates making the best use of the data available. Turkish presents a challenge both because the syntactically annotated Turkish data is relatively small and because Turkish is highly agglutinative, hence unusually sparse at the whole-word level.
The METU-Sabancı Treebank is a dependency treebank of 5620 sentences with surface dependency relations and morphological analyses for words. We show that including even the crudest forms of morphological information extracted from the data boosts the performance of both generative and discriminative parsers, contrary to received opinion concerning English.
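Why morphology matters so much here can be seen in a toy sketch of agglutinative sparsity: distinct Turkish word forms multiply quickly, while the morphemes they are built from recur constantly. The segmentations below are simplified for illustration and are not drawn from the treebank.

```python
# Hypothetical illustration of lexical sparsity in agglutinative Turkish:
# whole-word forms are mostly unique, but their morphemes are heavily reused,
# so morpheme-level statistics are far less sparse.
corpus = {
    "evlerimizden": ["ev", "ler", "imiz", "den"],   # "from our houses"
    "evlerimizde":  ["ev", "ler", "imiz", "de"],    # "in our houses"
    "evde":         ["ev", "de"],                   # "in the house"
}
word_tokens = len(corpus)                                   # 3 tokens
morph_tokens = sum(len(seg) for seg in corpus.values())     # 10 tokens
morph_types = {m for seg in corpus.values() for m in seg}   # 5 types
print(f"word  types/tokens: {len(set(corpus))}/{word_tokens}")   # 3/3: no reuse
print(f"morph types/tokens: {len(morph_types)}/{morph_tokens}")  # 5/10: reuse
```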
We induce word-based and morpheme-based CCG grammars from the Turkish dependency treebank. We use these grammars to train a state-of-the-art CCG parser that predicts long-distance dependencies in addition to those that other parsers are capable of predicting. We also use the correct CCG categories as simple features in a graph-based dependency parser and show that this improves the parsing results.
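One way to picture CCG categories as "simple features" is as extra templates scored on candidate head-dependent arcs, in the spirit of McDonald-style graph-based parsers. The feature templates, tuple format, and toy sentence below are our own assumptions for illustration.

```python
# Illustrative sketch: CCG categories ("supertags") as additional features on
# candidate arcs in a graph-based dependency parser. Templates are hypothetical.
def arc_features(sent, head, dep):
    """sent: list of (word, pos, ccg_category) triples."""
    hw, hp, hc = sent[head]
    dw, dp, dc = sent[dep]
    return [
        f"hw={hw}|dw={dw}",   # lexical pair
        f"hp={hp}|dp={dp}",   # POS pair
        f"hc={hc}|dc={dc}",   # CCG category pair (the added information)
        f"hc={hc}|dp={dp}",   # mixed category/POS template
    ]

# Toy SOV clause "kitabı okudu" ("(s/he) read the book"):
sent = [("kitabı", "Noun", "NP"), ("okudu", "Verb", "(S\\NP)\\NP")]
print(arc_features(sent, 1, 0))
```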
We show that a morpheme-based CCG lexicon for Turkish is able to solve many problems, such as conflicts of semantic scope, recovering long-range dependencies, and obtaining smoother statistics from the models. CCG handles linguistic phenomena such as local and long-range dependencies more naturally and effectively than other linguistic theories, while potentially supporting semantic interpretation in parallel. Using morphological information and a morpheme-cluster-based lexicon improves performance both quantitatively and qualitatively for Turkish.
We also provide an improved version of the treebank, which will be released by kind permission of METU and Sabancı.
Statistical Knowledge and Learning in Phonology
This thesis deals with the theory of the phonetic component of grammar in a formal probabilistic inference framework: (1) it has been recognized since the beginning of generative phonology that some language-specific phonetic implementation is actually context-dependent, and thus it can be said that there are gradient "phonetic processes" in grammar in addition to categorical "phonological processes"; however, no explicit theory has been developed to characterize these processes. Meanwhile, (2) it is understood that language acquisition and perception are both really informed guesswork: the result of both types of inference can reasonably be thought of as a less-than-perfect commitment, with multiple candidate grammars or parses considered, each associated with some degree of credence. Previous research has used probability theory to formalize these inferences in implemented computational models, especially in phonetics and phonology. In this role, computational models serve to demonstrate the existence of working learning/perception/parsing systems assuming a faithful implementation of one particular theory of human language, and are not intended to adjudicate whether that theory is correct. The current thesis (1) develops a theory of the phonetic component of grammar and how it relates to the greater phonological system, and (2) uses a formal Bayesian treatment of learning to evaluate this theory of the phonological architecture and to make predictions about how the resulting grammars will be organized. The coarse description of the consequence for linguistic theory is that the processes we think of as "allophonic" are actually language-specific, gradient phonetic processes, assigned to the phonetic component of grammar; strict allophones have no representation in the output of the categorical phonological grammar.
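The "informed guesswork" framing has a compact worked form: a learner holds a posterior over candidate grammars rather than a hard choice. The numbers and grammar labels below are invented purely to show the Bayesian bookkeeping, not results from the thesis.

```python
# Minimal worked example (invented numbers): Bayesian learning as a
# less-than-perfect commitment, with credence spread over candidate grammars.
# G1: a categorical phonological rule; G2: a gradient phonetic process.
prior = {"G1": 0.5, "G2": 0.5}
# Likelihood of the observed (partly gradient) data under each grammar:
likelihood = {"G1": 0.02, "G2": 0.08}

evidence = sum(prior[g] * likelihood[g] for g in prior)
posterior = {g: prior[g] * likelihood[g] / evidence for g in prior}
print(posterior)  # {'G1': 0.2, 'G2': 0.8}: neither grammar is ruled out
```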
Aspects of emergent cyclicity in language and computation
This thesis has four parts, which correspond to the presentation and development of a theoretical framework for the study of cognitive capacities qua physical phenomena, and to a case study of locality conditions in natural languages.
Part I deals with computational considerations, setting the tone of the rest of the thesis, and introducing and defining critical concepts like 'grammar', 'automaton', and the relations between them. Fundamental questions concerning the place of formal language theory in linguistic inquiry, as well as the expressibility of linguistic and computational concepts in common terms, are raised in this part.
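The grammar/automaton relation that Part I defines has a standard textbook face, which can be made concrete in a few lines; the sketch below is that generic correspondence, not the thesis's own formalism: a right-linear grammar and a finite automaton defining the same language a*b.

```python
# Textbook illustration of the grammar/automaton relation: the right-linear
# grammar  S -> aS | b  and an equivalent two-state DFA both define a*b.
import re

def grammar_generates(s):
    """Does S -> aS | b derive s? (Direct recursive reading of the rules.)"""
    if s == "b":
        return True                                          # S -> b
    return s.startswith("a") and grammar_generates(s[1:])    # S -> aS

def automaton_accepts(s):
    """DFA for a*b: loop on 'a' in q0, accept after exactly one final 'b'."""
    state = "q0"
    for ch in s:
        if state == "q0" and ch == "a":
            state = "q0"
        elif state == "q0" and ch == "b":
            state = "qf"
        else:
            return False      # dead state
    return state == "qf"

for w in ["b", "aab", "aba", ""]:
    assert grammar_generates(w) == automaton_accepts(w) == bool(re.fullmatch(r"a*b", w))
```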
Part II further explores the issues addressed in Part I, with particular emphasis on how grammars are implemented by means of automata, and on the properties of the formal languages that these automata generate. We will argue against the equation of effective computation with function-based computation, and introduce examples of computable procedures which are nevertheless impossible to capture using traditional function-based theories. The connection with cognition will be made in the light of dynamical frustrations: the irreconcilable tension between mutually incompatible tendencies that hold for a given dynamical system. We will provide arguments in favour of analyzing natural language as emerging from a tension between different systems (essentially, semantics and morpho-phonology) which impose orthogonal requirements over admissible outputs. The concept of level of organization, or scale, comes to the foreground here, and apparent contradictions and incommensurabilities between concepts and theories are revisited in a new light: that of dynamical nonlinear systems which are fundamentally frustrated. We will also characterize the computational system that emerges from such an architecture: the goal is to get a syntactic component which assigns the simplest possible structural description to sub-strings, in terms of computational complexity. A system which can oscillate back and forth in the hierarchy of formal languages in assigning structural representations to local domains will be referred to as a computationally mixed system.
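One drastically simplified reading of "simplest possible structural description" uses the standard correspondence between dependency patterns and the formal-language hierarchy: flat dependencies need only finite-state machinery, nested ones need a stack, crossing ones go beyond context-free. The toy classifier below is our illustration of that reading, not the thesis's formalism.

```python
# Speculative toy (not the thesis's formalism): label a local domain by the
# weakest machinery its dependency pattern requires.
def domain_complexity(arcs):
    """arcs: set of (i, j) position pairs with i < j."""
    arcs = sorted(arcs)
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:
                return "crossing: beyond context-free (mildly context-sensitive)"
    for a, b in arcs:
        for c, d in arcs:
            if a < c and d < b:
                return "nested: context-free (stack) machinery needed"
    return "flat: finite-state machinery suffices"

print(domain_complexity({(0, 1), (2, 3)}))  # flat
print(domain_complexity({(0, 3), (1, 2)}))  # nested
print(domain_complexity({(0, 2), (1, 3)}))  # crossing
```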
Part III is where the really fun stuff starts. Field theory is introduced, and its applicability to neurocognitive phenomena is made explicit, with all due scale considerations. Physical and mathematical concepts are permanently interacting as we analyze phrase structure in terms of pseudo-fractals (in Mandelbrot's sense) and define syntax as a (possibly unary) set of topological operations over completely Hausdorff (CH) ultrametric spaces. These operations, which make field perturbations interfere, transform that initial completely Hausdorff ultrametric space into a metric, Hausdorff space with a weaker separation axiom. Syntax, in this proposal, is not 'generative' in any traditional sense (except the 'fully explicit theory' one): rather, it partitions (technically, 'parametrizes') a topological space. Syntactic dependencies are defined as interferences between perturbations over a field, which reduce the total entropy of the system per cycle, at the cost of introducing further dimensions where attractors corresponding to interpretations of a phrase marker can be found.
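The ultrametric claim has a familiar concrete face: on the equal-depth leaves of a rooted tree, distances defined through the lowest common ancestor satisfy the strong triangle inequality d(x, z) <= max(d(x, y), d(y, z)). The check below uses that generic construction on a toy phrase marker; it is not the thesis's field-theoretic construction.

```python
# Generic illustration: LCA-based leaf distances on a rooted tree form an
# ultrametric, satisfying d(x, z) <= max(d(x, y), d(y, z)).
from itertools import permutations

parent = {"the": "DP", "dog": "DP", "barked": "VP", "loudly": "VP",
          "DP": "S", "VP": "S", "S": None}

def ancestors(node):
    chain = []
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain

def d(x, y):
    """Height of the lowest common ancestor above the (equal-depth) leaves."""
    if x == y:
        return 0
    ax, ay = ancestors(x), set(ancestors(y))
    return next(i for i, n in enumerate(ax) if n in ay)

leaves = ["the", "dog", "barked", "loudly"]
for x, y, z in permutations(leaves, 3):
    assert d(x, z) <= max(d(x, y), d(y, z))  # strong triangle inequality
```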
Part IV is a sample of what we can gain by further pursuing the physics of language approach, both in terms of empirical adequacy and theoretical elegance, not to mention the unlimited possibilities for interdisciplinary collaboration. In this section we set our focus on island phenomena as defined by Ross (1967), critically revisiting the most relevant literature on this topic, and establishing a typology of constructions that are strong islands, which cannot be violated. These constructions are particularly interesting because they limit the phase space of what is expressible via natural language, and thus reveal crucial aspects of its underlying dynamics. We will argue that a dynamically frustrated system which is characterized by displaying mixed computational dependencies can provide straightforward characterizations of cyclicity in terms of changes in dependencies in local domains.
Formal Linguistic Models and Knowledge Processing. A Structuralist Approach to Rule-Based Ontology Learning and Population
2013-2014. The main aim of this research is to propose a structuralist approach to knowledge processing by means of ontology learning and population, achieved starting from unstructured and structured texts. The method suggested combines distributional semantic approaches with NL formalization theories, in order to develop a framework which relies upon deep linguistic analysis...
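Rule-based ontology learning from unstructured text is commonly illustrated with lexico-syntactic (Hearst-style) patterns that harvest is-a pairs from strings like "X such as Y, Z". The sketch below is that generic technique, offered as context for the abstract above; it is not the author's framework, and the pattern is deliberately crude.

```python
# Generic sketch of rule-based ontology population via a Hearst-style pattern;
# an illustration of the technique, not the author's system.
import re

HYPONYM_PATTERN = re.compile(
    r"(\w+?)s?\s+such\s+as\s+(\w+)(?:,\s*(\w+))*", re.IGNORECASE)

def extract_isa(text):
    """Harvest (hyponym, hypernym) pairs from 'X such as Y, Z' contexts.
    (A repeated regex group keeps only its last match; real systems chunk.)"""
    pairs = set()
    for m in HYPONYM_PATTERN.finditer(text):
        hypernym = m.group(1).lower()
        for hyponym in m.groups()[1:]:
            if hyponym:
                pairs.add((hyponym.lower(), hypernym))
    return pairs

text = "The corpus mentions vehicles such as cars, trucks and instruments such as violins."
print(extract_isa(text))
# {('cars', 'vehicle'), ('trucks', 'vehicle'), ('violins', 'instrument')}
```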
Head-Driven Phrase Structure Grammar
Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature-value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single, relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism).
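The feature-value machinery mentioned above can be pictured with a minimal sketch: feature structures as nested attribute-value pairs and a recursive unification routine. This is a deliberate simplification for illustration; full HPSG additionally requires typed feature structures and structure sharing (reentrancy), which are omitted here.

```python
# Minimal sketch of HPSG-style feature structures as nested attribute-value
# pairs, with simplified unification (no types or reentrancy).
def unify(fs1, fs2):
    """Return the most specific structure subsuming both inputs, or None."""
    if fs1 == fs2:
        return fs1
    if isinstance(fs1, dict) and isinstance(fs2, dict):
        result = dict(fs1)
        for attr, val in fs2.items():
            if attr in result:
                merged = unify(result[attr], val)
                if merged is None:
                    return None        # feature clash: unification fails
                result[attr] = merged
            else:
                result[attr] = val
        return result
    return None                        # atomic values disagree

verb = {"HEAD": {"CAT": "verb"}, "SUBJ": {"CAT": "noun", "NUM": "sg"}}
subj = {"SUBJ": {"NUM": "sg", "PER": 3}}
print(unify(verb, subj))
# {'HEAD': {'CAT': 'verb'}, 'SUBJ': {'CAT': 'noun', 'NUM': 'sg', 'PER': 3}}
print(unify({"NUM": "sg"}, {"NUM": "pl"}))  # None: agreement clash
```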
Topological Foundations of Cognitive Science
A collection of papers presented at the First International Summer Institute in Cognitive Science, University at Buffalo, July 1994, including the following papers:
** Topological Foundations of Cognitive Science, Barry Smith
** The Bounds of Axiomatisation, Graham White
** Rethinking Boundaries, Wojciech Zelaniec
** Sheaf Mereology and Space Cognition, Jean Petitot
** A Mereotopological Definition of 'Point', Carola Eschenbach
** Discreteness, Finiteness, and the Structure of Topological Spaces, Christopher Habel
** Mass Reference and the Geometry of Solids, Almerindo E. Ojeda
** Defining a 'Doughnut' Made Difficult, N.M. Gotts
** A Theory of Spatial Regions with Indeterminate Boundaries, A.G. Cohn and N.M. Gotts
** Mereotopological Construction of Time from Events, Fabio Pianesi and Achille C. Varzi
** Computational Mereology: A Study of Part-of Relations for Multi-media Indexing, Wlodek Zadrozny and Michelle Ki