531 research outputs found

    A Sound and Complete Left-Corner Parsing for Minimalist Grammars

    Wide-coverage statistical parsing with minimalist grammars

    Syntactic parsing is the process of automatically assigning a structure to a string of words, and is arguably a necessary prerequisite for obtaining a detailed and precise representation of sentence meaning. For many NLP tasks, it is sufficient to use parsers based on simple context free grammars. However, for tasks in which precision on certain relatively rare but semantically crucial constructions (such as unbounded wh-movements for open domain question answering) is important, more expressive grammatical frameworks still have an important role to play. One grammatical framework which has been conspicuously absent from journals and conferences on Natural Language Processing (NLP), despite continuing to dominate much of theoretical syntax, is Minimalism, the latest incarnation of the Transformational Grammar (TG) approach to linguistic theory developed very extensively by Noam Chomsky and many others since the early 1950s. Until now, all parsers using genuine transformational movement operations have had only narrow coverage by modern standards, owing to the lack of any wide-coverage TG grammars or treebanks on which to train statistical models. The received wisdom within NLP is that TG is too complex and insufficiently formalised to be applied to realistic parsing tasks. This situation is unfortunate, as it is arguably the most extensively developed syntactic theory across the greatest number of languages, many of which are otherwise under-resourced, and yet the vast majority of its insights never find their way into NLP systems. Conversely, the process of constructing large grammar fragments can have a salutary impact on the theory itself, forcing choices between competing analyses of the same construction, and exposing incompatibilities between analyses of different constructions, along with areas of over- and undergeneration which may otherwise go unnoticed. 
This dissertation builds on research into computational Minimalism pioneered by Ed Stabler and others since the late 1990s to present the first ever wide-coverage Minimalist Grammar (MG) parser, along with some promising initial experimental results. A wide-coverage parser must of course be equipped with a wide-coverage grammar, and this dissertation will therefore also present the first ever wide-coverage MG, which has analyses with a high level of cross-linguistic descriptive adequacy for a great many English constructions, many of which are taken or adapted from proposals in the mainstream Minimalist literature. The grammar is very deep, in the sense that it describes many long-range dependencies which even most other expressive wide-coverage grammars ignore. At the same time, it has also been engineered to be highly constrained, with continuous computational testing being applied to minimize both under- and over-generation. Natural language is highly ambiguous, both locally and globally, and even with a very strong formal grammar, there may still be a great many possible structures for a given sentence and its substrings. The standard approach to resolving such ambiguity is to equip the parser with a probability model allowing it to disregard certain unlikely search paths, thereby increasing both its efficiency and accuracy. The most successful parsing models are those extracted in a supervised fashion from labelled data in the form of a corpus of syntactic trees, known as a treebank. Constructing such a treebank from scratch for a different formalism is extremely time-consuming and expensive, however, and so the standard approach is to map the trees in an existing treebank into trees of the target formalism. Minimalist trees are considerably more complex than those of other formalisms, however, containing many more null heads and movement operations, making this conversion process far from trivial. 
This dissertation will describe a method which has so far been used to convert 56% of the Penn Treebank trees into MG trees. Although still under development, the resulting MGbank corpus has already been used to train a statistical A* MG parser, described here, which has an expected asymptotic time complexity of O(n^3); this is much better than even the most optimistic worst-case analysis for the formalism.

    Parameters of Cross-linguistic Variation in Expectation-based Minimalist Grammars (e-MGs)

    The fact that Parsing and Generation share the same grammatical knowledge is often considered the null hypothesis (Momma and Phillips 2018), but very few algorithms can take advantage of a cognitively plausible incremental procedure that operates roughly in the way words are produced and understood in real time. This is especially difficult if we consider cross-linguistic variation that has a clear impact on word order. In this paper, I present one such formalism, dubbed Expectation-based Minimalist Grammar (e-MG), that qualifies as a simplified version of the (Conflated) Minimalist Grammars, (C)MGs (Stabler 1997, 2011, 2013), and Phase-based Minimalist Grammars, PMGs (Chesi 2005, 2007; Stabler 2011). The crucial simplification consists of driving structure building only using lexically encoded categorial top-down expectations. The commitment to the top-down procedure (in e-MGs and PMGs, as opposed to (C)MGs) will be crucial to capture a relevant set of empirical asymmetries in a parameterized cross-linguistic perspective which represents the least common denominator of structure building in both Parsing and Generation.

    Geometric representations for minimalist grammars

    We reformulate minimalist grammars as partial functions on term algebras for strings and trees. Using filler/role bindings and tensor product representations, we construct homomorphisms for these data structures into geometric vector spaces. We prove that the structure-building functions as well as simple processors for minimalist languages can be realized by piecewise linear operators in representation space. We also propose harmony, i.e. the distance of an intermediate processing step from the final well-formed state in representation space, as a measure of processing complexity. Finally, we illustrate our findings by means of two particular arithmetic and fractal representations. Comment: 43 pages, 4 figures

    Great Bay Estuary Restoration Compendium

    Single-species approaches to natural resource conservation and management are now viewed as antiquated and oversimplified for dealing with complex systems. Scientists and managers who work in estuaries and other marine systems have urged adoption of ecosystem-based approaches to management for nearly a decade, yet practitioners are still struggling to translate the ideas into practice. Similarly, ecological restoration projects in coastal systems have typically addressed one species or habitat. In recent years, efforts to focus on multiple species and habitats have increased. Our project developed an integrated ecosystem approach to identify multi-habitat restoration opportunities in the Great Bay estuary, New Hampshire. We created a conceptual site selection model based on a comparison of historic and modern distribution and abundance data, current environmental conditions, and expert review. Restoration targets included oysters and softshell clams, salt marshes, eelgrass beds, and seven diadromous fish species. Spatial data showing the historical and present-day distributions for multiple species and habitats were compiled and integrated into a geographic information system. A matrix of habitat interactions was developed to identify potential for synergy and subsequent restoration efficiency. Output from the site selection models was considered within this framework to identify ecosystem restoration landscapes. The final products of these efforts include a series of maps detailing multi-habitat restoration opportunities extending from upland freshwater fish habitat down to the bay bottom. A companion guidance document was created to present project methods and a review of restoration methods. The authors hope that this work will help to stimulate and inform new restoration projects within the Great Bay estuarine system, and that it will serve as a foundation to be updated and improved as more information is collected.

    Relating Movement and Adjunction in Syntax and Semantics

    In this thesis I explore the syntactic and semantic properties of movement and adjunction in natural language, and suggest that these two phenomena are related in a novel way. In a precise sense, the basic pieces of grammatical machinery that give rise to movement, also give rise to adjunction. In the system I propose, there is no atomic movement operation and no atomic adjunction operation; the terms "movement" and "adjunction" serve only as convenient labels for certain combinations of other, primitive operations. As a result the system makes non-trivial predictions about how movement and adjunction should interact, since we do not have the freedom to stipulate arbitrary properties of movement while leaving the properties of adjunction unchanged, or vice-versa. I focus first on the distinction between arguments and adjuncts, and propose that the differences between these two kinds of syntactic attachment can be thought of as a transparent reflection of the differing ways in which they contribute to neo-Davidsonian logical forms. The details of this proposal rely crucially on a distinctive treatment of movement, and from it I derive accurate predictions concerning the equivocal status of adjuncts as optionally included in or excluded from a maximal projection, and the possibility of counter-cyclic adjunction. The treatment of movement and adjunction as interrelated phenomena furthermore enables us to introduce a single constraint that subsumes two conditions on extraction, namely adjunct island effects and freezing effects. The novel conceptions of movement and semantic composition that underlie these results raise questions about the system's ability to handle semantic variable-binding. I give an unconventional but descriptively adequate account of basic quantificational phenomena, to demonstrate that this important empirical ground is not given up. 
More generally, this thesis constitutes a case study in (i) deriving explanations for syntactic patterns from a restrictive, independently motivated theory of compositional semantics, and (ii) using a computationally explicit framework to rigorously investigate the primitives and consequences of our theories. The emerging picture that is suggested is one where some central facts about the syntax and semantics of natural language hang together in a way that they otherwise would not.

    Martian Gully Formation and Evolution: Studies From the Local to Global Scale

    Gullies in the mid- and high-latitudes of Mars were first observed in Mars Global Surveyor (MGS) Mars Orbiter Camera (MOC) images in 1997. Appearing to be geologically young, they quickly became a feature of interest due to the implication of liquid water in their formation based on distinct morphological characteristics including incised channels, many exhibiting features indicative of fluid flow. However, the temperature and pressure conditions on the surface of Mars during its most recent geologic era have not been conducive to sustaining water in the liquid phase for extended periods of time; therefore, a number of “wet” (water-related) and “dry” (driven by CO2 gas or granular flow) gully formation mechanisms have been proposed. The goal of this thesis is to conduct a large-scale study of gullies on Mars in order to determine how they are likely to have formed and evolved. I begin with a comprehensive global inventory of martian gullies to determine how their geographic distribution correlates with the effects of past and present climate conditions based on recent models, as well as thermophysical properties of the surface. Then I move to a regional focus in Utopia Planitia in Mars’ northern mid-latitudes, using gullies as a stratigraphic marker for the relative timing of formation of other mid-latitude landforms found in the region. Lastly, I take a localized approach within Gasa Crater, a particularly active gully site in the southern mid-latitudes, to investigate methods of looking for recent changes in martian gullies.