1,748 research outputs found

    The role of word frequency and morpho-orthography in agreement processing

    Agreement attraction in comprehension (when an ungrammatical verb is read more quickly if preceded by a feature-matching local noun) is well described by a cue-based retrieval framework. This suggests a role for lexical retrieval in attraction. To examine this, we manipulated two probabilistic factors known to affect lexical retrieval in a self-paced reading study: local noun word frequency and morpho-orthography (agreement morphology realised with or without –s endings). Noun number and word frequency affected noun and verb region reading times, with higher-frequency words not eliciting attraction. Morpho-orthography impacted verb processing but not attraction: atypical plurals led to slower verb reading times regardless of verb number. Exploratory individual-difference analyses further underscore the importance of lexical retrieval dynamics in sentence processing. This provides evidence that agreement operates via a cue-based retrieval mechanism over lexical representations that vary in their strength and association with number features.
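
    As a rough, hedged illustration of the cue-based retrieval idea (our own toy sketch, not the authors' model; the sentence, feature names and scoring are invented), the Haskell fragment below scores candidate nouns against the retrieval cues set at the verb; attraction arises because a feature-matching local noun can partially match those cues just as well as the true subject.

        -- Toy cue-based retrieval (illustrative only, not the paper's model).
        -- At the verb, retrieval cues such as [+subject] and [+plural] are matched
        -- against every noun held in memory; nouns matching only some cues can
        -- still be retrieved under noise, which is one account of attraction.
        data Noun = Noun { form :: String, isSubject :: Bool, isPlural :: Bool }

        -- How many of the verb's retrieval cues a candidate matches. Word frequency
        -- could further weight this score (stronger lexical representations being
        -- retrieved more reliably), in line with the abstract's proposal.
        cueMatch :: Bool -> Bool -> Noun -> Int
        cueMatch wantSubj wantPlur n =
          fromEnum (isSubject n == wantSubj) + fromEnum (isPlural n == wantPlur)

        -- "The key to the cabinets were ...": a plural verb sets the cues
        -- [+subject, +plural]; the subject "key" and the attractor "cabinets"
        -- each match one cue, so noisy retrieval sometimes picks the attractor.
        main :: IO ()
        main = mapM_ report [Noun "key" True False, Noun "cabinets" False True]
          where
            report n = putStrLn (form n ++ " matches "
                                 ++ show (cueMatch True True n) ++ " cue(s)")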

    Polyglot Semantic Parsing in APIs

    Traditional approaches to semantic parsing (SP) work by training individual models for each available parallel dataset of text-meaning pairs. In this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. In particular, we focus on translating text to code signature representations using the software component datasets of Richardson and Kuhn (2017a,b). The advantage of such models is that they can be used for parsing a wide variety of input natural languages and output programming languages, or mixed input languages, using a single unified model. To facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and apply this method to two other benchmark SP tasks. Comment: accepted for NAACL-2018 (camera-ready version).
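
    To make the text-to-signature setting concrete, here is a hedged toy sketch in Haskell (the record fields and example pairs are invented for exposition and are not taken from the Richardson and Kuhn datasets): each item pairs a natural-language description with a target code signature, and a polyglot model is simply a single model trained over the union of such pairs across input and output languages.

        -- Toy view of the polyglot text-to-signature task (illustrative only).
        data Example = Example
          { sourceLang  :: String   -- natural language of the description
          , targetLang  :: String   -- programming language of the signature
          , description :: String
          , signature   :: String
          }

        -- Invented examples in the spirit of API-documentation datasets,
        -- mixing natural and programming languages in one collection.
        corpus :: [Example]
        corpus =
          [ Example "en" "haskell" "return the length of a list" "length :: [a] -> Int"
          , Example "de" "java"    "liefert den absoluten Betrag eines Wertes" "static int abs(int a)"
          ]

        -- A real system learns this mapping; here we merely look it up.
        parse :: String -> [String]
        parse query = [ signature e | e <- corpus, description e == query ]

        main :: IO ()
        main = print (parse "return the length of a list")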

    Functional Baby Talk: Analysis of Code Fragments from Novice Haskell Programmers

    What kinds of mistakes are made by novice Haskell developers as they learn about functional programming? Is it possible to analyze these errors in order to improve the pedagogy of Haskell? In 2016, we delivered a massive open online course which featured an interactive code evaluation environment. We captured and analyzed 161K interactions from learners. We report typical novice developer behavior; for instance, the mean time spent on an interactive tutorial is around eight minutes. Although our environment was restricted, we gained some understanding of Haskell novice errors. Parenthesis mismatches, lexical scoping errors and do-block misunderstandings are common. Finally, we make recommendations about how such beginner code evaluation environments might be enhanced.
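
    As an illustration of one of the error classes named above (our own example, not one of the captured learner interactions), a common do-block misunderstanding is to treat an IO action as if it were already its result:

        -- A typical novice do-block mistake: using getLine where its String
        -- result is needed. The commented-out version does not type-check:
        --
        --   greet = do
        --     putStrLn ("Hello, " ++ getLine)   -- error: getLine :: IO String
        --
        -- The fix is to bind the action's result with <- before using it.
        greet :: IO ()
        greet = do
          name <- getLine
          putStrLn ("Hello, " ++ name)

        main :: IO ()
        main = greet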

    Improving Hypernymy Extraction with Distributional Semantic Classes

    In this paper, we show how distributionally induced semantic classes can be helpful for extracting hypernyms. We present methods for inducing sense-aware semantic classes using distributional semantics and using these induced semantic classes for filtering noisy hypernymy relations. Denoising of hypernyms is performed by labeling each semantic class with its hypernyms. On the one hand, this allows us to filter out wrong extractions using the global structure of distributionally similar senses. On the other hand, we infer missing hypernyms via label propagation to cluster terms. We conduct a large-scale crowdsourcing study showing that processing of automatically extracted hypernyms using our approach improves the quality of the hypernymy extraction in terms of both precision and recall. Furthermore, we show the utility of our method in the domain taxonomy induction task, achieving state-of-the-art results on a SemEval'16 task on taxonomy induction. Comment: In Proceedings of the 11th Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan.
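
    A minimal sketch of the filtering intuition, on our own invented example (this is not the authors' code): label a distributionally induced semantic class with the hypernym most frequently proposed for its members, discard candidates that disagree with that label, and let members lacking the label inherit it.

        import Data.List (group, sort, sortOn)
        import Data.Ord (Down (..))

        -- Toy class-based hypernym denoising (illustrative only).
        type Term = String
        type Hypernym = String

        -- Noisy extracted hypernym candidates for the members of one
        -- distributionally induced class (invented data).
        candidates :: [(Term, [Hypernym])]
        candidates =
          [ ("apple",  ["fruit", "company"])
          , ("mango",  ["fruit"])
          , ("cherry", ["fruit", "color"])
          ]

        -- The class label is the hypernym proposed most often for the class.
        classLabel :: [(Term, [Hypernym])] -> Hypernym
        classLabel cls =
          head . head . sortOn (Down . length) . group . sort $ concatMap snd cls

        -- Every member keeps (or inherits) the class label as its hypernym,
        -- filtering out "company" and "color" while adding nothing spurious.
        denoised :: [(Term, Hypernym)]
        denoised = [ (t, classLabel candidates) | (t, _) <- candidates ]

        main :: IO ()
        main = mapM_ print denoised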

    The C++0x "Concepts" Effort

    C++0x is the working title for the revision of the ISO standard of the C++ programming language that was originally planned for release in 2009 but was delayed to 2011. The largest language extension in C++0x was "concepts", that is, a collection of features for constraining template parameters. In September of 2008, the C++ standards committee voted the concepts extension into C++0x, but then in July of 2009, the committee voted the concepts extension back out of C++0x. This article is my account of the technical challenges and debates within the "concepts" effort in the years 2003 to 2009. To provide some background, the article also describes the design space for constrained parametric polymorphism, or what is colloquially known as constrained generics. While this article is meant to be generally accessible, the writing is aimed toward readers with a background in functional programming and programming language theory. This article grew out of a lecture at the Spring School on Generic and Indexed Programming at the University of Oxford, March 2010.
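
    For readers coming from functional programming, a hedged Haskell analogue may help fix the idea of constrained generics (this is only an analogy, not code from the article): a type class constraint on a type variable plays roughly the role a C++0x concept was meant to play for a template parameter.

        -- Constrained parametric polymorphism, Haskell-style: the Eq constraint
        -- states up front which operations the generic code may use, so misuse
        -- is reported at the definition site rather than at each call site,
        -- which is the kind of guarantee concepts aimed to give C++ templates.
        elemOf :: Eq a => a -> [a] -> Bool
        elemOf _ []       = False
        elemOf x (y : ys) = x == y || elemOf x ys

        main :: IO ()
        main = print (elemOf 3 [1, 2, 3 :: Int])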

    Constraint Generation for the Jeeves Privacy Language

    Our goal is to present a complete semantic formalization of the Jeeves privacy language evaluation engine, based on the original Jeeves constraint semantics defined by Yang et al. at POPL 2012, but sufficiently strong to support a first complete implementation thereof. Specifically, we present and implement a syntactically and semantically complete concrete syntax for Jeeves that meets the example criteria given in the paper. We also present and implement the associated translation to J, here formulated as a complete and decompositional operational semantics. Finally, we present an enhanced, decompositional, non-substitutional operational semantic formulation and implementation of the J evaluation engine (the dynamic semantics) with privacy constraints. In particular, we show how the constraint handling can be defined as a monad, and evaluation as a monadic operation on the constraint environment. The implementations are all completed in Haskell, utilizing its almost one-to-one capability to transparently reflect the underlying semantic reasoning when formalized this way. In practice, we have applied the literate programming facility of Haskell to this report, a feature that enables the LaTeX source to also serve as the source code for the implementation (skipping the report parts as comment regions). The implementation is published as a GitHub project.
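
    The monadic phrasing mentioned above can be sketched as a state-passing monad over a constraint environment. This is a heavily simplified, hedged illustration of the general technique, not the report's actual J evaluator; the Eval type, the constrain helper and the example constraint string are invented.

        -- Minimal sketch: evaluation as a monadic operation over a constraint
        -- environment. Constraints are plain strings here; the real engine
        -- records logical constraints over labels and later resolves them
        -- against a policy.
        newtype Eval a = Eval { runEval :: [String] -> (a, [String]) }

        instance Functor Eval where
          fmap f (Eval g) = Eval (\cs -> let (x, cs') = g cs in (f x, cs'))

        instance Applicative Eval where
          pure x = Eval (\cs -> (x, cs))
          Eval f <*> Eval g = Eval (\cs ->
            let (h, cs')  = f cs
                (x, cs'') = g cs'
            in (h x, cs''))

        instance Monad Eval where
          Eval g >>= k = Eval (\cs ->
            let (x, cs') = g cs in runEval (k x) cs')

        -- Record a constraint discovered during evaluation.
        constrain :: String -> Eval ()
        constrain c = Eval (\cs -> ((), c : cs))

        -- A toy faceted value: evaluate to the public view while noting the
        -- condition under which the sensitive view would be visible.
        example :: Eval Int
        example = do
          constrain "viewer == owner ==> value = 42"
          pure 0                      -- public (low-confidentiality) view

        main :: IO ()
        main = print (runEval example [])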

    Finding The Lazy Programmer's Bugs

    Traditionally, developers and testers created huge numbers of explicit tests, enumerating interesting cases, perhaps biased by what they believed to be the current boundary conditions of the function being tested. Or at least, they were supposed to. A major step forward was the development of property testing. Property testing requires the user to write a few functional properties that are used to generate tests, and requires an external library or tool to create test data for the tests. As such, many thousands of tests can be created for a single property. For the purely functional programming language Haskell there are several such libraries; for example QuickCheck [CH00], SmallCheck and Lazy SmallCheck [RNL08]. Unfortunately, property testing still requires the user to write explicit properties. Fortunately, we note that there are already many implicit tests present in programs. Developers may throw assertion errors, or the compiler may silently insert runtime exceptions for incomplete pattern matches. We attempt to automate the testing process using these implicit tests. Our contributions are in four main areas: (1) we have developed algorithms to automatically infer appropriate constructors and functions needed to generate test data, without requiring additional programmer work or annotations; (2) to combine the constructors and functions into test expressions, we take advantage of Haskell's lazy evaluation semantics by applying the techniques of needed narrowing and lazy instantiation to guide generation; (3) we keep the type of test data at its most general, in order to prevent committing too early to monomorphic types that cause needless wasted tests; (4) we have developed novel ways of creating Haskell case expressions to inspect elements inside returned data structures, in order to discover exceptions that may be hidden by laziness, and to make our test data generation algorithm more expressive. In order to validate our claims, we have implemented these techniques in Irulan, a fully automatic tool for generating systematic black-box unit tests for Haskell library code. We have designed Irulan to generate high-coverage test suites and detect common programming errors in the process.
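
    To make the notion of an implicit test concrete (our own toy example, not code from Irulan): an incomplete pattern match already acts as a test oracle, because any automatically generated input that reaches the missing case raises the run-time exception the compiler silently inserted.

        import Control.Exception (SomeException, evaluate, try)

        -- The compiler completes this match with a run-time error, so the
        -- function carries an implicit test: any generated input that hits the
        -- missing [] case is a failing test, with no property written by hand.
        firstWord :: String -> String
        firstWord s = case words s of
          (w : _) -> w
          -- missing: the [] case (e.g. for the empty string)

        -- A crude stand-in for an automatic test driver: apply the function to
        -- generated inputs and report which of them expose the hidden exception.
        probe :: String -> IO ()
        probe input = do
          r <- try (evaluate (length (firstWord input))) :: IO (Either SomeException Int)
          putStrLn $ case r of
            Left _  -> "input " ++ show input ++ " exposes the missing case"
            Right _ -> "input " ++ show input ++ " is fine"

        main :: IO ()
        main = mapM_ probe ["hello world", ""]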

    Inflectional morphology and compounding in English: a single route, associative memory based account

    Native English speakers include irregular plurals in English compounds (e.g., mice chaser) more frequently than regular plurals (e.g., *rats chaser) (Gordon, 1985). This dissociation in inflectional morphology has been argued to stem from an internal and innate morphological constraint, as it is thought that the input to which English-speaking children are exposed is insufficient to signal that regular plurals are prohibited in compounds but irregulars might be allowed (Marcus, Brinkmann, Clahsen, Wiese & Pinker, 1995). In addition, this dissociation in English compounds has been invoked to support the idea that regular and irregular morphology are mediated by separate cognitive systems (Pinker, 1999). It is argued in this thesis, however, that the constraint on English compounds can be derived from the general frequencies and patterns in which the two types of plural (regular and irregular) and the possessive morpheme occur in the input. In English, both plurality (on regular nouns) and possession are denoted by a [-s] morpheme. It is argued that the constraint on the use of plurals in English compounds occurs because of competition between these two identical morphemes. Regular plurals are excluded before a second noun because the pattern noun + [-s] morpheme + noun is reserved for marking possession in English. Irregular plurals do not end in the [-s] morpheme and as such do not compete with the possessive marker, and consequently may be optionally included in compounds. Interestingly, plurals are allowed in compounds in other languages where this competitive relationship does not exist, e.g. Dutch (Schreuder, Neijt, van der Weide & Baayen, 1998) and French (Murphy, 2000). As well as not being in competition with the possessive structure, irregular plurals also occur relatively infrequently in the input compared to regular plurals. This imbalance between the frequency of regular and irregular plurals in compounds also affects the way the two types of plural are treated in compounds. Thus there is no need for an innate mechanism to explain the treatment of plurals in English compounds. There is enough evidence available in the input to constrain the formation of compound words in English.