12 research outputs found

    Lexical Licensing in Formal Grammar

    This paper discusses instances of restricted combinability of lexical items (words and multi-word units) with their contexts. Different subtypes of distributional idiosyncrasies are presented, which occur on the phonological, morpho-syntactical, and semantic levels. Notably, external sandhi, cranberry words, decomposable idioms and (idiosyncratic) polarity items are addressed. These phenomena reveal an interesting interplay with regular language as well as between the different levels themselves. A detailed lexicalist analysis is provided within a formal grammar framework, Head-Driven Phrase Structure Grammar. This approach motivates an architecture of grammar that includes a module to accommodate specific restrictions on the occurrence environment of a lexical unit

    Spotting, collecting and documenting negative polarity items

    As the nature of negative polarity items (NPIs) and their licensing contexts is still under much debate, a broad empirical basis is an important cornerstone to support further insights in this area of research. The work discussed in this paper is intended as a contribution to realizing this objective. The authors briefly introduce the phenomenon of NPIs and outline major theories about their licensing, as well as various licensing contexts, before discussing their two major topics. First, a corpus-based retrieval method for NPI candidates is described that ranks the candidates according to their distributional dependence on the licensing contexts. The method extracts single-word candidates and is extended to also capture multi-word candidates. The basic idea for automatically collecting NPI candidates from a large corpus is that an NPI behaves like a kind of collocate to its licensing contexts. Manual inspection and interpretation of the candidate lists identify the actual NPIs. Second, an online repository for NPIs and other items that show distributional idiosyncrasies is presented, which offers an empirical database for further (theoretical) research on these items in a sustainable way
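
The ranking idea in the abstract above (an NPI behaves like a collocate of its licensing contexts) can be illustrated with a minimal sketch. The abstract does not state which association measure or licensor inventory the authors used, so everything here is an assumption: the `LICENSORS` set, the function name `rank_npi_candidates`, the `min_freq` threshold, and the score itself, which is simply the share of a word's sentences that also contain a licensor.

```python
from collections import Counter

# Hypothetical licensor list; the abstract does not enumerate the licensing contexts.
LICENSORS = {"not", "never", "nobody", "no"}


def rank_npi_candidates(sentences, licensors=LICENSORS, min_freq=2):
    """Rank words by the share of their sentences that contain a licensor.

    This is a simple stand-in score, not the measure used in the paper.
    """
    licensors = set(licensors)
    total = Counter()     # number of sentences each word occurs in
    licensed = Counter()  # number of those sentences that contain a licensor
    for sentence in sentences:
        tokens = [t.lower() for t in sentence.split()]
        has_licensor = any(t in licensors for t in tokens)
        # Count each word once per sentence and skip the licensors themselves.
        for word in set(tokens) - licensors:
            total[word] += 1
            if has_licensor:
                licensed[word] += 1
    # Drop rare words, whose ratios are unreliable.
    scores = {w: licensed[w] / total[w] for w in total if total[w] >= min_freq}
    # Highest ratio first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


toy_corpus = [
    "he did not lift a finger",
    "nobody would lift a finger",
    "he did lift the box",
    "he did not see the box",
]
ranked = rank_npi_candidates(toy_corpus)
```

On this toy corpus, words that occur only in licensed sentences (here "a" and "finger") rise to the top of the list, which mirrors the abstract's point that the candidate list still needs manual inspection to separate actual NPIs from noise.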

    Proceedings of the Workshop on Negation and Polarity

    This volume contains the papers presented at the Workshop on Negation and Polarity, held in Tübingen, March 8 – 10, 2007. They focus on the syntax, semantics, and pragmatics of negation and polarity items. Both topics have been central to linguistic study in the last few decades. The reason for this is that these phenomena are to some extent universal: Every language has some mode to express negation and some set of lexical elements or (idiomatic) expressions that can only be felicitously uttered in negative contexts. However, languages exhibit strong differences with respect to the way this is executed. Hence, the study of negation and polarity phenomena requires on the one hand in-depth studies of the syntax, semantics, and pragmatics in particular languages, whereas on the other hand typological research of cross-linguistic differences is to be carried out. Especially the latter involves the application of linguistic database systems to collect and categorize data, observed in either the literature or during fieldwork. These proceedings not only contain a rich collection of different investigations on the above-mentioned phenomena, but also represent what is currently going on in the process of obtaining a better understanding of negation and polarity and therefore provide a proper overview of the state of the art in this branch of linguistics and philosophy

    Requirements of a user-friendly, general-purpose corpus query interface

    This article reports on a survey that was conducted among 16 projects of a collaborative research centre to learn about the requirements of a web-based corpus query interface. This interface is to be created for a collection of corpora that are heterogeneous with respect to their languages, levels of annotation, and their users' research interests. Based on the survey and a comparison of three existing corpus query interfaces, we compiled a set of requirements. In the context of sustainable strategies of corpus storage and accessibility, we point out how to design an interface that is general enough to cover multiple corpora and at the same time suitable for a wide range of users

    Cranberry Expressions in English and in German

    The authors describe two data sets submitted to the database of MWE evaluation resources: (1) cranberry expressions in English and (2) cranberry expressions in German. The first package contains a collection of 444 cranberry words in German (CWde.txt) and a collection of the corresponding cranberry expressions (CCde.txt). The second package consists of a collection of 77 cranberry words in English (CWen.txt) and a collection of the corresponding cranberry expressions (CCen.txt). The data included in these packages was extracted from the Collection of Distributionally Idiosyncratic Items (CoDII), an electronic linguistic resource of lexical items with idiosyncratic occurrence patterns. Each package contains a readme file, and can be downloaded from multiword.wiki.sourceforge.net/Resources

    On Idiom Parts and their Contexts

    This article examines idiomatic expressions as sources of both regularity and irregularity in language. Some morphological, lexical, syntactical, and semantical characteristics of idioms are discussed. It is shown how a lexical licensing mechanism, which is formulated within a formal grammar framework, can deal with the data. After that, this proposal is extended to the phenomenon of negative polarity

    A Multilingual Database of Polarity Items

    This paper presents three electronic collections of polarity items: (i) negative polarity items in Romanian, (ii) negative polarity items in German, and (iii) positive polarity items in German. The presented collections are a part of a linguistic resource on lexical units with highly idiosyncratic occurrence patterns. The motivation for collecting and documenting polarity items was to provide a solid empirical basis for linguistic investigations of these expressions. Our database provides general information about the collected items, specifies their syntactic properties, and describes the environment that licenses a given item. For each licensing context, examples from various corpora and the Internet are introduced. Finally, the type of polarity (negative or positive) and the class (superstrong, strong, weak or open) associated with a given item are specified. Our database is encoded in XML and is available via the Internet, offering dynamic and flexible access

    A Multilingual Electronic Database of Distributionally Idiosyncratic Items

    The authors present a multilingual electronic database of lexical items with idiosyncratic occurrence patterns. Currently, our database consists of: (1) a collection of 444 bound words in German; (2) a collection of 77 bound words in English; (3) a collection of 58 negative polarity items in Romanian; (4) a collection of 84 negative polarity items in German; and (5) a collection of 52 positive polarity items in German. The database is encoded in XML and is available via the Internet, offering dynamic and flexible access