    Morphological paradigms in language processing and language disorders

    We present results from two cross-modal morphological priming experiments investigating regular person and number inflection on finite verbs in German. We found asymmetries in the priming patterns between different affixes that can be predicted from the structure of the paradigm. We also report data from language disorders which indicate that inflectional errors produced by language-impaired adults and children tend to occur within a given paradigm dimension, rather than randomly across the paradigm. We conclude that morphological paradigms are used by the human language processor and can be systematically affected in language disorders

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field
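
As a rough illustration of the first of the three matrix classes named above (term-document), the following minimal sketch builds a raw-count term-document matrix and compares documents by cosine similarity. The function names and the toy documents are assumptions made for this example, not taken from the survey, and the survey's own weighting and smoothing steps are not reproduced here.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term i in document j.

    Tokenization is a plain lowercase whitespace split (an assumption of
    this sketch; real systems normalize text far more carefully).
    """
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def document_vector(matrix, j):
    """Extract column j, i.e. the vector representing document j."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    docs = ["the cat sat", "the cat ran", "dogs bark loudly"]
    vocab, matrix = term_document_matrix(docs)
    d0, d1, d2 = (document_vector(matrix, j) for j in range(3))
    print(vocab)
    print(cosine(d0, d1), cosine(d0, d2))
```

Reading the same matrix row by row instead of column by column gives a vector per term, which is one simple way to move from document similarity towards word similarity.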

    Lexical typology : a programmatic sketch

    The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar.

    Sharing Cultural Heritage: the Clavius on the Web Project

    In the last few years the number of manuscripts digitized and made available on the Web has been constantly increasing. However, there is still a considerable lack of results concerning both making their content explicit and the tools developed to make it available. The objective of the Clavius on the Web project is to develop a Web platform exposing a selection of Christophorus Clavius's letters along with three different levels of analysis: linguistic, lexical and semantic. The multilayered annotation of the corpus involves an XML-TEI encoding followed by a tokenization step where each token is uniquely identified through a CTS URN notation and then associated with a part of speech and a lemma. The text is lexically and semantically annotated on the basis of a lexicon and a domain ontology, the former structuring the most relevant terms occurring in the text and the latter representing the domain entities of interest (e.g. people, places, etc.). Moreover, each entity is connected to linked and non-linked resources, including DBpedia and VIAF. Finally, the results of the three layers of analysis are gathered and shown through interactive visualization and storytelling techniques. A demo version of the integrated architecture was developed.

    Ontologies and Information Extraction

    This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of texts are interpreted with respect to a predefined partial domain model. This report shows that depending on the nature and the depth of the interpretation to be done for extracting the information, more or less knowledge must be involved. This report is mainly illustrated in biology, a domain in which there are critical needs for content-based exploration of the scientific literature and which becomes a major application domain for IE

    Evaluating phonological status : significance of paradigm uniformity vs. prosodic group effects

    A central concern of linguistic phonetics is to define criteria for determining the phonological status of sounds or sound properties observed in phonetic surface form. Based on acoustic measurements we show that the occurrence of syllabic sonorants vs. schwa-sonorant sequences in German is determined exclusively by segmental and prosodic structure, with no paradigm uniformity effects. We argue that these findings are consistent with a uniform representation of syllabic sonorants as schwa-sonorant sequences in the lexicon. The stability of schwa in CVC-suffixes (e.g. the German diminutive suffix -chen), as opposed to its phonetic absence in a segmentally comparable underived context, is argued to be conditioned by the prosodic organisation of such suffixes external to the phonological word of the stem.

    Effects of Lexical Class and Word Frequency on the L1 and L2 English-Based Lexical Connections

    Three groups of participants—L1 speakers of English, L2 advanced, and intermediate users of English—responded in writing to a word association test containing words balanced for lexical class (nouns, verbs, adjectives) and frequency of occurrence (high, mid, low). The questions addressed in the study concerned the way two word-related factors (i.e., lexical category and word frequency) interplayed with two learner-related characteristics (i.e., proficiency and word familiarity) and influenced 1) the participants’ knowledge of vocabulary, 2) their preference to build specific types of lexical connections among the words they know, and 3) their ability to maintain networks of associations as an indicator of the connectivity of their lexicons. The findings revealed a complex picture of interactions between the word-related and learner-related factors but, whenever the effects of the variables could be disentangled, proficiency and lexical class had a stronger influence on the organization of the L1 and L2 lexicons than word frequency alone

    Implanting Rational Knowledge into Distributed Representation at Morpheme Level

    Previously, researchers paid no attention to the creation of unambiguous morpheme embeddings independent from the corpus, while such information plays an important role in expressing the exact meanings of words for parataxis languages like Chinese. In this paper, after constructing the Chinese lexical and semantic ontology based on word-formation, we propose a novel approach to implanting the structured rational knowledge into distributed representation at morpheme level, naturally avoiding heavy disambiguation in the corpus. We design a template to create the instances as pseudo-sentences merely from the pieces of knowledge of morphemes built in the lexicon. To exploit hierarchical information and tackle the data sparseness problem, the instance proliferation technique is applied based on similarity to expand the collection of pseudo-sentences. The distributed representation for morphemes can then be trained on these pseudo-sentences using word2vec. For evaluation, we validate the paradigmatic and syntagmatic relations of morpheme embeddings, and apply the obtained embeddings to word similarity measurement, achieving significant improvements over the classical models by more than 5 Spearman scores or 8 percentage points, which shows very promising prospects for adoption of the new source of knowledge. Comment: AAAI 201
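
The word similarity evaluation mentioned above is reported in Spearman scores. As a hedged sketch of how such a score is typically obtained, the code below computes Spearman's rank correlation between human similarity ratings and model similarity scores for the same word pairs. The helper names and the sample numbers are hypothetical; the paper's actual datasets and evaluation script are not shown in the abstract.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical ratings for four word pairs, for illustration only.
    human = [9.1, 7.4, 3.2, 1.0]
    model = [0.83, 0.65, 0.40, 0.12]
    print(spearman(human, model))
```

Only the ordering of the scores matters here, so a model's similarity values do not need to be on the same scale as the human ratings.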