8,621 research outputs found

    A testsuite for testing parser performance on complex German grammatical constructions [TePaCoC - a corpus for testing parser performance on complex German grammatical constructions]

    Traditionally, parsers are evaluated against gold standard test data. This can cause problems if there is a mismatch between the data structures and representations used by the parser and those of the gold standard. A particular case in point is German, for which two treebanks (TiGer and TüBa-D/Z) are available with highly different annotation schemes for the acquisition of (e.g.) PCFG parsers. The differences between the TiGer and TüBa-D/Z annotation schemes make fair and unbiased parser evaluation difficult [7, 9, 12]. The resource (TEPACOC) presented in this paper takes a different approach to parser evaluation: instead of providing evaluation data in a single annotation scheme, TEPACOC uses comparable sentences and their annotations for 5 selected key grammatical phenomena (20 sentences per phenomenon) from both the TiGer and TüBa-D/Z resources. This provides a 2 × 100 sentence comparable testsuite which allows us to evaluate TiGer-trained parsers against the TiGer part of TEPACOC, and TüBa-D/Z-trained parsers against the TüBa-D/Z part of TEPACOC for key phenomena, instead of comparing them against a single (and potentially biased) gold standard. To overcome the problem of inconsistency in human evaluation and to bridge the gap between the two different annotation schemes, we provide an extensive error classification, which enables us to compare parser output across the two different treebanks. In the remaining part of the paper we present the testsuite and describe the grammatical phenomena covered in the data. We discuss the different annotation strategies used in the two treebanks to encode these phenomena and present our error classification of potential parser errors.

    Developmental Stages of Perception and Language Acquisition in a Perceptually Grounded Robot

    The objective of this research is to develop a system for language learning based on a minimum of pre-wired language-specific functionality, that is compatible with observations of perceptual and language capabilities in the human developmental trajectory. In the proposed system, meaning (in terms of descriptions of events and spatial relations) is extracted from video images based on detection of position, motion, physical contact and their parameters. Mapping of sentence form to meaning is performed by learning grammatical constructions that are retrieved from a construction inventory based on the constellation of closed class items uniquely identifying the target sentence structure. The resulting system displays robust acquisition behavior that reproduces certain observations from developmental studies, with very modest “innate” language specificity

    An agent-based model studying the acquisition of a language system of logical constructions

    This paper presents an agent-based model that studies the emergence and evolution of a language system of logical constructions, i.e. a vocabulary and a set of grammatical constructions that allows the expression of logical combinations of categories. The model assumes the agents have a common vocabulary for basic categories, the ability to construct logical combinations of categories using Boolean functions, and some general-purpose cognitive capacities for invention, adoption, induction and adaptation. It does not, however, assume that the agents have a vocabulary for Boolean functions or grammatical constructions for expressing such logical combinations of categories through language. The results of the experiments we have performed show that a language system of logical constructions emerges through self-organisation of the individual agents' interactions when these agents adapt their preferences for vocabulary and grammatical constructions to those they observe being used more often by the rest of the population, and that such a language system is transmitted from one generation to the next.

    Vector spaces for historical linguistics : using distributional semantics to study syntactic productivity in diachrony

    This paper describes an application of distributional semantics to the study of syntactic productivity in diachrony, i.e., the property of grammatical constructions to attract new lexical items over time. By providing an empirical measure of semantic similarity between words derived from lexical co-occurrences, distributional semantics not only reliably captures how the verbs in the distribution of a construction are related, but also enables the use of visualization techniques and statistical modeling to analyze the semantic development of a construction over time and identify the semantic determinants of syntactic productivity in naturally occurring data

    Tonal Sandhi in the Yoruba Language

    This paper looks at a phonological phenomenon called tonal sandhi, which is seen as an interaction among tones, especially in register tone languages like Yoruba. The work examines various grammatical constructions where this linguistic feature usually occurs. It is observed that tones do not change or displace one another when words appear in isolation. Such interaction, which often leads to displacement, is noticeable in speech within some grammatical constructions. We also find that vowel elision leads to tone movement (since tone operates on a different tier, it does not get elided with the vowel). This in turn leads to tone displacement. Keywords: tones, perturbation of tone or tonal sandhi, tone placement and tone assimilation, register tone language, tone elision, floating tone

    Raising Teacher's Grammatical Consciousness on English Medio-passive Constructions

    Some researchers have reported that EFL learners' ability to understand and use the tense, aspect, and voice of English at the English Department of Universitas Negeri Padang is not yet academically satisfactory. Most EFL learners in the English Education department have not reached the “expected” ability to understand and use appropriate grammatical constructions in both writing and speaking. This condition may have negative effects on the success of EFL learning in Indonesia. It seems that learners' and teachers' grammatical consciousness in EFL should be raised academically and practically so that they attain basic and better competency standards. One stylistic clause construction in English, called the medio-passive, is not yet well known to many teachers and learners of English in Indonesia. This paper briefly discusses how authentic materials may psychologically and academically raise grammatical consciousness of medio-passive constructions as part of the competency standards in EFL. Keywords: medio-passive, grammatical consciousness, authentic materials, competency standard

    A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces

    We study semantic construal in grammatical constructions using large language models. First, we project contextual word embeddings into three interpretable semantic spaces, each defined by a different set of psycholinguistic feature norms. We validate these interpretable spaces and then use them to automatically derive semantic characterizations of lexical items in two grammatical constructions: nouns in subject or object position within the same sentence, and the AANN construction (e.g., 'a beautiful three days'). We show that a word in subject position is interpreted as more agentive than the very same word in object position, and that the nouns in the AANN construction are interpreted as more measurement-like than when in the canonical alternation. Our method can probe the distributional meaning of syntactic constructions at a templatic level, abstracted away from specific lexemes

    Developmental inventories using illiterate parents as informants: Communicative Development Inventory (CDI) adaptation for two Kenyan languages

    Communicative Development Inventories (CDIs, parent-completed language development checklists) are a helpful tool to assess language in children who are unused to interaction with unfamiliar adults. Generally, CDIs are completed in written form, but in developing country settings parents may have insufficient literacy to complete them alone. We designed CDIs to assess language development in children aged 0;8 to 2;4 in two languages used in Coastal communities in Kenya. Measures of vocabulary, gestures, and grammatical constructions were developed using both interviews with parents from varying backgrounds, and vocabulary as well as grammatical constructions from recordings of children's spontaneous speech. The CDIs were then administered in interview format to over 300 families. Reliability and validity ranged from acceptable to excellent, supporting the use of CDIs when direct language testing is impractical, even when children have multiple caregivers and where respondents have low literacy levels

    Predicting language learners' grades in the L1, L2, L3 and L4: the effect of some psychological and sociocognitive variables

    This study of 89 Flemish high-school students' grades for L1 (Dutch), L2 (French), L3 (English) and L4 (German) investigates the effects of three higher-level personality dimensions (psychoticism, extraversion, neuroticism), one lower-level personality dimension (foreign language anxiety) and sociobiographical variables (gender, social class) on the participants' language grades. Analyses of variance revealed no significant effects of the higher-level personality dimensions on grades. Participants with high levels of foreign language anxiety obtained significantly lower grades in the L2 and L3. Gender and social class had no effect. Strong positive correlations between grades in the different languages could point to an underlying sociocognitive dimension. The implications of these findings are discussed