104,020 research outputs found

    Dlùth is Inneach: Linguistic and Institutional Foundations for Gaelic Corpus Planning

    This report presents the results of a one-year research project, commissioned by Bòrd na Gàidhlig (BnG) and carried out by a Soillse Research team, whose goal was to answer the following question: What corpus planning principles are appropriate for the strengthening and promotion of Scottish Gaelic, and what effective coordination would result in their implementation? This report contains the following agreed outcomes: a clear and consistent linguistic foundation for Gaelic corpus planning, according with Bòrd na Gàidhlig’s acquisition, usage and status planning initiatives, and most likely to be supported by Gaelic users; a programme of priorities to be addressed by Gaelic corpus planning; and recommendations on a means of coordination that will be effective in terms of cost and management (i.e. an institutional framework

    Alexandria: Extensible Framework for Rapid Exploration of Social Media

    The Alexandria system under development at IBM Research provides an extensible framework and platform for supporting a variety of big-data analytics and visualizations. The system is currently focused on enabling rapid exploration of text-based social media data. The system provides tools to help with constructing "domain models" (i.e., families of keywords and extractors to enable focus on tweets and other social media documents relevant to a project), to rapidly extract and segment the relevant social media and its authors, to apply further analytics (such as finding trends and anomalous terms), and to visualize the results. The system architecture is centered around a variety of REST-based service APIs to enable flexible orchestration of the system capabilities; these are especially useful to support knowledge-worker driven iterative exploration of social phenomena. The architecture also enables rapid integration of Alexandria capabilities with other social media analytics systems, as has been demonstrated through an integration with IBM Research's SystemG. This paper describes a prototypical usage scenario for Alexandria, along with the architecture and key underlying analytics. Comment: 8 page

    Usage Effects on the Cognitive Routinization of Chinese Resultative Verbs

    The present study adopts a corpus-oriented usage-based approach to the grammar of Chinese resultative verbs. Zooming in on a specific class of V-kai constructions, this paper aims to elucidate the effect of frequency in actual usage events on shaping the linguistic representations of resultative verbs. Specifically, it will be argued that while high token frequency results in more lexicalized V-kai complex verbs, high type frequency gives rise to more schematized V-kai constructions. The routinized patterns pertinent to V-kai resultative verbs, varying in their extent of specificity and generality, accordingly serve as a representative illustration of the continuum between lexicon and grammar that characterizes a usage-based conception of language

    Variability, negative evidence, and the acquisition of verb argument constructions

    We present a hierarchical Bayesian framework for modeling the acquisition of verb argument constructions. It embodies a domain-general approach to learning higher-level knowledge in the form of inductive constraints (or overhypotheses), and has been used to explain other aspects of language development such as the shape bias in learning object names. Here, we demonstrate that the same model captures several phenomena in the acquisition of verb constructions. Our model, like adults in a series of artificial language learning experiments, makes inferences about the distributional statistics of verbs on several levels of abstraction simultaneously. It also produces the qualitative learning patterns displayed by children over the time course of acquisition. These results suggest that the patterns of generalization observed in both children and adults could emerge from basic assumptions about the nature of learning. They also provide an example of a broad class of computational approaches that can resolve Baker's Paradox

    Frequency vs. Association for Constraint Selection in Usage-Based Construction Grammar

    A usage-based Construction Grammar (CxG) posits that slot-constraints generalize from common exemplar constructions. But what is the best model of constraint generalization? This paper evaluates competing frequency-based and association-based models across eight languages using a metric derived from the Minimum Description Length paradigm. The experiments show that association-based models produce better generalizations across all languages by a significant margin