237 research outputs found

    A statistical method for the identification and aggregation of regional linguistic variation

    This paper introduces a method for the analysis of regional linguistic variation. The method identifies individual and common patterns of spatial clustering in a set of linguistic variables measured over a set of locations, based on a combination of three statistical techniques: spatial autocorrelation, factor analysis, and cluster analysis. To demonstrate how the method is applied, it is used to analyze regional variation in the values of 40 continuously measured, high-frequency lexical alternation variables in a 26-million-word corpus of letters to the editor representing 206 cities from across the United States.
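
    As an illustration of the first of the three techniques, the sketch below computes global Moran's I, one common spatial autocorrelation statistic. The abstract does not say which statistic the paper uses, so the choice of Moran's I, the function name `morans_i`, the binary neighbour weights, and the toy values are all assumptions made for this example.

```python
def morans_i(values, weights):
    """Global Moran's I for one variable measured at n locations.

    values  -- list of n measurements (e.g. relative frequency of a variant)
    weights -- n x n spatial weights matrix (weights[i][j] > 0 if i and j
               are treated as neighbours; the diagonal should be 0)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Sum of all spatial weights.
    w_sum = sum(sum(row) for row in weights)

    # Cross-products of deviations for neighbouring locations.
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    # Total variation in the variable.
    den = sum(d * d for d in dev)

    return (n / w_sum) * (num / den)


# Hypothetical data: four locations on a line, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1, 1, 5, 5]    # similar values sit next to each other
alternating = [1, 5, 1, 5]  # neighbours always differ

print(morans_i(clustered, chain))    # positive: spatial clustering
print(morans_i(alternating, chain))  # negative: spatial dispersion
```

    Values above zero indicate that neighbouring locations tend to have similar values, which is the kind of regional pattern the method looks for before the factor and cluster analysis steps.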

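    Er als accessibility marker: on- en offline evidentie voor een procedurele duiding van presentatieve zinnen ('Er' as accessibility marker: online and offline evidence for a procedural interpretation of presentative sentences)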

    This paper elaborates on offline and online research into the linguistic status of non-anaphoric 'er' "there" in presentative sentences with a preposed adjunct (e.g. Op de hoek van de straat is (er) een winkel). In a first study (Grondelaers & Brysbaert 1996), we used corpus materials and self-paced reading data to demonstrate that there is a positive correlation between the presence of 'er' in adjunct sentences and the spatial and discursive situating potential of the preposed locative adjunct: the preference for 'er' in such sentences increases as the locative search precision of the adjunct and the topicality of the adjunct referent decrease. Building on additional corpus data and a new self-paced reading experiment, the present paper goes beyond the observation of correlations and concentrates on the exact linguistic function of 'er'. The cumulative empirical evidence suggests that 'er' is an accessibility marker in the sense of Ariel (1990): 'er' is not, as is generally assumed, an optional dummy element, but a discourse particle inserted to inform the hearer how important the subject to be created is from a communicative point of view, how inferable it is from the foregoing context, and how much effort the hearer should invest in its creation.

    Argument alternations of the Dutch psych verbs

    This paper presents a corpus study of the alternation between the reflexive and transitive argument constructions of the Dutch psych verbs ergeren (‘to annoy’), interesseren (‘to interest’), storen (‘to disturb’) and verbazen (‘to amaze’), as in Jij ergert je aan mij vs. Ik erger jou (both ‘I annoy you’). Logistic regression analysis revealed that the choice of the language user was driven by, in order of decreasing importance, the choice of verb, the morphological form of the stimulus, the animacy of the stimulus, the morphological form of the experiencer, and a number of nuisance variables. However, verbs whose lexical meaning entailed a more agentive experiencer did not realize this experiencer in subject position more often than other verbs, nor could the preference of the verbs be predicted from their etymology.

    Can social psychological attitude measures be used to study language attitudes? - A case study exploring the Personalized Implicit Association Test

    In the field of social psychology, a wide range of implicit attitude measures have recently been developed. These measures have hardly been used in linguistic attitude research so far. This paper presents a case study exploring the potential of one of these social psychological measures, the Personalized Implicit Association Test, to find out whether it can be useful for the study of language attitudes. In the case study, the Personalized Implicit Association Test is applied to measure attitudes towards regional varieties of Dutch in Belgium and Standard Belgian Dutch.

    The influence of semantic features on lexical geographical variation

    In this paper, we investigate the influence of semantic concept features on lexical geographical variation. More specifically, we take an onomasiological approach to inquire into the effect of concept vagueness, salience, affect and semantic field. We use quantitative operationalizations of these features as predictors in a linear regression analysis. Our response variable is a composite variable that takes into account the number of variants per concept and the degree to which the concepts are scattered across geographical space in a heterogeneous way. Our model reveals that vaguer, less salient and non-neutral concepts show significantly more variation and that the lexical variants for these concepts are scattered across geographical space in a less homogeneous way. We also find differences between semantic fields.

    La contribution des cooccurrences de deuxième ordre à l’analyse sémantique (The Contribution of Second Order Co-occurrences to Semantic Analysis)

    This article shows what co-occurrence can teach us about monosemy and explores the contribution of second order co-occurrences to quantitative semantic analysis. The analysis is carried out on a technical corpus (1.7 million occurrences) from the specialised domain of machining and metalworking terminology in French. In this article, we explain the methodology for calculating the degree of monosemy of a technical word based on the overlap of its second order co-occurrences. Next, in order to refine the results of the overlap measure, several experiments are discussed that go beyond the statistical detection of co-occurrence patterns and show the impact of varying several parameters, such as the co-occurrence span, the significance level, and the use of word forms or lemmas for first and second order co-occurrences. Finally, the article addresses the importance of integrating POS tags in co-occurrence analysis.
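
    To make the idea of overlapping second order co-occurrences concrete, here is a minimal sketch. It is not the measure defined in the article: the abstract gives no formula, so the scoring rule, the function names `cooccurrents` and `overlap_score`, and the toy token lists are assumptions, and the significance threshold mentioned in the abstract is left out entirely. Only the window size (the co-occurrence span) is kept as a parameter.

```python
from collections import defaultdict


def cooccurrents(tokens, window=2):
    """Map each word to the set of words seen within `window` positions."""
    co = defaultdict(set)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                co[word].add(tokens[j])
    return co


def overlap_score(tokens, target, window=2):
    """Toy overlap of second order co-occurrents for `target`.

    For every second order co-occurrent, take the share of first order
    co-occurrents that it is attached to, then average those shares.
    A higher score means the first order co-occurrents share more of
    their own context words (assumed here to hint at more monosemy).
    """
    co = cooccurrents(tokens, window)
    first_order = co.get(target, set())

    shared_by = defaultdict(int)
    for c in first_order:
        # Second order co-occurrents of the target, reached via c.
        for cc in co[c] - {target}:
            shared_by[cc] += 1

    if not shared_by:
        return 0.0
    shares = [count / len(first_order) for count in shared_by.values()]
    return sum(shares) / len(shares)


# Hypothetical token sequences, window of one word on each side.
shared_context = ["m", "a", "x", "b", "m"]    # a and b both occur with m
disjoint_context = ["m", "a", "x", "b", "n"]  # a occurs with m, b with n

print(overlap_score(shared_context, "x", window=1))    # 1.0
print(overlap_score(disjoint_context, "x", window=1))  # 0.5
```

    Changing `window`, or lemmatising the tokens before calling the function, corresponds loosely to the parameter variations the abstract describes.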

    Generating hypotheses for alternations at low and intermediate levels of schematicity. The use of Memory-Based Learning

    Peer reviewed. According to usage-based linguistics, language variation addresses a functional need of the language user. That functional need may be dependent on the lexical realization of the varying constructions. For instance, while it may be useful to have an argument structure alternation express a particular semantic distinction for particular verbs or themes, that same distinction may be less relevant for other verbs or themes. As such, it has been argued that language variation should be investigated at low levels of schematicity, e.g. by studying argument structure alternations separately for various verbs, themes, etc. In this paper, we develop a data-driven procedure to do so, based on Memory-Based Learning (MBL). The procedure focuses on generating hypotheses, is scalable, and can work with small datasets. It consists of three steps: (i) choosing features for the MBL classifier, (ii) running MBL analyses and selecting which analyses to put under further scrutiny, and (iii) inspecting which features were most useful in predicting the choice of variant in these analyses. Finally, the hypotheses that are inferred from these features are put to the test on separate data. As an example study, we investigate the Dutch naar-alternatio