18,412 research outputs found

    ‘So is your mom as cute as you?’: Examining patterns of language use in online sexual grooming of children

    Linguistic research into online grooming is scarce despite both the communicative essence of this form of online child sexual abuse and a substantial body of literature into it across other Social Sciences. Most of this literature has examined small data sets via qualitative methods, primarily thematic analysis; the exception being a couple of studies that have used automated software (Linguistic Inquiry and Word Count - LIWC) that operates at a single-word level. This study evaluates the contribution that a Corpus Assisted Discourse Studies (CADS) approach can make to this body of literature, with a focus on online groomers’ language. The corpus consists of >600 grooming chat logs taken from the Perverted Justice Foundation archive, from which the groomers’ language was extracted (c. 3.3 million words). Lexical dispersion (DPNorm), collocation and concordance analyses were conducted. The corpus was also run through LIWC. Our analysis shows that LIWC may not be the most efficient software to analyse online grooming language due to a lack of general language comparison scores, the non-transparency of some of its analytic variables and a focus on de-contextualised words. Comparatively, CADS methods can shed light upon online groomers’ strategic use of language. They can also reveal the complex and nuanced ways in which discourse features such as (im)explicitness and interpersonal (in)directness operate alongside these strategies.

    Structural Stability of Lexical Semantic Spaces: Nouns in Chinese and French

    Many studies in the neurosciences have dealt with the semantic processing of words or categories, but few have looked into the semantic organization of the lexicon conceived as a system. The present study was designed to move towards this goal, using both electrophysiological and corpus-based data, and to compare two languages from different families: French and Mandarin Chinese. We conducted an EEG-based semantic-decision experiment using 240 words from eight categories (clothing, parts of a house, tools, vehicles, fruits/vegetables, animals, body parts, and people) as the material. A data-analysis method (correspondence analysis) commonly used in computational linguistics was applied to the electrophysiological signals. The present cross-language comparison indicated stability for the following aspects of the languages' lexical semantic organizations: (1) the living/nonliving distinction, which showed up as a main factor for both languages; (2) greater dispersion of the living categories as compared to the nonliving ones; (3) prototypicality of the animals category within the living categories, and with respect to the living/nonliving distinction; and (4) the existence of a person-centered reference gradient. Our electrophysiological analysis indicated stability of the networks at play in each of these processes. Stability was also observed in the data taken from word usage in the languages (synonyms and associated words obtained from textual corpora). Comment: 17 pages, 4 figures

    Implicit dialogical premises, explanation as argument: a corpus-based reconstruction

    This paper focuses on an explanation in a newspaper article: why new European Union citizens will come to the UK from Eastern Europe (e.g., because of available jobs). Using a corpus-based method of analysis, I show how regular target readers have been positioned to generate premises in dialogue with the explanation propositions, and thus into an understanding of the explanation as an argument, one which contains a biased conclusion not apparent in the text. Employing this method, and in particular ‘corpus comparative statistical keywords’, I show how two issues can be looked at afresh: implicit premise recovery and the argument/explanation distinction.

    A Quantitative Corpus-based Analysis of Linking Adverbials in Students’ Academic Writing

    Open access to this publication of Łódź University Press (Wydawnictwo Uniwersytetu Łódzkiego) was financed under the project “Doskonałość naukowa kluczem do doskonałości kształcenia” (“Scientific excellence as the key to excellence in education”). The project is funded by the European Social Fund under the Operational Programme Knowledge Education Development; agreement no. POWER.03.05.00-00-Z092/17-00

    What is corpus linguistics? What the data says

    Stubbs (2006), in his state-of-the-art overview, draws attention to the frequent reticence or vagueness of corpus analysts in discussing their operational methods within a scientific context (a context addressed in detail in Partington, forthcoming). This lack of clarity in discussing the methodological framework employed is, perhaps, most surprising given the way in which corpus linguistics situates itself within a scientific frame and lays such claims to a scientific nature. This brief paper, then, addresses the question posed in its title, namely, “What is corpus linguistics?” – is it a discipline, a methodology, a paradigm or none or all of these? – but does not attempt to offer any definitive answers. Rather, the aim is to present the reader with a number of observations on how corpus linguistics has been construed in its own literature and then to leave the question open, in the hope of stimulating further discussion. The study takes the specific term corpus linguistics and looks at how it is defined and described both explicitly and implicitly in a variety of relevant sources.

    Text Classification Using Association Rules, Dependency Pruning and Hyperonymization

    We present new methods for pruning and enhancing itemsets for text classification via association rule mining. Pruning methods are based on dependency syntax and enhancing methods are based on replacing words by their hyperonyms of various orders. We discuss the impact of these methods, compared to pruning based on the tf-idf rank of words. Comment: 16 pages, 2 figures, presented at DMNLP 201