57 research outputs found

    Sentiment Inference for Gender Profiling


    A Review and Cluster Analysis of German Polarity Resources for Sentiment Analysis


    Efficient construction of metadata-enhanced web corpora

    Metadata extraction is known to be a problem in general-purpose Web corpora, and so is extensive crawling with little yield. The contributions of this paper are threefold: a method to find and download large numbers of WordPress pages; a targeted extraction of content featuring much-needed metadata; and an analysis of the documents in the corpus with insights into actual blog uses. The study focuses on a publishing software (WordPress), which allows for reliable extraction of structural elements such as metadata, posts, and comments. The download of about 9 million documents in the course of two experiments leads, after processing, to 2.7 billion tokens with usable metadata. This comparatively high yield is a step towards more efficiency with respect to machine power and "Hi-Fi" web corpora. The resulting corpus complies with formal requirements on metadata-enhanced corpora and on weblogs considered as a series of dated entries. However, existing typologies on Web texts have to be revised in the light of this hybrid genre.

    Computational emotion classification for genre corpora of German tragedies and comedies from 17th to early 19th century

    This article presents a method of emotion analysis for German drama from the 17th to the 19th century that significantly goes beyond previous research approaches in computational literary studies. It is based on annotations of 17 dramatic texts resulting in 11,939 annotations which were used as training material to fine-tune a German language BERT model that achieves an average accuracy of 73% for the single-label emotion classification of fourteen emotion types in cross-validation. We apply the emotion classification on a corpus of 141 comedies and 92 tragedies to compare these genres. For tragedies, the mean proportion percentages of ‘suffering’ and ‘abhorrence’ are higher than for comedies. Inversely, mean percentages of ‘anger’ and ‘joy’ are higher for comedies than for tragedies. A new finding is the surprisingly high proportion of ‘anger’ in comedies. Emotion distribution of the last scenes in dramatic texts also proves the quality of the classified data in terms of literary studies. In addition, the emotion distribution for several subgenres of comedy is investigated including non-canonical works of wide circulation which reached the recipients directly through the depicted emotions in the Kasperl plays. Comedies from 1740 to 1770 are characterized by a pairing of higher amounts of ‘friendship’ and ‘love’. Satirical comedies from the same period stand out due to high rates of ‘anger’ as well as ‘suffering’. The very successful Kasperl plays turn out to be characterized by a comparatively large percentage of ‘schadenfreude’ and ‘joy’.

    DARIAH and the Benelux


    Essays on the economics of social norms and identity


    “You’re trolling because…” – A Corpus-based Study of Perceived Trolling and Motive Attribution in the Comment Threads of Three British Political Blogs

    This paper investigates the linguistically marked motives that participants attribute to those they call trolls in 991 comment threads of three British political blogs. The study is concerned with how these motives affect the discursive construction of trolling and trolls. Another goal of the paper is to examine whether the mainly emotional motives ascribed to trolls in the academic literature correspond with those that the participants attribute to the alleged trolls in the analysed threads. The paper identifies five broad motives ascribed to trolls: emotional/mental health-related/social reasons, financial gain, political beliefs, being employed by a political body, and unspecified political affiliation. It also points out that depending on these motives, trolling and trolls are constructed in various ways. Finally, the study argues that participants attribute motives to trolls not only to explain their behaviour but also to insult them.

    Digitale Infrastrukturen für die germanistische Forschung

    Modern research in linguistics is increasingly reliant on digital infrastructure and information systems. This development began at the turn of the millennium and has since accelerated. The volume examines national and European infrastructure networks and the range of language resources in German linguistics that can be discovered, disclosed, and re-applied through digital infrastructure.

    Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-10)

