2,407 research outputs found

    ANALYSIS OF IDIOMATIC EMOTION EXPRESSIONS DETECTED FROM ONLINE MOVIE REVIEWS

    A large number of idiomatic emotion expressions in Korean are composed of nouns denoting human body parts accompanied by selected predicates, which represent a ‘physiological metonymy’ of sentiment (Lakoff 1987, Ungerer & Schmid 1996). For instance, kasum-i ttwita literally denotes a physiological reaction (i.e. one’s heart beating) but can also express an emotion such as being thrilled to bits. We compared idiomatic emotion expressions used in English online movie reviews with those observed in Korean, and noticed that body-part nouns such as kasum ‘heart’, maum ‘mind’ or nwun ‘eyes’ emerge frequently in both languages, whereas ekkay ‘shoulder’, kancang ‘intestines’ or ppye ‘bones’ seem to be rather reserved for Korean emotion expressions. In this study, we extract idiomatic emotion expressions from Korean online movie reviews, based on the 13 body-part nouns listed by Lim (2001). For instance, nouns such as meli ‘head’, ip ‘mouth’ or simcang ‘cardia’ are frequently used to form emotion expressions with POSITIVE values, as shown in ip-ul tamwul-swu epsta ‘be with open mouth (with delight)’; these nouns hardly occur in NEGATIVE emotion expressions, which is not predictable from their semantic features but reveals their lexical idiosyncrasy. The frequent emotion expressions observed in online movie reviews will be analyzed and classified according to their semantic properties. We will show which salient traits of Korean emotion expressions can be observed in current online subjective documents such as users’ reviews, blogs or opinion texts.

    Emotions in the face: biology or culture? – Using idiomatic constructions as indirect evidence to inform a psychological research controversy

    Research on the facial expression of emotions has become a bone of contention in psychological research. On the one hand, Ekman and his colleagues have argued for a universal set of six basic emotions that are recognized with a considerable degree of accuracy across cultures and automatically displayed in highly similar ways by people. On the other hand, more recent research in cognitive science has provided results that are supportive of a cultural-relativist position. In this paper this controversy is approached from a contrastive perspective on phraseological constructions. It focuses on how emotional displays are codified in somatic idioms in some European (English, German, French, Spanish) and East Asian (Japanese, Korean, Chinese [Cantonese]) languages. Using somatic idioms such as make big eyes or die Nase rümpfen as a pool of evidence to shed linguistic light on the psychological controversy, the paper engages with the following general research question: Is there a significant difference between European and East Asian somatic idioms or do these constructions rather speak for a universal apprehension of facial emotion displays? To answer this question, the paper compares somatic expressions that are selected from (idiom) dictionaries of the languages listed above. Moreover, native speakers of the East Asian languages were consulted to support the analysis of the respective data. All corresponding entries were analysed categorically, i.e. with regard to whether or not they encode a given facial area to denote a specific emotion. The results show arguments both for and against the universalist and the cultural-relativist positions. In general, they speak for an opportunistic encoding of facial emotion displays.

    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source code, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

    Sentiment Polarity Classification of Comments on Korean News Articles Using Feature Reweighting

    ์ผ๋ฐ˜์ ์œผ๋กœ ์ธํ„ฐ๋„ท ์‹ ๋ฌธ ๊ธฐ์‚ฌ์— ๋Œ€ํ•œ ๋Œ“๊ธ€์€ ๊ทธ ์‹ ๋ฌธ ๊ธฐ์‚ฌ์— ๋Œ€ํ•œ ์ฃผ๊ด€์ ์ธ ๊ฐ์ •์ด๋‚˜ ์˜๊ฒฌ์„ ํฌํ•จํ•˜๊ณ  ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ ์ด๋Ÿฐ ์‹ ๋ฌธ ๊ธฐ์‚ฌ์˜ ๋Œ“๊ธ€์— ๋Œ€ํ•œ ๊ฐ์ •์„ ์ธ์‹ํ•˜๊ณ  ๋ถ„๋ฅ˜ํ•˜๋Š” ๋ฐ์—๋Š” ๊ทธ ์‹ ๋ฌธ ๊ธฐ์‚ฌ์˜ ์›๋ฌธ ๋‚ด์šฉ์ด ์ค‘์š”ํ•œ ์˜ํ–ฅ์„ ๋ฏธ์นœ๋‹ค. ์ด๋Ÿฐ ์ ์— ์ฐฉ์•ˆํ•˜์—ฌ ๋ณธ ๋…ผ๋ฌธ์€ ๊ธฐ์‚ฌ์˜ ์›๋ฌธ ๋‚ด์šฉ๊ณผ ๊ฐ์ • ์‚ฌ์ „์„ ์ด์šฉํ•˜๋Š” ๊ฐ€์ค‘์น˜ ์กฐ์ • ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•˜๊ณ , ์ œ์•ˆ๋œ ๊ฐ€์ค‘์น˜ ์กฐ์ • ๋ฐฉ๋ฒ•์„ ์ด์šฉํ•ด์„œ ํ•œ๊ตญ์–ด ์‹ ๋ฌธ ๊ธฐ์‚ฌ์˜ ๋Œ“๊ธ€์— ๋Œ€ํ•œ ๊ฐ์ • ์ด์ง„ ๋ถ„๋ฅ˜ ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ๊ฐ€์ค‘์น˜ ์กฐ์ • ๋ฐฉ๋ฒ•์—๋Š” ๋‹ค์–‘ํ•œ ์ž์งˆ ์ง‘ํ•ฉ์ด ์‚ฌ์šฉ๋˜๋Š”๋ฐ ๊ทธ๊ฒƒ์€ ๋Œ“๊ธ€์— ํฌํ•จ๋œ ๊ฐ์ • ๋‹จ์–ด, ๊ทธ๋ฆฌ๊ณ  ๊ฐ์ • ์‚ฌ์ „๊ณผ ๋‰ด์Šค ๊ธฐ์‚ฌ์˜ ๋ณธ๋ฌธ์— ๊ด€๋ จ๋œ ์ž์งˆ๋“ค, ๋งˆ์ง€๋ง‰์œผ๋กœ ๋‰ด์Šค ๊ธฐ์‚ฌ์˜ ์นดํ…Œ๊ณ ๋ฆฌ ์ •๋ณด๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ๋‹ค. ์—ฌ๊ธฐ์„œ ๋งํ•˜๋Š” ๊ฐ์ • ์‚ฌ์ „์€ ํ•œ๊ตญ์–ด ๊ฐ์ • ์‚ฌ์ „์„ ์˜๋ฏธํ•˜๋ฉฐ ์•„์ง ๊ณต๊ฐœ๋œ ๊ฒƒ์ด ์—†๊ธฐ ๋•Œ๋ฌธ์—, ๊ธฐ์กด์— ์žˆ๋Š” ์˜์–ด ๊ฐ์ • ์‚ฌ์ „์„ ์ด์šฉํ•˜์—ฌ ๊ตฌ์ถ•ํ•˜์˜€๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ ์ œ์•ˆ๋œ ๊ฐ์ • ์ด์ง„ ๋ถ„๋ฅ˜๋Š” ๊ธฐ๊ณ„ ํ•™์Šต์„ ์ด์šฉํ•œ๋‹ค. ์ผ๋ฐ˜์ ์œผ๋กœ ๊ธฐ๊ณ„ ํ•™์Šต์„ ์œ„ํ•ด์„œ๋Š” ํ•™์Šต ๋ง๋ญ‰์น˜๊ฐ€ ํ•„์š”ํ•œ๋ฐ ํŠน๋ณ„ํžˆ ๊ฐ์ • ๋ถ„๋ฅ˜ ๋ฌธ์ œ์—์„œ๋Š” ๊ธ์ • ํ˜น์€ ๋ถ€์ • ๊ฐ์ • ํƒœ๊ทธ๊ฐ€ ๋ถ€์ฐฉ๋œ ๋ง๋ญ‰์น˜๊ฐ€ ํ•„์š”ํ•˜๋‹ค. ์ด ๋ง๋ญ‰์น˜์˜ ๊ฒฝ์šฐ๋„, ๊ณต๊ฐœ๋œ ํ•œ๊ตญ์–ด ๊ฐ์ • ๋ง๋ญ‰์น˜๊ฐ€ ์•„์ง ์—†๊ธฐ ๋•Œ๋ฌธ์— ๋ง๋ญ‰์น˜๋ฅผ ์ง์ ‘ ๊ตฌ์ถ•ํ•˜์˜€๋‹ค. ์‚ฌ์šฉ๋œ ๊ธฐ๊ณ„ ํ•™์Šต ๋ฐฉ๋ฒ•์œผ๋กœ๋Š” Na&iumlve Bayes, k-NN, SVM์ด ์žˆ๊ณ , ์ž์งˆ ์„ ํƒ ๋ฐฉ๋ฒ•์œผ๋กœ๋Š” Document Frequency, ฯ‡^2 statistic, Information Gain์ด ์žˆ๋‹ค. 
๊ทธ ๊ฒฐ๊ณผ, ๋Œ“๊ธ€ ์•ˆ์— ํฌํ•จ๋œ ๊ฐ์ • ๋‹จ์–ด์™€ ๊ทธ ๋Œ“๊ธ€์— ๋Œ€ํ•œ ๊ธฐ์‚ฌ ๋ณธ๋ฌธ์ด ๊ฐ์ • ๋ถ„๋ฅ˜์— ๋งค์šฐ ํšจ๊ณผ์ ์ธ ์ž์งˆ์ž„์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค.Chapter 1 Introduction 1 Chapter 2 Related Works 4 2.1 Sentiment Classification 4 2.2 Feature Weighting in Vector Space Model 5 2.3 Feature Extraction and Selection 7 2.4 Classifiers 10 2.5 Accuracy Measures 14 Chapter 3 Feature Reweighting 16 3.1 Feature extraction in Korean 16 3.2 Feature Reweighting Methods 17 3.3 Examples of Feature Reweighting Methods 18 Chapter 4 Sentiment Polarity Classification System 21 4.1 Model Generation 21 4.2 Sentiment Polarity Classification 23 Chapter 5 Data Preparation 25 5.1 Korean Sentiment Corpus 25 5.2 Korean Sentiment Lexicon 27 Chapter 6 Experiments 29 6.1 Experimental Environment 29 6.2 Experimental Results 30 Chapter 7 Conclusions and Future Works 38 Bibliography 40 Acknowledgments 4
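
One of the feature selection methods named in the abstract above, the χ² statistic, can be illustrated with a short sketch. This uses the standard 2×2 contingency formulation for term/class association; it is not taken from the thesis, and the function names (`chi_square`, `select_features`) and the toy token lists are hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square score of a term for a class from a 2x2 contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or class is degenerate; no evidence either way
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, target, k):
    """Return the k terms with the highest chi-square score for `target`.

    docs is a list of token lists; labels is a parallel list of class names.
    """
    in_class = sum(1 for lab in labels if lab == target)
    out_class = len(labels) - in_class
    df_in, df_out = Counter(), Counter()
    for tokens, lab in zip(docs, labels):
        # Count each term once per document (document frequency).
        (df_in if lab == target else df_out).update(set(tokens))
    scores = {}
    for term in set(df_in) | set(df_out):
        a, b = df_in[term], df_out[term]
        scores[term] = chi_square(a, b, in_class - a, out_class - b)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

Note that the score measures strength of association only, so a term that is characteristic of the opposite class scores as highly as one that is characteristic of the target class.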

    Annotation Scheme for Constructing Sentiment Corpus in Korean

    Using High Dimensional Computing on Arabic Language Speech to Text Classification

    High-dimensional computing rests on the idea that the mind computes with patterns of neural activity that are not directly associated with numbers. The objective of the article is to apply hyperdimensional computing to text classification on two distinct speech datasets, namely the Arabic Corpus dataset and the MediaSpeech dataset, which covers four languages (Arabic, Spanish, French, and Turkish). The analysis of these datasets is carried out with hyperdimensional computing using an n-gram encoding scheme. With hyperdimensional computing, the MediaSpeech dataset reaches 100% accuracy for all 4-gram to 14-gram encoding schemes, while the Arabic Corpus dataset reaches 100% accuracy for 4-gram to 7-gram encoding schemes.
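
The n-gram encoding mentioned in the abstract above can be sketched as follows. This is a minimal illustration of one common hyperdimensional text-classification design, not the article's implementation: the random bipolar vectors, rotation-based binding, cosine similarity, the dimensionality of 10,000, and all function names are assumptions made for the example.

```python
import numpy as np

DIM = 10_000  # assumed hypervector dimensionality


def make_item_memory(alphabet, dim=DIM, seed=0):
    """Assign a fixed random bipolar (+1/-1) hypervector to each symbol."""
    rng = np.random.default_rng(seed)
    values = np.array([-1, 1], dtype=np.int8)
    return {ch: rng.choice(values, size=dim) for ch in sorted(alphabet)}


def encode_text(text, item_memory, n=4, dim=DIM):
    """Sum (bundle) the hypervectors of all character n-grams in text."""
    total = np.zeros(dim, dtype=np.int64)
    for i in range(len(text) - n + 1):
        gram = np.ones(dim, dtype=np.int64)
        for pos, ch in enumerate(text[i:i + n]):
            # Rotate by position so order matters, then bind by multiplication.
            gram *= np.roll(item_memory[ch], n - 1 - pos)
        total += gram
    return total


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def train(samples, item_memory, n=4):
    """Build one prototype hypervector per label from (label, text) pairs."""
    prototypes = {}
    for label, text in samples:
        prototypes[label] = prototypes.get(label, 0) + encode_text(text, item_memory, n)
    return prototypes


def classify(text, prototypes, item_memory, n=4):
    """Return the label whose prototype is most similar to the encoded text."""
    query = encode_text(text, item_memory, n)
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))
```

In this sketch every character of a query must already be in the item memory, and a text shorter than n characters yields an all-zero vector, so real input would need handling for both cases.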