2,407 research outputs found
ANALYSIS OF IDIOMATIC EMOTION EXPRESSIONS DETECTED FROM ONLINE MOVIE REVIEWS
A large number of idiomatic emotion expressions in Korean are composed of certain nouns
of human body parts accompanied by selected predicates, which represent a ‘physiological
metonymy’ of sentiment (Lakoff 1987, Ungerer & Schmid 1996). For instance, kasum-i ttwita
literally denotes a physiological reaction (i.e. one’s heartbeat) but can also represent an emotion
such as being thrilled to bits. We compared idiomatic emotion expressions used in English online movie
reviews with those observed in Korean, and noticed that body-part nouns such as kasum
‘heart’, maum ‘mind’ or nwun ‘eyes’ emerge frequently in both languages, whereas ekkay
‘shoulder’, kancang ‘intestines’ or ppye ‘bones’ seem to be rather reserved for Korean emotion
expressions.
In this study, we extract idiomatic emotion expressions from Korean online movie reviews,
based on the 13 body-part nouns listed by Lim (2001). For instance, nouns such as meli ‘head’, ip
‘mouth’ or simcang ‘cardia’ are frequently used to form emotion expressions of
POSITIVE value, as shown in ip-ul tamwul-swu epsta ‘be with open mouth (with delight)’. These
nouns hardly occur in NEGATIVE emotion expressions, which is not predictable from their semantic
features but reveals their lexical idiosyncrasy. The frequent emotion expressions observed in online
movie reviews will be analyzed and classified according to their semantic properties. We will show
which salient traits of Korean emotion expressions can be observed in current online subjective
documents such as users’ reviews, blogs or opinion texts.
Emotions in the face: biology or culture? – Using idiomatic constructions as indirect evidence to inform a psychological research controversy
Research on the facial expression of emotions has become a bone of contention in psychological research. On the one hand, Ekman and his colleagues have argued for a universal set of six basic emotions that are recognized with a considerable degree of accuracy across cultures and automatically displayed in highly similar ways by people. On the other hand, more recent research in cognitive science has provided results that are supportive of a cultural-relativist position. In this paper this controversy is approached from a contrastive perspective on phraseological constructions. It focuses on how emotional displays are codified in somatic idioms in some European (English, German, French, Spanish) and East Asian (Japanese, Korean, Chinese [Cantonese]) languages. Using somatic idioms such as make big eyes or die Nase rümpfen as a pool of evidence to shed linguistic light on the psychological controversy, the paper engages with the following general research question: Is there a significant difference between European and East Asian somatic idioms or do these constructions rather speak for a universal apprehension of facial emotion displays? To answer this question, the paper compares somatic expressions that are selected from (idiom) dictionaries of the languages listed above. Moreover, native speakers of the East Asian languages were consulted to support the analysis of the respective data. All corresponding entries were analysed categorically, i.e. with regard to whether or not they encode a given facial area to denote a specific emotion. The results show arguments both for and against the universalist and the cultural-relativist positions. In general, they speak for an opportunistic encoding of facial emotion displays.
Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in
User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
Sentiment Polarity Classification of Comments on Korean News Articles Using Feature Reweighting
In general, comments on Internet news articles contain subjective emotions or opinions about those articles. The original content of a news article therefore has an important influence on recognizing and classifying the sentiment of the comments on it. Building on this observation, this thesis proposes a feature reweighting method that uses the original content of the article together with a sentiment lexicon, and, using the proposed reweighting method, a binary sentiment classification method for comments on Korean news articles.
The reweighting method uses several feature sets: the sentiment words contained in the comments, features related to the sentiment lexicon and to the body of the news article, and finally the category information of the news article. The sentiment lexicon referred to here is a Korean sentiment lexicon; because none has been made publicly available yet, it was built from an existing English sentiment lexicon.
The binary sentiment classification proposed in this thesis uses machine learning. Machine learning generally requires a training corpus, and sentiment classification in particular requires a corpus annotated with positive or negative sentiment tags. Because no public Korean sentiment corpus exists yet, this corpus was also built directly. The machine learning methods used are Naïve Bayes, k-NN and SVM, and the feature selection methods are Document Frequency, the χ^2 statistic and Information Gain.
The results confirm that the sentiment words contained in a comment and the body of the article being commented on are highly effective features for sentiment classification.
Chapter 1 Introduction
Chapter 2 Related Works
2.1 Sentiment Classification
2.2 Feature Weighting in Vector Space Model
2.3 Feature Extraction and Selection
2.4 Classifiers
2.5 Accuracy Measures
Chapter 3 Feature Reweighting
3.1 Feature extraction in Korean
3.2 Feature Reweighting Methods
3.3 Examples of Feature Reweighting Methods
Chapter 4 Sentiment Polarity Classification System
4.1 Model Generation
4.2 Sentiment Polarity Classification
Chapter 5 Data Preparation
5.1 Korean Sentiment Corpus
5.2 Korean Sentiment Lexicon
Chapter 6 Experiments
6.1 Experimental Environment
6.2 Experimental Results
Chapter 7 Conclusions and Future Works
Bibliography
Acknowledgments
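The thesis abstract above names the χ^2 statistic as one of its feature selection methods. As a rough illustration only, the sketch below scores terms with the standard 2x2 contingency form of χ^2 for a term and a class. Whether the thesis uses exactly this variant is an assumption, and the names chi_square and rank_terms, as well as the toy English tokens, are hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square score from a 2x2 term/class contingency table.

    a: documents in the target class that contain the term
    b: documents in other classes that contain the term
    c: documents in the target class without the term
    d: documents in other classes without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or class never varies
    return n * (a * d - c * b) ** 2 / denom


def rank_terms(docs, target):
    """Rank terms by chi-square against `target`.

    docs is a list of (label, tokens) pairs; this input format is an
    assumption made for the sketch, not taken from the thesis.
    """
    n_target = sum(1 for label, _ in docs if label == target)
    n_other = len(docs) - n_target
    in_target, in_other = Counter(), Counter()
    for label, tokens in docs:
        bucket = in_target if label == target else in_other
        bucket.update(set(tokens))  # count each term once per document
    scores = {}
    for term in set(in_target) | set(in_other):
        a, b = in_target[term], in_other[term]
        scores[term] = chi_square(a, b, n_target - a, n_other - b)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    toy = [
        ("pos", ["good", "film"]),
        ("pos", ["good", "plot"]),
        ("neg", ["bad", "film"]),
        ("neg", ["bad", "plot"]),
    ]
    for term, score in rank_terms(toy, "pos"):
        print(term, score)
```

Terms that appear equally often in both classes score zero and would be dropped first when only the top-scoring features are kept.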
Using High Dimensional Computing on Arabic Language Speech to Text Classification
Hyperdimensional computing builds on the idea that the brain computes with patterns of neural activity that are not readily associated with numbers. This article applies hyperdimensional computing to text classification on two distinct speech datasets: the Arabic Corpus dataset and the MediaSpeech dataset, which covers four languages (Arabic, Spanish, French, and Turkish). The data are encoded with an n-gram encoding scheme and then analysed with hyperdimensional computing. On the MediaSpeech dataset, hyperdimensional computing achieves 100% accuracy for all 4-gram to 14-gram encoding schemes, while on the Arabic Corpus dataset it achieves 100% accuracy for the 4-gram to 7-gram encoding schemes.
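The abstract above does not give implementation details, so the following is only a minimal sketch of one common way to do n-gram encoding in hyperdimensional computing: random bipolar character vectors, rotation to mark position, element-wise multiplication to bind an n-gram, and summation to bundle a text. The dimension, the function names and the toy training strings are all assumptions, not the article's setup.

```python
import random

DIM = 2000  # hypervector dimensionality; an assumed value for this sketch


def make_item_memory(seed=0):
    """Return a lookup that assigns each symbol a fixed random bipolar vector."""
    rng = random.Random(seed)
    memory = {}

    def lookup(symbol):
        if symbol not in memory:
            memory[symbol] = [rng.choice((-1, 1)) for _ in range(DIM)]
        return memory[symbol]

    return lookup


def rotate(vec, k):
    """Cyclically shift vec to the right by k positions (the permutation step)."""
    k %= len(vec)
    return vec[-k:] + vec[:-k] if k else vec[:]


def encode(text, lookup, n=4):
    """Bundle (sum) the bound hypervectors of all character n-grams in text."""
    total = [0] * DIM
    for i in range(len(text) - n + 1):
        gram = [1] * DIM
        for pos, ch in enumerate(text[i:i + n]):
            # Rotation encodes the character's position inside the n-gram.
            shifted = rotate(lookup(ch), n - 1 - pos)
            gram = [g * s for g, s in zip(gram, shifted)]  # binding
        total = [t + g for t, g in zip(total, gram)]  # bundling
    return total


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


def train(samples, lookup, n=4):
    """Build one prototype hypervector per label from (label, text) pairs."""
    prototypes = {}
    for label, text in samples:
        vec = encode(text, lookup, n)
        if label in prototypes:
            prototypes[label] = [p + v for p, v in zip(prototypes[label], vec)]
        else:
            prototypes[label] = vec
    return prototypes


def classify(text, prototypes, lookup, n=4):
    """Return the label whose prototype is most similar to the encoded text."""
    query = encode(text, lookup, n)
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


if __name__ == "__main__":
    lookup = make_item_memory()
    prototypes = train(
        [("en", "the quick brown fox"), ("es", "el rapido zorro marron")],
        lookup,
    )
    print(classify("the quick brown fox", prototypes, lookup))
```

Changing n in encode corresponds to the different n-gram encoding schemes the abstract compares; the sketch makes no claim about the accuracy figures reported there.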