10 research outputs found

    Argumentation Mining in Parliamentary Discourse

    In parliamentary discourse, politicians expound their beliefs and goals through argumentation, and, to persuade the audience, they communicate their values by highlighting some aspect of an issue, an action which is commonly known as framing. The choices of frames are typically dependent upon the speaker's ideology. In this proposed doctoral work, we will computationally analyze framing strategies and present a model for discovering the latent structure of framing of real-world issues in Canadian parliamentary discourse.

    Detection of Sarcasm and Nastiness: New Resources for Spanish Language

    The main goal of this work is to provide the cognitive computing community with valuable resources to analyze and simulate the intentionality and/or emotions embedded in the language employed in social media. Specifically, it focuses on the Spanish language and online dialogues, leading to the creation of SOFOCO (Spanish Online Forums Corpus). It is the first Spanish corpus consisting of dialogic debates extracted from social media, and it is annotated by means of crowdsourcing in order to carry out automatic analysis of subjective language forms, like sarcasm or nastiness. Furthermore, the annotators were also asked whether context was needed when making a decision. In this way, the users' intentions and their behavior inside social networks can be better understood and more accurate text analysis is possible. An analysis of the annotation results is carried out and the reliability of the annotations is also explored. Additionally, sarcasm and nastiness detection results (around 0.76 F-Measure in both cases) are also reported. The obtained results show the presented corpus to be a valuable resource that might be used in very diverse future work. This study was partially funded by the Spanish Government (TIN2014-54288-C4-4-R and TIN2017-85854-C4-3-R), by the European Union's H2020 program under grant 769872, and by the National Science Foundation of USA (NSF CISE R1 #1202668).

    Sketching the vision of the Web of Debates

    The exchange of comments, opinions, and arguments in blogs, forums, social media, wikis, and review websites has transformed the Web into a modern agora, a virtual place where all types of debates take place. This wealth of information remains mostly unexploited: due to its textual form, such information is difficult to automatically process and analyse in order to validate it, evaluate it, compare it, combine it with other types of information, and make it actionable. Recent research in Machine Learning, Natural Language Processing, and Computational Argumentation has provided some solutions, which still cannot fully capture important aspects of online debates, such as various forms of unsound reasoning, arguments that do not follow a standard structure, information that is not explicitly expressed, and non-logical argumentation methods. Tackling these challenges would add immense value, as it would allow searching for, navigating through, and analyzing online opinions and arguments, giving a well-intentioned user a better picture of the various debates. Ultimately, it may lead to increased participation of Web users in democratic, dialogical interchange of arguments, more informed decisions by professionals and decision-makers, as well as easier identification of biased, misleading, or deceptive arguments. This paper presents the vision of the Web of Debates, a more human-centered version of the Web, which aims to unlock the potential of the abundance of argumentative information that currently exists online, offering its users a new generation of argument-based web services and tools that are tailored to their real needs.

    A Study on the Automatic Analysis of Argumentation Structure in Korean Texts (한국어 텍스트 논증 구조의 자동 분석 연구)

    Thesis (Master's) -- Seoul National University Graduate School: Department of Linguistics, Linguistics major, February 2016. 신효필 (Shin Hyopil).
    These days, there is an increased need to analyze mass opinions using online text data. These tasks require recognizing the argumentation schemes and main contents of subjective, argumentative writing, and automating the required procedures is becoming indispensable. This thesis constructed text data from Korean debates on certain political issues and defined the types of discourse relations between basic units of text segments.
    The discourse relations are classified into two levels and four subclasses, according to criteria that determine whether the two segments are related to each other in a context, whether the relation is coordinating or subordinating, and which of the two units in a pair is supported by the other as the more important part. The relations between basic text units are predicted using machine learning and rule-based methods. The features for predicting discourse relations include what the author of a text wants to claim and the argumentative strategies that make up the grounds for the author's claim, together with linguistic properties shown in the texts. The strategies for argument are observed and subcategorized into Providing Examples, Cause-and-Effects, Explanations in Detail, Restatements, Contrasts, Background Knowledge, and more. These subclasses compose a broader class of discourse relations and became the basis for the features used during the classification of the relations. Some linguistic features are drawn from previous studies and reconstituted in a revised form that is more appropriate for Korean data. Thus, this study constructed a Korean debate corpus and a list of connectives specialized for Korean texts to include among the experiment features. The automated prediction of discourse relations based on those features is proposed in this study as a distinct model of argument mining. According to the results of experiments predicting discourse relations, the features defined and used in this study are observed to improve the performance of prediction tasks through positive interactions with each other. In particular, some explicit connectives, dependent sentence structures based on the lack of certain components, and whether the same meaning is restated clearly contributed to the classification tasks.
    The discourse relations between basic text units are related and combined with each other to form a tree-shaped argumentation structure for the overall document. Regarding the argumentation structure, the topic sentence of the document is located at the root node of the tree, and it is assumed that the sentence or clause nodes directly below the root node contain the most important contents as grounds for the topic unit. Therefore, extracting the text segments that directly support the topic sentence may help in obtaining the important contents of each document. This can be a useful method in text summarization. Additionally, applications to various fields may also be possible, including stance classification of debate texts, extraction of grounds for certain topics, and so on.
    Contents:
    1 Introduction: 1.1 Purposes (1.1.1 A Study of Korean Texts with Linguistic Cues; 1.1.2 Detection of Argumentation Schemes in Debate Texts; 1.1.3 Extraction of Important Content in Argumentation Schemes of Texts); 1.2 Structure
    2 Previous Work: 2.1 Argumentation Mining Tasks (2.1.1 Argument Elements; 2.1.2 Argumentation Schemes); 2.2 Argumentation Schemes in Various Texts (2.2.1 Dialogic vs. Monologic Texts; 2.2.2 Debate Texts vs. Other Texts; 2.2.3 Studies in Other Languages); 2.3 Theoretical Basis (2.3.1 Argumentation Theory; 2.3.2 Discourse Theory)
    3 Identifying Argumentation Schemes in Debate Texts: 3.1 Data Description; 3.2 Basic Units; 3.3 Discourse Relations (3.3.1 Strategies for Proving a Claim; 3.3.2 Definition)
    4 Automatic Identification of Argumentation Schemes: 4.1 Annotation; 4.2 Baseline; 4.3 Proposed Model (4.3.1 O vs. X Classification; 4.3.2 Convergent Relation Rule; 4.3.3 NN vs. NS vs. SN Classification); 4.4 Evaluation (4.4.1 Measures; 4.4.2 Results); 4.5 Discussion; 4.6 A Pilot Study on English Texts
    5 Detecting Important Units
    6 Conclusion
    Bibliography
    Abstract (in Korean)
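    The extraction step described in the abstract above (take the units sitting directly below the topic sentence at the root of the argumentation tree) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Unit` class, the `direct_supports` function, and the toy sentences are all hypothetical names and data invented here, and the tree is assumed to be already built.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """One basic text unit (sentence or clause) in a hypothetical argumentation tree."""
    text: str
    # Units that directly support this one (its children in the tree).
    supporters: list["Unit"] = field(default_factory=list)


def direct_supports(root: Unit) -> list[str]:
    """Return the texts of the units that directly support the root (topic) unit."""
    return [child.text for child in root.supporters]


# Toy tree, invented for illustration (not taken from the thesis corpus).
topic = Unit("The policy should be adopted.")
reason = Unit("It lowers costs.", [Unit("For example, fees dropped last year.")])
topic.supporters = [reason, Unit("It has broad public support.")]

# Only the first level below the root is kept; deeper units are ignored.
summary = direct_supports(topic)
print(summary)
```

    Keeping only the first level below the root mirrors the abstract's assumption that those units carry the most important grounds; a real system would first have to predict the relations that produce the tree.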

    Using Summarization to Discover Argument Facets in Online Idealogical Dialog
