
    A Study on the Automatic Analysis of Argumentation Structure in Korean Texts (한국어 텍스트 논증 구조의 자동 분석 연구)

    Thesis (Master's) -- Seoul National University Graduate School: Department of Linguistics, Linguistics major, 2016. 2. 신효필 (Hyopil Shin).

Abstract (translated from Korean): Analysis of public opinion using online text data has recently become an active area of work. Such analysis requires identifying the argumentation structure and key content of texts with a subjective orientation, and as the volume and diversity of data grow rapidly, automating this process is becoming unavoidable. This study built its own Korean text dataset consisting of pro and con opinions on policies, and defined the types of discourse relations between the basic units that make up a text. Discourse relations were divided into two levels according to whether two sentences or clauses are related within a single context, whether a related pair stands in an equal relation, and, if not, which sentence (clause) is the more important part that is supported by the other. The relations between these basic units are predicted using machine learning and rule-based methods. Here, the intention each author wants to express, the kinds of grounds offered to support the author's claim, and the argumentation strategies that make up those grounds serve as important features alongside the linguistic characteristics of the text. The argumentation strategies observed include exemplification, causation, explanation of details, restatement, correction, and provision of background knowledge. These subcategories make up the broader classes of discourse relations and became the basis for the features used to predict those relations. In addition, some linguistic features were reconstructed, with reference to previous studies, into a form applicable to Korean data. Using these, a Korean corpus was built, and a list of conjunctions and connectives specialized for Korean was compiled and included in the feature list. The process of predicting discourse relations from these features is automated and proposed in this study as an independent model. The prediction experiments show that the features defined and used in this study improve the performance of discourse relation prediction through positive interaction.
Among them, certain conjunctions and connectives, dependent sentence structures determined by the presence or absence of sentence constituents, and whether the same content is restated contributed especially to prediction. The discourse relations that exist between the basic units of a text are connected and composed with one another to form a tree-shaped argumentation structure corresponding to the whole text. For the argumentation structure obtained this way, it can be assumed that the topic sentence of the text sits at the root node, the top of the tree, and that the sentences (clauses) on the level directly below it carry the most important content as grounds. Extracting the sentences (clauses) that directly support the topic sentence therefore yields the key content of the text. This can be a useful approach in text summarization. It should also be applicable in various other areas, such as stance classification by topic and the collection of grounds.

Abstract (English): There is a growing need to analyze public opinion using online text data. Such analysis requires recognizing the argumentation schemes and main content of subjective, argumentative writing, and automating the required procedures is becoming indispensable. This thesis constructed a text dataset from Korean debates on certain political issues and defined the types of discourse relations between the basic units of text segments. The discourse relations are classified into two levels and four subclasses, according to criteria that determine whether the two segments are related to each other in a context, whether the relation is coordinating or subordinating, and which of the two units in a pair is supported by the other as the more important part. The relations between basic text units are predicted using machine learning and rule-based methods. The features for predicting discourse relations include what the author of a text wants to claim and the argumentative strategies that make up the grounds for the author's claim, together with linguistic properties shown in the texts.
The argumentation strategies observed are subcategorized into Providing Examples, Cause-and-Effect, Explanations in Detail, Restatements, Contrasts, Background Knowledge, and more. These subclasses compose a broader class of discourse relations and became the basis for the features used to classify the relations. Some linguistic features draw on those of previous studies and are reconstituted in a revised form that is more appropriate for Korean data. This study thus constructed a Korean debate corpus and a list of connectives specialized for Korean texts, which is included among the experiment features. The automated prediction of discourse relations based on those features is proposed in this study as a distinct model of argument mining. According to the results of the prediction experiments, the features defined and used in this study improve prediction performance through positive interactions with each other. In particular, some explicit connectives, dependent sentence structures that lack certain components, and whether the same meaning is restated clearly contributed to the classification tasks. The discourse relations between basic text units are connected and combined with each other to form a tree-shaped argumentation structure for the whole document. In this structure, the topic sentence of the document is located at the root node of the tree, and the sentences or clauses directly below the root node are assumed to contain the most important content as grounds for the topic unit. Therefore, extracting the text segments that directly support the topic sentence may help in obtaining the important content of each document. This can be a useful method in text summarization.
Additionally, applications to various other fields may be possible, including stance classification of debate texts and extraction of grounds for certain topics.

Contents
1 Introduction
    1.1 Purposes
        1.1.1 A Study of Korean Texts with Linguistic Cues
        1.1.2 Detection of Argumentation Schemes in Debate Texts
        1.1.3 Extraction of Important Content in Argumentation Schemes of Texts
    1.2 Structure
2 Previous Work
    2.1 Argumentation Mining Tasks
        2.1.1 Argument Elements
        2.1.2 Argumentation Schemes
    2.2 Argumentation Schemes in Various Texts
        2.2.1 Dialogic vs. Monologic Texts
        2.2.2 Debate Texts vs. Other Texts
        2.2.3 Studies in Other Languages
    2.3 Theoretical Basis
        2.3.1 Argumentation Theory
        2.3.2 Discourse Theory
3 Identifying Argumentation Schemes in Debate Texts
    3.1 Data Description
    3.2 Basic Units
    3.3 Discourse Relations
        3.3.1 Strategies for Proving a Claim
        3.3.2 Definition
4 Automatic Identification of Argumentation Schemes
    4.1 Annotation
    4.2 Baseline
    4.3 Proposed Model
        4.3.1 O vs. X Classification
        4.3.2 Convergent Relation Rule
        4.3.3 NN vs. NS vs. SN Classification
    4.4 Evaluation
        4.4.1 Measures
        4.4.2 Results
    4.5 Discussion
    4.6 A Pilot Study on English Texts
5 Detecting Important Units
6 Conclusion
Bibliography
Abstract (in Korean)
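
The extraction idea stated in the abstract (the topic sentence sits at the root of the argumentation tree, and the units directly below it are taken as the most important content) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Unit` class, the `direct_supporters` function, and the toy sentences are hypothetical and are not taken from the thesis. The thesis predicts the relations with machine learning and rules; this sketch skips that step and simply takes an already-built tree as input.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    """A basic text unit (sentence or clause); hypothetical name."""
    text: str
    # Units that directly support this one (its children in the tree).
    supporters: List["Unit"] = field(default_factory=list)


def direct_supporters(root: Unit) -> List[str]:
    """Return the texts of the units attached directly below the root."""
    return [child.text for child in root.supporters]


# Toy tree, invented for illustration only (not data from the thesis).
topic = Unit(
    "The policy should be adopted.",
    [
        Unit("It lowers costs.", [Unit("For example, fees fell last year.")]),
        Unit("It is widely supported."),
    ],
)

# Only the first level below the root is kept; deeper units are skipped.
summary = direct_supporters(topic)
print(summary)
```

In this toy case the nested example sentence is left out of `summary` because it supports a supporter rather than the topic sentence itself, which is the selection rule the abstract assumes.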