
    ๊ฐ์„ฑ ๋ถ„์„์˜ ์—ฌ๋ก  ์กฐ์‚ฌ ๋Œ€์ฒด ๊ฐ€๋Šฅ ์—ฐ๊ตฌ

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Interdisciplinary Program in Technology Management, Economics and Policy, 2022. 8. Junseok Hwang.

    Sentiment analysis in Korean does not yet deliver ideal results because of limited data and the lack of a high-performing Korean language model. However, with the rise of modern statistical artificial intelligence, numerous studies have used this methodology to analyze a variety of topics. Sentiment analysis is a field within natural language processing that specializes in determining what emotion a comment expresses. This research proposes a framework for performing sentiment analysis and automating the process of gauging public opinion. Public opinion polls usually pose a question that can be answered with "yes" or "no". Since sentiment analysis is highly accurate at determining whether a comment expresses a positive or negative emotion, it can be a viable way to help determine public opinion from big data. To test the hypothesis that this paper's automated sentiment analysis framework can be a viable substitute for surveys, the results of 260 public opinion surveys conducted by professional institutes were compared with the framework's output. Cosine similarity was used to compare the two sets of results, and 160 of the 260 expert surveys showed a similarity above 90%. Another finding was that, with each passing year, a proportionally larger share of survey results showed low similarity. These results suggest that the paper's automated sentiment analysis framework can indeed be a viable substitute for public opinion surveys, as long as the question can be answered with "yes" or "no".
    The framework should be especially useful for future research because its results closely match those of existing expert public opinion surveys.

    Chapter 1. Introduction
        1.1.1 Implementing AI into Social Science
        1.1.2 Automated Response System Surveys
        1.1.3 Difficulties in Korean NLP
        1.2 Background
        1.3 Purpose of Research
    Chapter 2. Literature Review
        2.1 Computational Text Analysis in Social Science
        2.2 Natural Language Processing Research in Korean
        2.3 History of Language Models
    Chapter 3. Methodology
        3.1 Definition
        3.2 Method Analysis
            3.2.1 Required Equipment
            3.2.2 Selected Language Model: BERT
            3.2.3 Naver Movie Review corpus
            3.2.4 Data Extraction Platform
            3.2.5 Predict Sentiment
            3.2.6 Link crawler
            3.2.7 Cosine similarity
            3.2.8 Euclidean Similarity
        3.3 Data Gathering
        3.4 Data Arrangement
    Chapter 4. Result and Analysis
        4.1 Result
        4.2 Further Analysis I
        4.3 Further Analysis II
        4.4 Comparing different language models
        4.5 For Discussion
    Chapter 5. Conclusion
    Bibliography
    Abstract (Korean)
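
    The comparison step described in the abstract (sentiment predictions versus a yes/no survey result, scored with cosine similarity) could look roughly like the sketch below. This is an illustration under assumptions, not the thesis's code: the function names, the encoding of each result as a two-component [positive, negative] share vector, and all numbers are hypothetical.

```python
import math


def sentiment_shares(labels):
    """Convert binary sentiment labels (1 = positive, 0 = negative)
    into a [positive_share, negative_share] vector.
    Assumed encoding; the thesis may represent results differently."""
    if not labels:
        raise ValueError("need at least one label")
    positive = sum(labels) / len(labels)
    return [positive, 1.0 - positive]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical survey result: 62% answered "yes", 38% answered "no".
survey = [0.62, 0.38]

# Hypothetical classifier output for ten comments on the same question.
predicted = sentiment_shares([1, 1, 1, 0, 1, 0, 1, 0, 1, 1])

similarity = cosine_similarity(survey, predicted)
print(f"cosine similarity: {similarity:.3f}")
```

    Under this encoding both vectors have non-negative components, so the score always lies between 0 and 1; a survey would count toward the "above 90%" group when its score exceeds 0.9.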