2 research outputs found

    Reputation Agent: Prompting Fair Reviews in Gig Markets

    Our study presents a new tool, Reputation Agent, to promote fairer reviews from requesters (employers or customers) on gig markets. Unfair reviews, created when requesters consider factors outside of a worker's control, are known to plague gig workers and can result in lost job opportunities and even termination from the marketplace. Our tool leverages machine learning to implement an intelligent interface that: (1) uses deep learning to automatically detect when an individual has included unfair factors in her review (factors outside the worker's control per the policies of the market); and (2) prompts the individual to reconsider her review if she has incorporated unfair factors. To study the effectiveness of Reputation Agent, we conducted a controlled experiment over different gig markets. Our experiment illustrates that across markets, Reputation Agent, in contrast with traditional approaches, motivates requesters to review gig workers' performance more fairly. We discuss how tools that bring more transparency to employers about the policies of a gig market can help build empathy, thus resulting in reasoned discussions around potential injustices towards workers generated by these interfaces. Our vision is that with tools that promote truth and transparency we can bring fairer treatment to gig workers.
    Comment: 12 pages, 5 figures, The Web Conference 2020, ACM WWW 202

    Aspect-Level Analysis and Predictive Modeling for Electric Vehicle Based on Aspect-Based Sentiment Analysis Using Machine Learning

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, August 2020. 윤명환.

    In this study, we apply machine-learning-based Aspect-Based Sentiment Analysis (ABSA) to user reviews of electric vehicles (EVs) to extract the main aspects of a vehicle, namely its components and attributes, and we build a UX analysis framework that models user sentiment toward each extracted aspect. The main goal is to obtain user opinions at a level comparable to conventional interviews and surveys. We address the data imbalance that arises in this process through oversampling, and we propose a way to use unlabeled data to compensate for the shortage of user reviews. In addition, we identify contributing factors that affect user sentiment by using regression to examine the relationship between sentiment toward each extracted aspect and the vehicle's detailed specifications.

    Following the ABSA approach, the work proceeds in sequence through data collection, preprocessing and feature engineering, modeling for aspect extraction and sentiment analysis, and aspect-wise evaluation of user sentiment. For data collection, a total of 5,065 labeled reviews, each rated by users for satisfaction on a 5-point scale, were collected from representative car forums. To overcome the shortage of data and the data imbalance, approximately 210,000 unlabeled items were collected from Youtube.com, of which 6,488 were selected by keeping only those containing user-experience-related vocabulary. After preprocessing, features were generated by embedding the text with distributed representations.

    The analysis is divided into two processes: Aspect Extraction and Sentiment Analysis. For Aspect Extraction, TextRank and a Naïve method were used as unsupervised, extractive approaches. For Sentiment Analysis, we built a supervised sentiment classification model trained on truncated text consisting of the first one or two sentences of each labeled review, and then sought to improve it through semi-supervised learning. With the trained model, we perform aspect-wise sentiment analysis by classifying the sentiment of sentences that contain a selected aspect term, and we look for detailed vehicle specifications that influence user sentiment as contributing factors.

    As a result, 16 categories of main aspects were extracted (eight key EV components and eight key human-factor attributes). Users tend to be positive about Acceleration, Room, Interior, Power, Safety, Ergonomics, and Price, and somewhat negative about Seat, Battery, Charge, Noise, Winter, and Ice. In sentiment analysis, the CNN model showed the highest performance in review-level sentiment classification. Therefore, in semi-supervised learning with the CNN, pseudo-labels were assigned only to unlabeled data with a classification probability of 80% or higher, and retraining on the full data including these items improved the CNN model's performance. For sentence-level classification of sentences containing the extracted aspects, the machine-learning-based and lexicon-based results agreed in predicted polarity for 14 of 17 aspects, which indirectly supports the validity of the machine-learning-based model. Finally, sample verification confirmed the high classification accuracy of the deep learning model, which classified sentences as positive or negative by capturing sentence context beyond individual word meanings.

    In conclusion, aspect-wise, sentence-level analysis can extract more diverse topics and less biased opinions than review-level analysis. Oversampling the review data addressed the imbalance problem and counteracted the positive bias of online reviews. Finally, by using unlabeled data through semi-supervised learning, we propose a more effective UX analysis framework for products that lack sufficient user reviews.

    Contents:
    Chapter 1. Introduction: 1.1 Research Background; 1.2 Research Subject
    Chapter 2. Research Objectives: 2.1 Research Objectives; 2.2 Prior Work (2.2.1 Aspect-Based Sentiment Analysis; 2.2.2 Aspect Extraction; 2.2.3 Sentiment Analysis)
    Chapter 3. Research Method: 3.1 Data Collection (3.1.1 Labeled Data Collection; 3.1.2 Non-Label Data Collection; 3.1.3 Data Distribution); 3.2 Data Preprocessing; 3.3 Performing Aspect-Based Sentiment Analysis (3.3.1 Aspect Extraction; 3.3.2 Sentiment Analysis; 3.3.3 A Framework for UX Analysis)
    Chapter 4. Results: 4.1 Aspect Extraction Results; 4.2 Modeling Results for Sentiment Analysis (4.2.1 Performance Comparison of Machine-Learning-Based Sentiment Classification Models; 4.2.2 Semi-Supervised Learning Experiment Results); 4.3 Aspect-Based Sentiment Analysis Results (4.3.1 Machine-Learning-Based ABSA Results; 4.3.2 Lexicon-Based ABSA Results); 4.4 Contributing Factors Affecting Positive and Negative User Experience
    Chapter 5. Conclusion: 5.1 Conclusion; 5.2 Contribution; 5.3 Limitation
    Appendix; References; Abstract; Acknowledgments
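
The pseudo-labeling step mentioned in the abstract above (assigning a pseudo label only to non-label data whose classification probability is around 80% or higher) might look roughly like the following sketch. The function name, the two-class [negative, positive] layout, and the sample probabilities are illustrative assumptions and are not taken from the thesis, where such probabilities came from a trained CNN classifier.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    class_probs: Sequence[Sequence[float]],
    threshold: float = 0.8,
) -> List[Tuple[int, int]]:
    """Return (item_index, pseudo_label) pairs for confident predictions.

    class_probs[i][k] is a classifier's predicted probability that
    unlabeled item i belongs to class k. An item is kept only when its
    highest class probability is at least `threshold`.
    """
    selected = []
    for index, probs in enumerate(class_probs):
        # Index of the most probable class for this item.
        best_class = max(range(len(probs)), key=lambda k: probs[k])
        if probs[best_class] >= threshold:
            selected.append((index, best_class))
    return selected


# Hypothetical predicted probabilities for four unlabeled comments,
# ordered as [negative, positive]. These values are made up.
unlabeled_probs = [
    [0.05, 0.95],  # confident positive -> kept
    [0.45, 0.55],  # uncertain -> skipped
    [0.85, 0.15],  # confident negative -> kept
    [0.30, 0.70],  # below the 0.8 cutoff -> skipped
]

pseudo_labeled = select_pseudo_labels(unlabeled_probs, threshold=0.8)
print(pseudo_labeled)  # [(0, 1), (2, 0)]
```

In a full pipeline, the selected items would be added to the labeled set with their pseudo labels and the classifier retrained on the combined data; that retraining loop is omitted here.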