397 research outputs found

    SURVEY ON REVIEW SPAM DETECTION

    The proliferation of e-commerce sites has made the web an excellent source of customer reviews about products. Because there is no quality control, anyone can write anything, which leads to review spam. This paper surveys the substantial research on review spam detection techniques and presents the state of the art, describing previous attempts to study review spam detection.

    Opinion spam detection: using multi-iterative graph-based model

    Demand for detecting opinion spam with opinion mining applications, to prevent its damaging effects on e-commerce reputations, is rising in many business sectors globally. Existing spam detection techniques consider only one or two types of spam entity, such as review, reviewer, group of reviewers, or product. They also use a limited number of features related to behaviour, content and the relations between entities, which reduces detection accuracy. Moreover, these techniques mostly rely on synthetic datasets to analyse their models and cannot be applied in real-world environments. As such, a novel graph-based model called “Multi-iterative Graph-based opinion Spam Detection” (MGSD) is proposed, in which all types of entities are considered simultaneously within a unified structure. Using this approach, the model reveals both implicit (i.e., similar entities') and explicit (i.e., different entities') relationships. The MGSD model evaluates the ‘spamicity’ of entities more efficiently because it applies a novel multi-iterative algorithm that considers different sets of factors to update the spamicity score of entities. To enhance the accuracy of the MGSD detection model, a larger number of existing weighted features, along with novel proposed features from different categories, were selected using a combination of feature fusion techniques and machine learning (ML) algorithms. Because it employs domain-independent features, the MGSD model can also be generalised and applied to various opinionated documents. The results showed that the feature selection and feature fusion techniques yielded a remarkable improvement in detecting spam.
The findings of this study showed that MGSD could improve on the accuracy of state-of-the-art ML and graph-based techniques by around 5.6% and 4.8%, respectively, while achieving an accuracy of 93% for spam detection on our synthetic crowdsourced dataset and 95.3% on Ott's crowdsourced dataset.
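
    The idea of repeatedly updating entity spamicity scores over a graph of reviews, reviewers and products can be illustrated with a minimal sketch. This is not the MGSD algorithm, whose factors and update rules are not given in the abstract. It is a generic score-propagation loop, and the function name `propagate_spamicity`, the `damping` weight, the neutral default prior of 0.5 and the neighbour-mean update are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_spamicity(edges, priors, damping=0.5, iterations=20):
    """Blend each entity's prior score with the mean score of its neighbours.

    edges  : iterable of (entity_a, entity_b) pairs, e.g. reviewer-review
             or review-product links (hypothetical input format)
    priors : dict mapping entity -> initial spamicity in [0, 1]
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Entities without a prior start from a neutral score (an assumption).
    scores = {node: priors.get(node, 0.5) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            neighbour_mean = sum(scores[n] for n in adjacent) / len(adjacent)
            updated[node] = ((1 - damping) * priors.get(node, 0.5)
                             + damping * neighbour_mean)
        scores = updated
    return scores


edges = [("r1", "rev1"), ("rev1", "p1"), ("r2", "rev2"), ("rev2", "p1")]
scores = propagate_spamicity(edges, {"rev1": 0.9, "rev2": 0.1})
print(scores["r1"] > scores["r2"])
```

    In this toy graph, the reviewer linked to the suspicious review ends up with a higher score than the reviewer linked to the benign one, although neither reviewer had a prior of their own.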

    Man vs machine – Detecting deception in online reviews

    This study focused on three main research objectives: analyzing the methods used to identify deceptive online consumer reviews, evaluating insights provided by multi-method automated approaches based on individual and aggregated review data, and formulating a review interpretation framework for identifying deception. The theoretical framework is based on two critical deception-related models, information manipulation theory and self-presentation theory. The findings confirm the interchangeable characteristics of the various automated text analysis methods in drawing insights about review characteristics and underline their significant complementary aspects. An integrative multi-method model that approaches the data at the individual and aggregate level provides more complex insights regarding the quantity and quality of review information, sentiment, cues about its relevance and contextual information, perceptual aspects, and cognitive material

    Latent network mining in social networks and e-commerce platforms

    Doctoral thesis -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2023. 2. Taekyoung Kwon (권태경).
    With the explosive growth of online services, people are connected with each other more broadly than ever. On online platforms, people influence each other and tend to reflect one another's experiences and opinions in their decision-making. Social network services (SNSs) and e-commerce are typical examples of online platforms. User behavior on online platforms can be defined as relations between users and platform components: a user's purchase is a relation between a user and a product, and a user's check-in is a relation between a user and a place. These relations may include information such as action time, rating, and tags. In many studies, platform user behavior is represented as a graph whose nodes are objects within the platform, such as users, products and places, and in which an interaction between a user and a platform element is expressed as an edge connecting the two nodes. In this study, I present work to identify latent networks that affect the user behavior graphs defined on the two platforms. In ANES, I focus on representation learning for social link inference based on user trajectory data.
While traditional methods predict relations between users from hand-crafted features, recent studies first perform representation learning using network/node embedding or graph neural networks (GNNs) for downstream tasks such as node classification and link prediction. However, those approaches fail to capture the behavioral patterns of individuals that are ingrained in periodic visits or long-distance movements. To better learn behavioral patterns, this paper proposes a novel scheme called ANES (Aspect-oriented Network Embedding for Social link inference). ANES learns aspect-oriented relations between users and points of interest (POIs) within their contexts. ANES is the first approach that extracts the complex behavioral patterns of users from both trajectory data and the structure of User-POI bipartite graphs. Extensive experiments on several real-world datasets show that ANES outperforms state-of-the-art baselines. In contrast to active social networks, on some platforms, such as online shopping websites and restaurant review sites, people are connected to other users regardless of their intentions. They have no information about each other in advance; all they have in common is that they have visited or plan to visit the same place, or have purchased or plan to purchase the same product. Interestingly, users' purchase intentions tend to be influenced by review data. Unfortunately, this tendency is easily exploited by opinion spammers. In SC-Com, I focus on opinion spam detection in online shopping services. In many cases, a consumer's decision-making process is closely tied to online reviews. However, opinion spam written by hired reviewers, which aims to mislead potential customers by hiding genuine consumers' opinions, is a growing threat. To falsify true information, opinion spam has to be piled up collectively. Fortunately, this makes it possible to detect spammers from their collusiveness.
In this paper, I propose SC-Com, an optimized collusive community detection framework. It constructs a graph of reviewers from the collusiveness of their behavior and divides the graph into communities based on their mutual suspiciousness. I then extract community-based and temporal abnormality features, which are critical for discriminating spammers from genuine users. I show that my method detects collusive opinion spam reviewers effectively and precisely from their collective behavioral patterns. On a real-world dataset, my approach showed prominent performance while considering only primary data such as time and ratings. The implicit network inference models studied on various data in this thesis predict users who are likely to be pre-connected, even in unlabeled data, so they are expected to contribute to areas such as advertising recommendation systems and malicious user detection by providing useful information.
    Contents:
    Chapter 1 Introduction
    Chapter 2 Social link Inference in Location-based check-in data: 2.1 Background; 2.2 Related Work; 2.3 Location-based Social Network Service Data; 2.4 Aspect-wise Graph Decomposition; 2.5 Aspect-wise Graph learning; 2.6 Inferring Social Relation from User Representation; 2.7 Performance Analysis; 2.8 Discussion and Implications; 2.9 Summary
    Chapter 3 Detecting collusiveness from reviews in Online platforms and its application: 3.1 Background; 3.2 Related Work; 3.3 Online Review Data; 3.4 Collusive Graph Projection; 3.5 Reviewer Community Detection; 3.6 Review Community feature extraction and spammer detection; 3.7 Performance Analysis; 3.8 Discussion and Implications; 3.9 Summary
    Chapter 4 Conclusion
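
    The first two stages described for SC-Com, projecting reviews onto a reviewer graph by collusiveness and then splitting that graph into communities, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's method: the pairwise rule (same product, close in time, similar rating), the thresholds `max_days`, `max_rating_gap` and `min_shared`, and the use of plain connected components in place of the thesis's community detection are all hypothetical stand-ins.

```python
from collections import defaultdict
from itertools import combinations


def collusion_graph(reviews, max_days=3, max_rating_gap=1, min_shared=2):
    """Link reviewers who repeatedly review the same products in a similar way.

    reviews: iterable of (reviewer, product, day, rating) tuples
             (hypothetical input format; day is an integer day index)
    """
    by_product = defaultdict(list)
    for reviewer, product, day, rating in reviews:
        by_product[product].append((reviewer, day, rating))

    # (u, v) -> products on which u and v posted close in time with similar ratings
    shared = defaultdict(set)
    for product, entries in by_product.items():
        for (u, du, ru), (v, dv, rv) in combinations(entries, 2):
            if u != v and abs(du - dv) <= max_days and abs(ru - rv) <= max_rating_gap:
                shared[tuple(sorted((u, v)))].add(product)

    graph = defaultdict(set)
    for (u, v), products in shared.items():
        if len(products) >= min_shared:  # keep only repeated co-reviewing
            graph[u].add(v)
            graph[v].add(u)
    return dict(graph)


def communities(graph):
    """Connected components of the reviewer graph (stand-in for community detection)."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


reviews = [
    ("a", "p1", 1, 5), ("b", "p1", 2, 5),
    ("a", "p2", 10, 5), ("b", "p2", 11, 4),
    ("c", "p1", 40, 2), ("c", "p3", 5, 3),
]
print(communities(collusion_graph(reviews)))
```

    Community-level and temporal features computed on each resulting group would then feed a supervised classifier, as the abstract describes. That step is omitted here because the abstract does not specify the features.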