85 research outputs found
FairGen: Towards Fair Graph Generation
There have been tremendous efforts over the past decades dedicated to the
generation of realistic graphs in a variety of domains, ranging from social
networks to computer networks, from gene regulatory networks to online
transaction networks. Despite the remarkable success, the vast majority of
these works are unsupervised in nature and are typically trained to minimize
the expected graph reconstruction loss, which would result in the
representation disparity issue in the generated graphs, i.e., the protected
groups (often minorities) contribute less to the objective and thus suffer from
systematically higher errors. In this paper, we aim to tailor graph generation
to downstream mining tasks by leveraging label information and user-preferred
parity constraint. In particular, we start from the investigation of
representation disparity in the context of graph generative models. To mitigate
the disparity, we propose a fairness-aware graph generative model named
FairGen. Our model jointly trains a label-informed graph generation module and
a fair representation learning module by progressively learning the behaviors
of the protected and unprotected groups, from the 'easy' concepts to the 'hard'
ones. In addition, we propose a generic context sampling strategy for graph
generative models, which is proven to be capable of fairly capturing the
contextual information of each group with a high probability. Experimental
results on seven real-world data sets, including web-based graphs, demonstrate
that FairGen (1) obtains performance on par with state-of-the-art graph
generative models across six network properties, (2) mitigates the
representation disparity issues in the generated graphs, and (3) substantially
boosts the model performance by up to 17% in downstream tasks via data
augmentation.
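The representation disparity described in this abstract can be illustrated with a toy calculation. The sketch below is not FairGen; it only shows, with made-up per-node errors and hypothetical group labels, how a low average loss can coexist with a much higher error for a small protected group.

```python
def group_errors(errors, groups):
    """Return (mean error per group, overall mean error)."""
    totals, counts = {}, {}
    for err, group in zip(errors, groups):
        totals[group] = totals.get(group, 0.0) + err
        counts[group] = counts.get(group, 0) + 1
    per_group = {g: totals[g] / counts[g] for g in totals}
    overall = sum(errors) / len(errors)
    return per_group, overall


# Hypothetical reconstruction errors: nine majority nodes, one minority node.
errors = [0.1] * 9 + [0.9]
groups = ["unprotected"] * 9 + ["protected"]

per_group, overall = group_errors(errors, groups)
print(per_group, overall)
```

Because the protected group contributes only one of ten terms to the average, an objective that minimizes the overall mean has little incentive to reduce that group's error.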
An Inferable Representation Learning for Fraud Review Detection with Cold-start Problem
© 2019 IEEE. Fraudulent reviews significantly damage business reputations and customers' trust in products, and they have become a serious problem on social media. Various efforts have been made to tackle the problem. However, in the cold-start case, where a review is posted by a new user who has just appeared on the platform, common fraud detection methods may fail because most of them depend heavily on the user's historical behavior and social relations to other users, and this information is lacking for new users. This paper presents a novel Joint-bEhavior-and-Social-relaTion-infERable (JESTER) embedding method that leverages user reviewing behavior and social relations for cold-start fraud review detection. JESTER embeds the deep characteristics of existing user behavior and the social relations of users and items in an inferable user-item-review-rating representation space, where the representation of a new user can be efficiently inferred by a closed-form solution and reflects the user's most probable behavior and social relations. A cold-start fraud review can then be detected effectively. Our experiments show that, compared to three state-of-the-art and two baseline competitors, JESTER (i) performs significantly better in detecting fraud reviews on four real-life social media data sets, and (ii) effectively infers new user representations in the cold-start problem.
Search Rank Fraud Prevention in Online Systems
The survival of products in online services such as Google Play, Yelp, Facebook and Amazon is contingent on their search rank. This, along with the social impact of such services, has also turned them into a lucrative medium for fraudulently influencing public opinion. Motivated by the need to aggressively promote products, communities that specialize in social network fraud (e.g., fake opinions and reviews, likes, followers, app installs) have emerged, creating a black market for fraudulent search optimization. Fraudulent product developers exploit these communities to hire teams of workers willing and able to commit fraud collectively, emulating realistic, spontaneous activities from unrelated people. We call this behavior "search rank fraud". In this dissertation, we argue that fraud needs to be proactively discouraged and prevented, instead of only reactively detected and filtered. We introduce two novel approaches to discourage search rank fraud in online systems. First, we detect fraud in real time, when it is posted, and impose resource-consuming penalties on the devices that post activities. We introduce and leverage several novel concepts that include (i) stateless, verifiable computational puzzles that impose minimal performance overhead but enable the efficient verification of their authenticity, (ii) a real-time, graph-based solution to assign fraud scores to user activities, and (iii) mechanisms to dynamically adjust puzzle difficulty levels based on fraud scores and the computational capabilities of devices. In a second approach, we introduce the problem of fraud de-anonymization: revealing the crowdsourcing site accounts of the people who post large amounts of fraud, and thus their bank accounts, and providing compelling evidence of fraud to the users of the products that they promote. We investigate the ability of our solutions to ensure that fraud does not pay off.
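The idea of a stateless, verifiable puzzle whose difficulty grows with a fraud score can be sketched as follows. This is a minimal illustration, not the dissertation's design: it swaps in a hashcash-style proof of work with an HMAC tag so the server needs no stored state, and the key, function names, and the linear score-to-difficulty mapping are all assumptions. Adjusting for device capability is omitted.

```python
import hashlib
import hmac

SERVER_KEY = b"demo-secret"  # hypothetical server-side key, for illustration only


def difficulty_for(fraud_score, min_bits=8, max_bits=20):
    """Map a fraud score in [0, 1] to a required number of leading zero bits."""
    score = min(max(fraud_score, 0.0), 1.0)
    return min_bits + round(score * (max_bits - min_bits))


def _tag(activity_id, bits):
    # The HMAC binds the difficulty to the activity, so the server can check
    # a returned puzzle without having stored it.
    msg = f"{activity_id}:{bits}".encode()
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()


def issue_puzzle(activity_id, fraud_score):
    bits = difficulty_for(fraud_score)
    return {"activity_id": activity_id, "bits": bits, "tag": _tag(activity_id, bits)}


def _leading_zero_bits(digest):
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def _digest(puzzle, nonce):
    data = f"{puzzle['activity_id']}:{puzzle['bits']}:{puzzle['tag']}:{nonce}"
    return hashlib.sha256(data.encode()).digest()


def solve(puzzle):
    """Client side: brute-force a nonce; expected work doubles with each extra bit."""
    nonce = 0
    while _leading_zero_bits(_digest(puzzle, nonce)) < puzzle["bits"]:
        nonce += 1
    return nonce


def verify(puzzle, nonce):
    """Server side: one HMAC and one hash, regardless of difficulty."""
    expected = _tag(puzzle["activity_id"], puzzle["bits"])
    if not hmac.compare_digest(expected, puzzle["tag"]):
        return False  # puzzle was altered or not issued by this server
    return _leading_zero_bits(_digest(puzzle, nonce)) >= puzzle["bits"]


puzzle = issue_puzzle("review-1", fraud_score=0.0)
nonce = solve(puzzle)
print(verify(puzzle, nonce))
```

Verification costs the server two hash computations, while the poster's expected work grows exponentially with the number of bits, which is what lets a higher fraud score translate into a heavier penalty.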
Combating Threats to the Quality of Information in Social Systems
Many large-scale social systems such as Web-based social networks, online social media sites and Web-scale crowdsourcing systems have been growing rapidly, enabling millions of human participants to generate, share and consume content on a massive scale. This reliance on users can lead to many positive effects, including large-scale growth in the size and content in the community, bottom-up discovery of "citizen-experts", serendipitous discovery of new resources beyond the scope of the system designers, and new social-based information search and retrieval algorithms. But the relative openness and reliance on users coupled with the widespread interest and growth of these social systems carries risks and raises growing concerns over the quality of information in these systems.
In this dissertation research, we focus on countering threats to the quality of information in self-managing social systems. Concretely, we identify three classes of threats to these systems: (i) content pollution by social spammers, (ii) coordinated campaigns for strategic manipulation, and (iii) threats to collective attention. To combat these threats, we propose three inter-related methods for detecting evidence of these threats, mitigating their impact, and improving the quality of information in social systems. We augment this three-fold defense with an exploration of their origins in "crowdturfing", a sinister counterpart to the enormous positive opportunities of crowdsourcing. In particular, this dissertation research makes four unique contributions:
• The first contribution of this dissertation research is a framework for detecting and filtering social spammers and content polluters in social systems. To detect and filter individual social spammers and content polluters, we propose and evaluate a novel social honeypot-based approach.
• Second, we present a set of methods and algorithms for detecting coordinated campaigns in large-scale social systems. We propose and evaluate a content-driven framework for effectively linking free text posts with common "talking points" and extracting campaigns from large-scale social systems.
• Third, we present a dual study of the robustness of social systems to collective attention threats through both a data-driven modeling approach and deployment over a real system trace. We evaluate the effectiveness of countermeasures deployed based on the first moments of a bursting phenomenon in a real system.
• Finally, we study the underlying ecosystem of crowdturfing for engaging in each of the three threat types. We present a framework for "pulling back the curtain" on crowdturfers to reveal their underlying ecosystem on both crowdsourcing sites and social media.
Latent Network Mining in Social Networks and E-commerce Platforms
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2023. 2. 권태경 (Taekyoung Kwon).
Following the explosive growth of online services, people are connected with each other more broadly than ever. On online platforms, people influence one another and tend to reflect one another's opinions in their decision-making. Social network services (SNSs) and e-commerce are typical examples of online platforms.
User behavior on online platforms can be defined as relations between users and platform components. A purchase is a relation between a user and a product, and a check-in is a relation between a user and a place. Such relations may carry information such as action time, rating, and tags. Many studies represent platform user behavior as a graph whose nodes are objects such as users, products, and places, and in which an interaction between a user and a platform element is an edge connecting two nodes.
In this thesis, I present studies that identify latent networks affecting the user behavior graphs defined on these two platforms.
In ANES, I focus on representation learning for social link inference based on user trajectory data. While traditional methods predict relations between users from hand-crafted features, recent studies first perform representation learning using network/node embedding or graph neural networks (GNNs) for downstream tasks such as node classification and link prediction. However, those approaches fail to capture the behavioral patterns of individuals that are ingrained in periodic visits or long-distance movements. To better learn behavioral patterns, this thesis proposes a novel scheme called ANES (Aspect-oriented Network Embedding for Social link inference). ANES learns aspect-oriented relations between users and points of interest (POIs) within their contexts. ANES is the first approach that extracts the complex behavioral patterns of users from both trajectory data and the structure of User-POI bipartite graphs. Extensive experiments on several real-world datasets show that ANES outperforms state-of-the-art baselines.
In contrast to active social networks, on some platforms, such as online shopping websites and restaurant review sites, people are connected to other users regardless of their intentions. They have no information about each other in advance, and all they have in common is that they have visited or plan to visit the same place, or have purchased or plan to purchase the same product. Interestingly, users' purchase intentions tend to be influenced by review data.
Unfortunately, this tendency is easily exploited by opinion spammers. In SC-Com, I focus on opinion spam detection in online shopping services. In many cases, consumers' decision-making is closely tied to online reviews. However, opinion spam written by hired reviewers is an increasing threat; it aims to mislead potential customers by hiding genuine consumers' opinions. To distort true information, opinion spam must be posted collectively, and this collusiveness offers a way to detect it. I propose SC-Com, an optimized collusive community detection framework. It constructs a graph of reviewers from the collusiveness of their behavior and divides the graph into communities based on their mutual suspiciousness. I then extract community-based and temporal abnormality features that are critical for discriminating spammers from genuine users. I show that the method detects collusive opinion spam reviewers effectively and precisely from their collective behavioral patterns. On a real-world dataset, the approach showed prominent performance while considering only primary data such as time and ratings.
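The graph construction step described for SC-Com can be sketched roughly as below. This is an illustrative assumption, not the thesis's algorithm: the collusion criterion (same product, close in time, similar rating), the thresholds, and the use of connected components as a stand-in for community detection are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def collusion_graph(reviews, max_days=2, max_rating_gap=1):
    """Weight each reviewer pair by how often they review the same product
    within max_days of each other with ratings at most max_rating_gap apart."""
    by_product = defaultdict(list)
    for user, product, rating, day in reviews:
        by_product[product].append((user, rating, day))

    weights = defaultdict(int)
    for entries in by_product.values():
        for (u, ru, du), (v, rv, dv) in combinations(entries, 2):
            close_in_time = abs(du - dv) <= max_days
            similar_rating = abs(ru - rv) <= max_rating_gap
            if u != v and close_in_time and similar_rating:
                weights[frozenset((u, v))] += 1
    return weights


def communities(weights, min_weight=2):
    """Connected components over edges with weight >= min_weight
    (a simple stand-in for a real community detection method)."""
    adjacency = defaultdict(set)
    for pair, weight in weights.items():
        if weight >= min_weight:
            u, v = tuple(pair)
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        result.append(component)
    return result


# Hypothetical reviews: (user, product, rating, day)
reviews = [
    ("a", "p1", 5, 1), ("b", "p1", 5, 2),
    ("a", "p2", 5, 10), ("b", "p2", 4, 10),
    ("c", "p1", 2, 30), ("c", "p2", 3, 40),
]
print(communities(collusion_graph(reviews)))
```

Only time and rating fields are used here, in line with the abstract's remark that the approach relies on primary data; features for a downstream classifier would then be computed per community.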
The implicit network inference models studied on various data in this thesis predict, from unlabeled data, users who are likely to have been connected in advance. They are therefore expected to contribute to areas such as advertising recommendation systems and malicious user detection by providing useful information.
Chapter 1 Introduction
Chapter 2 Social link Inference in Location-based check-in data
2.1 Background
2.2 Related Work
2.3 Location-based Social Network Service Data
2.4 Aspect-wise Graph Decomposition
2.5 Aspect-wise Graph learning
2.6 Inferring Social Relation from User Representation
2.7 Performance Analysis
2.8 Discussion and Implications
2.9 Summary
Chapter 3 Detecting collusiveness from reviews in Online platforms and its application
3.1 Background
3.2 Related Work
3.3 Online Review Data
3.4 Collusive Graph Projection
3.5 Reviewer Community Detection
3.6 Review Community feature extraction and spammer detection
3.7 Performance Analysis
3.8 Discussion and Implications
3.9 Summary
Chapter 4 Conclusion
Anti-fragile ICT Systems
This book introduces a novel approach to the design and operation of large ICT systems. It views the technical solutions and their stakeholders as complex adaptive systems and argues that traditional risk analyses cannot predict all future incidents with major impacts. To avoid unacceptable events, it is necessary to establish and operate anti-fragile ICT systems that limit the impact of all incidents, and which learn from small-impact incidents how to function increasingly well in changing environments. The book applies four design principles and one operational principle to achieve anti-fragility for different classes of incidents. It discusses how systems can achieve high availability, prevent malware epidemics, and detect anomalies. Analyses of Netflix's media streaming solution, Norwegian telecom infrastructures, e-government platforms, and Numenta's anomaly detection software show that cloud computing is essential to achieving anti-fragility for classes of events with negative impacts.
Topology Reconstruction of Dynamical Networks via Constrained Lyapunov Equations
The network structure (or topology) of a dynamical network is often
unavailable or uncertain. Hence, we consider the problem of network
reconstruction. Network reconstruction aims at inferring the topology of a
dynamical network using measurements obtained from the network. In this
technical note we define the notion of solvability of the network
reconstruction problem. Subsequently, we provide necessary and sufficient
conditions under which the network reconstruction problem is solvable. Finally,
using constrained Lyapunov equations, we establish novel network reconstruction
algorithms, applicable to general dynamical networks. We also provide
specialized algorithms for specific network dynamics, such as the well-known
consensus and adjacency dynamics.
Comment: 8 pages
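To give a sense of how a Lyapunov equation can tie measured trajectories to an unknown topology, here is a standard derivation under simplifying assumptions that the abstract does not state: autonomous linear dynamics with a stable state matrix $X$, full state measurement, and a single initial condition $x_0$. The symbols $X$, $P$, and $x_0$ are notation chosen for this sketch, and the note's own setup and algorithms may differ.

```latex
\dot{x}(t) = X x(t), \quad x(0) = x_0, \qquad
P := \int_0^\infty x(t)\, x(t)^\top \, dt .

% Differentiate x x^T and integrate over [0, \infty); stability gives x(t) \to 0:
\frac{d}{dt}\bigl( x x^\top \bigr) = X x x^\top + x x^\top X^\top
\;\Longrightarrow\;
X P + P X^\top = - x_0 x_0^\top .

% With a structural constraint such as X = X^\top and P \succ 0,
% the linear equation X P + P X = - x_0 x_0^\top has a unique solution X.
```

Here $P$ is computable from data, so the equation is linear in the unknown $X$. Uniqueness in the symmetric case follows because the map $X \mapsto XP + PX$ has eigenvalues $\lambda_i(P) + \lambda_j(P) > 0$ when $P \succ 0$.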