Latent Network Mining on Social Network and E-commerce Platforms
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2023. 2. 권태경.
With the explosive growth of web-based services, users are broadly connected online. On online platforms, users influence one another and tend to reflect others' experiences and opinions in their decision-making. This dissertation studies user behavior on two representative kinds of online platform: social network services and e-commerce platforms.
User behavior on an online platform can be expressed as a relation between a user and a platform component. A purchase is a relation between a user and a product, and a check-in is a relation between a user and a place. Information such as the time of the action, ratings, and tags can also be attached.
This work presents studies that identify the latent networks influencing the user behavior graphs defined on the two platforms. In location-based social network services, many posts are created as check-ins at specific places, and a user's visits are strongly influenced by friendships that already exist between users. Identifying the relations between users that lie latent beneath the user activity network can help predict activity. To this end, this dissertation proposes an unsupervised approach that extracts social relations between users from the activity network.
Previously studied methods either predict relations between users by focusing on co-visitation, where two users visit a place at the same time, or perform representation learning with network embedding or graph neural networks (GNNs). However, these approaches do not capture user behavior patterns such as periodic visits or long-distance movements well. To learn behavior patterns better, ANES learns aspect-oriented relations between users and points of interest (POIs) within the user's context. ANES is the first unsupervised approach that divides user behavior into multiple aspects over the structure of the User-POI bipartite graph and extracts behavior patterns by considering each relation. In extensive experiments on real LBSN data, ANES outperforms previously proposed techniques.
Unlike location-based social networks, in e-commerce review systems users exchange information and exert influence on each other through the platform without actively following one another. This characteristic of user behavior can easily be exploited by review spam. Review spam works by hiding the opinions of real users and manipulating ratings to convey false information. To address this, I propose SC-Com, a method that finds the possibility of prior collusiveness between users in review data and uses it for spam detection. SC-Com computes a collusion score between users from the collusiveness of their behavior and, based on that score, groups all users into communities of similar users. It then extracts graph-based features that are important for distinguishing spam users from ordinary users and uses them as input to a supervised classifier. SC-Com effectively detects sets of collusive spam users. In experiments on a real dataset, SC-Com showed better spam detection performance than previous work.
The implicit network detection models studied on various data in this dissertation predict users who are likely to have been connected in advance even for unlabeled data. They are therefore expected to provide useful information for a range of data, such as real-time location data and app usage data, and to contribute to areas such as advertising recommendation systems and malicious user detection.
Following the explosive growth of online services, people are connected with each other more broadly than ever. On online platforms, people influence each other and tend to reflect others' opinions in their decision-making. Social network services (SNSs) and e-commerce are typical examples of online platforms.
User behavior on online platforms can be defined as a relation between a user and a platform component. A purchase is a relation between a user and a product, and a check-in is a relation between a user and a place. Information such as the time of the action, ratings, and tags may also be attached. Many studies represent platform user behavior as a graph: the nodes are objects within the platform, such as users, products, and places, and an interaction between a user and a platform element is expressed as an edge connecting the two nodes.
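As a minimal sketch of the graph form described above, the following Python builds a user-place bipartite graph from check-in records. The record layout, the sample data, and the function name `build_behavior_graph` are illustrative assumptions and do not come from the thesis.

```python
from collections import defaultdict

# Hypothetical check-in records: (user, place, hour_of_day).
checkins = [
    ("alice", "cafe", 9),
    ("bob", "cafe", 9),
    ("alice", "gym", 18),
    ("carol", "gym", 7),
]


def build_behavior_graph(records):
    """Return adjacency maps for a user-place bipartite graph."""
    user_to_places = defaultdict(list)
    place_to_users = defaultdict(set)
    for user, place, hour in records:
        # Each check-in is an edge; the hour is kept as an edge attribute.
        user_to_places[user].append((place, hour))
        place_to_users[place].add(user)
    return user_to_places, place_to_users


user_to_places, place_to_users = build_behavior_graph(checkins)
print(sorted(place_to_users["cafe"]))
```

Keeping both directions of the adjacency makes it cheap to ask which users share a place, which is the starting point for inferring relations between users.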
This thesis presents studies that identify the latent networks affecting the user behavior graphs defined on the two platforms.
In ANES, I focus on representation learning for social link inference based on user trajectory data. While traditional methods predict relations between users from hand-crafted features, recent studies first perform representation learning using network/node embedding or graph neural networks (GNNs) for downstream tasks such as node classification and link prediction. However, those approaches fail to capture the behavioral patterns of individuals that are ingrained in periodic visits or long-distance movements. To learn behavioral patterns better, this work proposes a novel scheme called ANES (Aspect-oriented Network Embedding for Social link inference). ANES learns aspect-oriented relations between users and points of interest (POIs) within their contexts. ANES is the first approach that extracts the complex behavioral patterns of users from both trajectory data and the structure of User-POI bipartite graphs. Extensive experiments on several real-world datasets show that ANES outperforms state-of-the-art baselines.
In contrast to active social networks, on some platforms, such as online shopping websites and restaurant review sites, people are connected to other users regardless of their intentions. They know nothing about each other in advance; all they share is that they have visited or plan to visit the same place, or have purchased or plan to purchase the same product. Interestingly, users' purchase intentions tend to be influenced by review data.
Unfortunately, this tendency is easily exploited by opinion spammers. In SC-Com, I focus on opinion spam detection in online shopping services. In many cases, consumers' decision-making is closely tied to online reviews. However, opinion spam written by hired reviewers, which aims to mislead potential customers by hiding genuine consumers' opinions, is a growing threat. Opinion spam has to be piled up collectively to distort the true information, and this collusiveness makes it possible to detect. In this paper, I propose SC-Com, an optimized collusive community detection framework. It constructs a graph of reviewers from the collusiveness of their behavior and divides the graph into communities based on their mutual suspiciousness. I then extract community-based and temporal abnormality features, which are critical for discriminating spammers from genuine users. I show that the method detects collusive opinion spam reviewers effectively and precisely from their collective behavioral patterns. On a real-world dataset, the approach showed strong performance while considering only primary data such as time and ratings.
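The idea of projecting reviews onto a reviewer graph and splitting it into communities can be illustrated with a small Python sketch. This is not the SC-Com scoring or community detection itself: the pairwise score here (the number of products two reviewers rated similarly within a few days), the thresholds, and the use of connected components as communities are simplifying assumptions, and the data is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical reviews: (reviewer, product, rating, day).
reviews = [
    ("r1", "p1", 5, 1), ("r2", "p1", 5, 2), ("r3", "p1", 2, 40),
    ("r1", "p2", 5, 10), ("r2", "p2", 5, 11), ("r4", "p2", 4, 90),
]


def collusion_scores(reviews, max_days=3, max_rating_gap=1):
    """Count products that two reviewers rated similarly at nearly the same time."""
    by_product = defaultdict(list)
    for reviewer, product, rating, day in reviews:
        by_product[product].append((reviewer, rating, day))
    scores = defaultdict(int)
    for entries in by_product.values():
        for (u, ru, du), (v, rv, dv) in combinations(entries, 2):
            close_in_time = abs(du - dv) <= max_days
            similar_rating = abs(ru - rv) <= max_rating_gap
            if u != v and close_in_time and similar_rating:
                scores[tuple(sorted((u, v)))] += 1
    return dict(scores)


def communities(scores, min_score=2):
    """Group reviewers joined by edges scoring at least min_score (connected components)."""
    neighbors = defaultdict(set)
    for (u, v), score in scores.items():
        if score >= min_score:
            neighbors[u].add(v)
            neighbors[v].add(u)
    seen, groups = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


scores = collusion_scores(reviews)
groups = communities(scores)
print(scores, groups)
```

In this toy data only `r1` and `r2` repeatedly post similar ratings within days of each other, so they form the single suspicious group. Per-community statistics could then serve as features for a classifier, in the spirit of the feature extraction step described above.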
The implicit network inference models studied on various data in this thesis predict users who are likely to be pre-connected even in unlabeled data, so they are expected to contribute to areas such as advertising recommendation systems and malicious user detection by providing useful information.
Chapter 1 Introduction
Chapter 2 Social link Inference in Location-based check-in data
2.1 Background
2.2 Related Work
2.3 Location-based Social Network Service Data
2.4 Aspect-wise Graph Decomposition
2.5 Aspect-wise Graph learning
2.6 Inferring Social Relation from User Representation
2.7 Performance Analysis
2.8 Discussion and Implications
2.9 Summary
Chapter 3 Detecting collusiveness from reviews in Online platforms and its application
3.1 Background
3.2 Related Work
3.3 Online Review Data
3.4 Collusive Graph Projection
3.5 Reviewer Community Detection
3.6 Review Community feature extraction and spammer detection
3.7 Performance Analysis
3.8 Discussion and Implications
3.9 Summary
Chapter 4 Conclusion
Network Representation Learning: A Survey
With the widespread use of information technologies, information networks are
becoming increasingly popular to capture complex relationships across various
disciplines, such as social networks, citation networks, telecommunication
networks, and biological networks. Analyzing these networks sheds light on
different aspects of social life such as the structure of societies,
information diffusion, and communication patterns. In reality, however, the
large scale of information networks often makes network analytic tasks
computationally expensive or intractable. Network representation learning has
been recently proposed as a new learning paradigm to embed network vertices
into a low-dimensional vector space, by preserving network topology structure,
vertex content, and other side information. This facilitates the original
network to be easily handled in the new vector space for further analysis. In
this survey, we perform a comprehensive review of the current literature on
network representation learning in the data mining and machine learning field.
We propose new taxonomies to categorize and summarize the state-of-the-art
network representation learning techniques according to the underlying learning
mechanisms, the network information intended to preserve, as well as the
algorithmic designs and methodologies. We summarize evaluation protocols used
for validating network representation learning including published benchmark
datasets, evaluation methods, and open source algorithms. We also perform
empirical studies to compare the performance of representative algorithms on
common datasets, and analyze their computational complexity. Finally, we
suggest promising research directions to facilitate future study.
Comment: Accepted by IEEE Transactions on Big Data; 25 pages, 10 tables, 6 figures and 127 references
Representation learning on heterogeneous spatiotemporal networks
“The problem of learning latent representations of heterogeneous networks with spatial and temporal attributes has been gaining traction in recent years, given its myriad of real-world applications. Most systems with applications in the field of transportation, urban economics, medical information, online e-commerce, etc., handle big data that can be structured into Spatiotemporal Heterogeneous Networks (SHNs), thereby making efficient analysis of these networks extremely vital. In recent years, representation learning models have proven to be quite efficient in capturing effective lower-dimensional representations of data. But, capturing efficient representations of SHNs continues to pose a challenge for the following reasons: (i) Spatiotemporal data that is structured as SHN encapsulate complex spatial and temporal relationships that exist among real-world objects, rendering traditional feature engineering approaches inefficient and compute-intensive; (ii) Due to the unique nature of the SHNs, existing representation learning techniques cannot be directly adopted to capture their representations.
To address the problem of learning representations of SHNs, four novel frameworks that focus on their unique spatial and temporal characteristics are introduced: (i) collective representation learning, which focuses on quantifying the importance of each latent feature using Laplacian scores; (ii) modality aware representation learning, which learns from the complex user mobility pattern; (iii) distributed representation learning, which focuses on learning human mobility patterns by leveraging Natural Language Processing algorithms; and (iv) representation learning with node sense disambiguation, which learns contrastive senses of nodes in SHNs. The developed frameworks can help us capture higher-order spatial and temporal interactions of real-world SHNs. Through data-driven simulations, machine learning and deep learning models trained on the representations learned from the developed frameworks are proven to be much more efficient and effective”--Abstract, page iii
Recommending on graphs: a comprehensive review from a data perspective
Recent advances in graph-based learning approaches have demonstrated their
effectiveness in modelling users' preferences and items' characteristics for
Recommender Systems (RSS). Most of the data in RSS can be organized into graphs
where various objects (e.g., users, items, and attributes) are explicitly or
implicitly connected and influence each other via various relations. Such a
graph-based organization brings benefits to exploiting potential properties in
graph learning (e.g., random walk and network embedding) techniques to enrich
the representations of the user and item nodes, which is an essential factor
for successful recommendations. In this paper, we provide a comprehensive
survey of Graph Learning-based Recommender Systems (GLRSs). Specifically, we
start from a data-driven perspective to systematically categorize various
graphs in GLRSs and analyze their characteristics. Then, we discuss the
state-of-the-art frameworks with a focus on the graph learning module and how
they address practical recommendation challenges such as scalability, fairness,
diversity, explainability and so on. Finally, we share some potential research
directions in this rapidly growing area.
Comment: Accepted by UMUA
A Survey on Graph Representation Learning Methods
Graph representation learning has been a very active research area in recent
years. The goal of graph representation learning is to generate graph
representation vectors that capture the structure and features of large graphs
accurately. This is especially important because the quality of the graph
representation vectors will affect the performance of these vectors in
downstream tasks such as node classification, link prediction and anomaly
detection. Many techniques have been proposed for generating effective graph
representation vectors. Two of the most prevalent categories of graph
representation learning are graph embedding methods without using graph neural
nets (GNN), which we denote as non-GNN based graph embedding methods, and graph
neural nets (GNN) based methods. Non-GNN graph embedding methods are based on
techniques such as random walks, temporal point processes and neural network
learning methods. GNN-based methods, on the other hand, are the application of
deep learning on graph data. In this survey, we provide an overview of these
two categories and cover the current state-of-the-art methods for both static
and dynamic graphs. Finally, we explore some open and ongoing research
directions for future work.
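To make the random-walk family of non-GNN embedding methods mentioned above more concrete, here is a minimal Python sketch of the walk-generation step only. The toy graph, the function name `random_walks`, and its parameters are illustrative assumptions. In methods of this kind the generated walks are typically treated like sentences and passed to a skip-gram style model to learn node vectors, a step that is omitted here.

```python
import random

# Hypothetical toy graph as an adjacency list.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}


def random_walks(graph, walks_per_node=2, walk_length=5, seed=0):
    """Generate uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)  # visit start nodes in a random order each pass
        for start in nodes:
            walk = [start]
            # Extend the walk by moving to a uniformly chosen neighbor.
            while len(walk) < walk_length and graph[walk[-1]]:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


walks = random_walks(graph)
print(len(walks), walks[0])
```

Nodes that appear close together in many walks end up with similar vectors once an embedding model is trained on them, which is how such methods preserve neighborhood structure.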