Detecting Compromised Implicit Association Test Results Using Supervised Learning

Author: Boldt Brendon
Breimer Eric
While Zack
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/09/2019

An implicit association test (IAT) is a human psychological test used to measure subconscious associations. While it is widely recognized by psychologists as an effective tool for measuring attitudes and biases, the validity of the results can be compromised if a subject does not follow the instructions or attempts to manipulate the outcome. Compared to previous work, we collect training data using a more generalized methodology. We train a variety of classifiers to distinguish a participant's first attempt from a second, possibly compromised attempt. To compromise the second attempt, participants are shown their score and are instructed to change it using one of five randomly selected deception methods. Our methodology demonstrates a more robust and practical framework than previous work for accurately identifying a wide variety of deception techniques applicable to the IAT. Comment: 6 pages, 1 figure

arXiv.org e-Print Archive

Crossref

Malicious Package Detection in NPM and PyPI using a Single Model of Malicious Behavior Sequence

Author: Chen Bihuan
Huang Kaifeng
Peng Xin
Tian Zhenhao
Wang Chong
Zhang Junan
Publication date: 05/09/2023

The open-source software (OSS) supply chain enlarges the attack surface, which makes package registries attractive targets for attacks. Recently, the package registries NPM and PyPI have been flooded with malicious packages. The effectiveness of existing malicious NPM and PyPI package detection approaches is hindered by two challenges. The first challenge is how to leverage the knowledge of malicious packages from different ecosystems in a unified way so that multi-lingual malicious package detection becomes feasible. The second challenge is how to model malicious behavior in a sequential way so that maliciousness can be precisely captured. To address the two challenges, we propose and implement Cerebro to detect malicious packages in NPM and PyPI. We curate a feature set based on a high-level abstraction of malicious behavior to enable multi-lingual knowledge fusion. We organize extracted features into a behavior sequence to model sequential malicious behavior. We fine-tune the BERT model to understand the semantics of malicious behavior. Extensive evaluation has demonstrated the effectiveness of Cerebro over the state of the art as well as its practically acceptable efficiency. Cerebro has successfully detected 306 and 196 new malicious packages in PyPI and NPM, respectively, and received 385 thank-you letters from the official PyPI and NPM teams.
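
The idea of a shared behavior abstraction across ecosystems can be illustrated with a minimal sketch. The mapping table, the behavior labels, and the function name below are hypothetical and are not taken from Cerebro; the sketch only shows how language-specific API calls might be normalized into one ordered behavior sequence that a sequence model could then consume.

```python
# Hypothetical mapping from language-specific API calls to abstract behaviors.
# These entries are illustrative assumptions, not Cerebro's feature set.
BEHAVIOR_MAP = {
    "os.environ": "READ_ENV",                  # Python
    "process.env": "READ_ENV",                 # JavaScript
    "urllib.request.urlopen": "NETWORK_SEND",  # Python
    "https.request": "NETWORK_SEND",           # JavaScript
    "subprocess.Popen": "SPAWN_PROCESS",       # Python
    "child_process.exec": "SPAWN_PROCESS",     # JavaScript
}


def to_behavior_sequence(api_calls):
    """Map API calls, in order of occurrence, to abstract behavior tokens.

    Calls without a known mapping are skipped.
    """
    return [BEHAVIOR_MAP[call] for call in api_calls if call in BEHAVIOR_MAP]


# Toy call traces for one PyPI package and one NPM package.
python_pkg = ["os.environ", "json.dumps", "urllib.request.urlopen"]
npm_pkg = ["process.env", "JSON.stringify", "https.request"]

# Both traces reduce to the same language-independent token sequence.
print(to_behavior_sequence(python_pkg))
print(to_behavior_sequence(npm_pkg))
```

Because both ecosystems share one token vocabulary in this sketch, a single model could in principle be trained on sequences from either registry.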

arXiv.org e-Print Archive

Intrusion detection using machine learning-hardened domain generation algorithms

Author: Shihab Mustafa Abdulmajeed
Publication venue: 'International University of Sarajevo'
Publication date: 16/12/2020

Machine learning has recently been applied in a variety of areas in information technology because of its advantages over typical computer algorithms. Machine learning approaches are being integrated into cybersecurity detection with the primary aim of supporting, or providing an alternative to, the first line of defense in networks. Although the automation of these detection and analysis systems is potent in today’s changing technological environment, the usefulness of machine learning in cybersecurity requires evaluation. In this research, we present an analysis and address cybersecurity concerns of machine learning techniques used in the detection of intrusion, spam, and malware. The analysis evaluates the current maturity of machine learning solutions and identifies their primary limitations, which have prevented the immediate adoption of machine learning in cybersecurity.

Periodicals of Engineering and Natural Sciences (PEN - International University of Sarajevo)

Latent Network Mining in Social Networks and E-commerce Platforms

Author: 변형호
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2023

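Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, February 2023. 권태경. With the explosive growth of web-based services, users are widely connected online. On online platforms, users influence one another and tend to reflect their experiences and opinions in decision-making. This dissertation studies user behavior on two representative online platforms: social network services and e-commerce platforms. User behavior on an online platform can be expressed as a relation between the user and platform components. A user's purchase is represented as a relation between the user and a product, and a user's check-in as a relation between the user and a place. Information such as the time of the action, ratings, and tags can also be included. This work presents studies that identify the latent networks influencing the user behavior graphs defined on the two platforms. In location-based social network services, many posts are created as check-ins at specific places, and a user's visits are strongly influenced by pre-existing friendships among users. Identifying the relationships among users that are latent beneath the user activity network can help with activity prediction, and to this end this dissertation proposes an unsupervised approach to extracting social relationships between users from the activity network. Previously studied methods predicted relationships between users by focusing on co-visitation, in which two users visit a place at the same time, or performed representation learning using network embedding or graph neural networks (GNNs). However, these approaches fail to capture users' behavioral patterns, typified by periodic visits or long-distance movements. To learn behavioral patterns better, ANES learns aspect-oriented relations between users and points of interest (POIs) within the user context. ANES is the first unsupervised approach that divides user behavior into multiple aspects within the structure of the User-POI bipartite graph and extracts behavioral patterns by considering each relation. In extensive experiments on real LBSN data, ANES outperforms previously proposed techniques. Unlike in location-based social networks, in e-commerce review systems users exchange information and exert influence on one another through the platform without performing active actions such as following. These behavioral characteristics can easily be exploited by review spam. Review spam works by hiding real users' opinions and manipulating ratings to convey false information. To address this, I propose SC-Com, a method that finds the possibility of prior collusiveness among users in user review data and uses it for spam detection. SC-Com computes collusion scores between users from the collusiveness of their behavior and, based on those scores, partitions all users into communities of similar users. It then extracts graph-based features that are important for distinguishing spam users from ordinary users and uses them as input to a supervised classifier. SC-Com effectively detects sets of collusive spam users. In experiments on real datasets, SC-Com showed better spam detection performance than prior work. The implicit network detection models studied on various data in this dissertation predict users who are likely to have been connected in advance, even for unlabeled data. They are therefore expected to provide useful information for diverse data such as real-time location data and app usage data, and to contribute to areas such as advertising recommendation systems and malicious user detection.
With the explosive growth in the use of online services, people are connected with each other more broadly.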
On online platforms, people influence each other and tend to reflect their opinions in decision-making. Social Network Services (SNSs) and e-commerce are typical examples of online platforms. User behaviors on online platforms can be defined as relations between users and platform components. A user's purchase is a relationship between a user and a product, and a user's check-in is a relationship between a user and a place. Information such as action time, rating, and tags may also be included. In many studies, platform user behavior is represented in graph form. The nodes of the graph are objects within the platform, such as users, products, and places, and an interaction between a user and a platform element is expressed as an edge connecting two nodes. In this study, I present studies to identify latent networks that affect the user behavior graphs defined on the two platforms. In ANES, I focus on representation learning for social link inference based on user trajectory data. While traditional methods predict relations between users by considering hand-crafted features, recent studies first perform representation learning using network/node embedding or graph neural networks (GNNs) for downstream tasks such as node classification and link prediction. However, those approaches fail to capture the behavioral patterns of individuals ingrained in periodic visits or long-distance movements. To better learn behavioral patterns, this paper proposes a novel scheme called ANES (Aspect-oriented Network Embedding for Social link inference). ANES learns aspect-oriented relations between users and Points of Interest (POIs) within their contexts. ANES is the first approach that extracts the complex behavioral patterns of users from both trajectory data and the structure of User-POI bipartite graphs. Extensive experiments on several real-world datasets show that ANES outperforms state-of-the-art baselines.
In contrast to active social networks, on some platforms, such as online shopping websites and restaurant review sites, people are connected to other users regardless of their intentions. They have no information about each other in advance, and all they have in common is that they have visited or plan to visit the same place, or have purchased or plan to purchase the same product. Interestingly, users' purchase intentions tend to be influenced by review data. Unfortunately, this tendency is easily exploited by opinion spammers. In SC-Com, I focus on opinion spam detection in online shopping services. In many cases, a consumer's decision-making process is closely related to online reviews. However, opinion spam by hired reviewers is an increasing threat; it aims to mislead potential customers by hiding genuine consumers' opinions. Opinion spam must be piled up collectively to falsify true information. Fortunately, this collusiveness makes it possible to detect the spammers. In this paper, I propose SC-Com, an optimized collusive community detection framework. It constructs a graph of reviewers from the collusiveness of their behavior and divides the graph into communities based on their mutual suspiciousness. After that, I extract community-based and temporal abnormality features that are critical for discriminating spammers from genuine users. I show that my method detects collusive opinion spam reviewers effectively and precisely from their collective behavioral patterns. On a real-world dataset, my approach showed strong performance while considering only primary data such as time and ratings.
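
The pipeline described above (score reviewer pairs for collusiveness, build a reviewer graph, split it into communities) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not SC-Com itself: the scoring rule (same rating on the same product within a few days), the thresholds, the toy reviews, and the use of connected components in place of a real community detection algorithm are all hypothetical simplifications.

```python
from collections import defaultdict
from itertools import combinations

# Each review: (user, product, rating, day). Toy data, purely illustrative.
reviews = [
    ("u1", "p1", 5, 1), ("u2", "p1", 5, 1), ("u3", "p1", 5, 2),
    ("u1", "p2", 5, 10), ("u2", "p2", 5, 10), ("u3", "p2", 5, 11),
    ("u4", "p1", 2, 40), ("u5", "p2", 4, 90),
]


def collusion_scores(reviews, max_day_gap=2):
    """Count, per user pair, products given the same rating within max_day_gap days.

    This scoring rule is an assumption made for the sketch.
    """
    by_product = defaultdict(list)
    for user, product, rating, day in reviews:
        by_product[product].append((user, rating, day))
    scores = defaultdict(int)
    for entries in by_product.values():
        for (u, r1, d1), (v, r2, d2) in combinations(entries, 2):
            if u != v and r1 == r2 and abs(d1 - d2) <= max_day_gap:
                scores[tuple(sorted((u, v)))] += 1
    return scores


def communities(scores, min_score=2):
    """Return connected components of the graph whose edges have score >= min_score."""
    graph = defaultdict(set)
    for (u, v), score in scores.items():
        if score >= min_score:
            graph[u].add(v)
            graph[v].add(u)
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


print(communities(collusion_scores(reviews)))
```

In a fuller pipeline, per-community features (for example size or rating burstiness) would then feed a supervised classifier, as the abstract describes; that step is omitted here.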
The implicit network inference models studied on various data in this thesis predict users who are likely to be pre-connected, even in unlabeled data, so they are expected to contribute to areas such as advertising recommendation systems and malicious user detection by providing useful information.
Chapter 1 Introduction
Chapter 2 Social link Inference in Location-based check-in data
2.1 Background
2.2 Related Work
2.3 Location-based Social Network Service Data
2.4 Aspect-wise Graph Decomposition
2.5 Aspect-wise Graph learning
2.6 Inferring Social Relation from User Representation
2.7 Performance Analysis
2.8 Discussion and Implications
2.9 Summary
Chapter 3 Detecting collusiveness from reviews in Online platforms and its application
3.1 Background
3.2 Related Work
3.3 Online Review Data
3.4 Collusive Graph Projection
3.5 Reviewer Community Detection
3.6 Review Community feature extraction and spammer detection
3.7 Performance Analysis
3.8 Discussion and Implications
3.9 Summary
Chapter 4 Conclusion

SNU Open Repository and Archive

Training with More Confidence: Mitigating Injected and Natural Backdoors During Training

Author: Ding Hailun
Ma Shiqing
Wang Zhenting
Zhai Juan
Publication date: 27/10/2022

The backdoor, or Trojan, attack is a severe threat to deep neural networks (DNNs). Researchers have found that DNNs trained on benign data and settings can also learn backdoor behaviors, which is known as the natural backdoor. Existing works on anti-backdoor learning are based on the weak observation that backdoor and benign behaviors can be differentiated during training. An adaptive attack with slow poisoning can bypass such defenses. Moreover, these methods cannot defend against natural backdoors. We found a fundamental difference between backdoor-related neurons and benign neurons: backdoor-related neurons form a hyperplane as the classification surface across the input domains of all affected labels. By further analyzing the training process and model architectures, we found that piece-wise linear functions cause this hyperplane surface. In this paper, we design a novel training method that forces the training to avoid generating such hyperplanes and thus removes the injected backdoors. Our extensive experiments on five datasets against five state-of-the-art attacks, and also on benign training, show that our method can outperform existing state-of-the-art defenses. On average, the ASR (attack success rate) of the models trained with NONE is 54.83 times lower than that of undefended models under the standard poisoning backdoor attack and 1.75 times lower under the natural backdoor attack. Our code is available at https://github.com/RU-System-Software-and-Security/NONE
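
For readers unfamiliar with the metric, the following is a minimal sketch of how an attack success rate and a "times lower" ratio can be computed. It assumes one common definition of ASR (the fraction of trigger-stamped inputs, excluding those whose true class is already the target, that the model assigns to the attacker's target label); the function name and the toy predictions are hypothetical and do not come from the paper or its repository.

```python
def attack_success_rate(predictions, true_labels, target_label):
    """Fraction of triggered inputs (true class != target) predicted as the target label."""
    eligible = [p for p, y in zip(predictions, true_labels) if y != target_label]
    if not eligible:
        raise ValueError("no triggered inputs outside the target class")
    return sum(p == target_label for p in eligible) / len(eligible)


# Toy predictions on six trigger-stamped inputs; the target label is 0.
true_labels = [1, 2, 3, 1, 2, 0]
undefended_preds = [0, 0, 0, 0, 2, 0]  # hypothetical undefended model
defended_preds = [1, 2, 3, 0, 2, 0]    # hypothetical defended model

asr_undefended = attack_success_rate(undefended_preds, true_labels, target_label=0)
asr_defended = attack_success_rate(defended_preds, true_labels, target_label=0)

# "k times lower" is read here as the ratio of the two rates.
reduction_factor = asr_undefended / asr_defended
print(asr_undefended, asr_defended, reduction_factor)
```

A ratio like this is undefined when the defended ASR is zero, so averaged figures across datasets and attacks depend on how such cases are handled.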

arXiv.org e-Print Archive