21 research outputs found

    Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse

    Full text link
    Domain squatting is a common adversarial practice where attackers register domain names that are purposefully similar to popular domains. In this work, we study a specific type of domain squatting called "combosquatting," in which attackers register domains that combine a popular trademark with one or more phrases (e.g., betterfacebook[.]com, youtube-live[.]com). We perform the first large-scale, empirical study of combosquatting by analyzing more than 468 billion DNS records---collected from passive and active DNS data sources over almost six years. We find that almost 60% of abusive combosquatting domains live for more than 1,000 days, and even worse, we observe increased activity associated with combosquatting year over year. Moreover, we show that combosquatting is used to perform a spectrum of different types of abuse including phishing, social engineering, affiliate abuse, trademark abuse, and even advanced persistent threats. Our results suggest that combosquatting is a real problem that requires increased scrutiny by the security community.Comment: ACM CCS 1

    Defending Against Typosquatting Attacks in Programming Language-Based Package Repositories

    Get PDF
    Program size and complexity have dramatically increased over time. To reduce their workload, developers began to utilize package managers. These packages managers allow third-party functionality, contained in units called packages, to be quickly imported into a project. Due to their utility, packages have become remarkably popular. The largest package repository, npm, has more than 1.2 million publicly available packages and serves more than 80 billion package downloads per month. In recent years, this popularity has attracted the attention of malicious users. Attackers have the ability to upload packages which contain malware. To increase the number of victims, attackers regularly leverage a tactic called typosquatting, which involves giving the malicious package a name that is very similar to the name of a popular package. Users who make a typo when trying to install the popular package fall victim to the attack and are instead served the malicious payload. The consequences of typosquatting attacks can be catastrophic. Historical typosquatting attacks have exported passwords, stolen cryptocurrency, and opened reverse shells. This thesis focuses on typosquatting attacks in package repositories. It explores the extent to which typosquatting exists in npm and PyPI (the de facto standard package repositories for Node.js and Python, respectively), proposes a practical defense against typosquatting attacks, and quantifies the efficacy of the proposed defense. The presented solution incurs an acceptable temporal overhead of 2.5% on the standard package installation process and is expected to affect approximately 0.5% of all weekly package downloads. Furthermore, it has been used to discover a particularly high-profile typosquatting perpetrator, which was then reported and has since been deprecated by npm. Typosquatting is an important yet preventable problem. This thesis recommends package creators to protect their own packages with a technique called defensive typosquatting and repository maintainers to protect all users through augmentations to their package managers or automated monitoring of the package namespace

    Heuristic methods for efficient identification of abusive domain names

    Get PDF

    Adversarial Matching of Dark Net Market Vendor Accounts

    Get PDF
    Many datasets feature seemingly disparate entries that actually refer to the same entity. Reconciling these entries, or matching, is challenging, especially in situations where there are errors in the data. In certain contexts, the situation is even more complicated: an active adversary may have a vested interest in having the matching process fail. By leveraging eight years of data, we investigate one such adversarial context: matching different online anonymous marketplace vendor handles to unique sellers. Using a combination of random forest classifiers and hierarchical clustering on a set of features that would be hard for an adversary to forge or mimic, we manage to obtain reasonable performance (over 75% precision and recall on labels generated using heuristics), despite generally lacking any ground truth for training. Our algorithm performs particularly well for the top 30% of accounts by sales volume, and hints that 22,163 accounts with at least one confirmed sale map to 15,652 distinct sellers---of which 12,155 operate only one account, and the remainder between 2 and 11 different accounts. Case study analysis further confirms that our algorithm manages to identify non-trivial matches, as well as impersonation attempts

    D-FENS: DNS Filtering & Extraction Network System for Malicious Domain Names

    Get PDF
    While the DNS (Domain Name System) has become a cornerstone for the operation of the Internet, it has also fostered creative cases of maliciousness, including phishing, typosquatting, and botnet communication among others. To address this problem, this dissertation focuses on identifying and mitigating such malicious domain names through prior knowledge and machine learning. In the first part of this dissertation, we explore a method of registering domain names with deliberate typographical mistakes (i.e., typosquatting) to masquerade as popular and well-established domain names. To understand the effectiveness of typosquatting, we conducted a user study which helped shed light on which techniques were more successful than others in deceiving users. While certain techniques fared better than others, they failed to take the context of the user into account. Therefore, in the second part of this dissertation we look at the possibility of an advanced attack which takes context into account when generating domain names. The main idea is determining the possibility for an adversary to improve their success rate of deceiving users with specifically-targeted malicious domain names. While these malicious domains typically target users, other types of domain names are generated by botnets for command & control (C2) communication. Therefore, in the third part of this dissertation we investigate domain generation algorithms (DGA) used by botnets and propose a method to identify DGA-based domain names. By analyzing DNS traffic for certain patterns of NXDomain (non-existent domain) query responses, we can accurately predict DGA-based domain names before they are registered. Given all of these approaches to malicious domain names, we ultimately propose a system called D-FENS (DNS Filtering & Extraction Network System). D-FENS uses machine learning and prior knowledge to accurately predict unreported malicious domain names in real-time, thereby preventing Internet devices from unknowingly connecting to a potentially malicious domain name

    Network-based detection of malicious activities - a corporate network perspective

    Get PDF
    corecore