38 research outputs found

    Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse

    Full text link
    Domain squatting is a common adversarial practice where attackers register domain names that are purposefully similar to popular domains. In this work, we study a specific type of domain squatting called "combosquatting," in which attackers register domains that combine a popular trademark with one or more phrases (e.g., betterfacebook[.]com, youtube-live[.]com). We perform the first large-scale, empirical study of combosquatting by analyzing more than 468 billion DNS records---collected from passive and active DNS data sources over almost six years. We find that almost 60% of abusive combosquatting domains live for more than 1,000 days, and even worse, we observe increased activity associated with combosquatting year over year. Moreover, we show that combosquatting is used to perform a spectrum of different types of abuse including phishing, social engineering, affiliate abuse, trademark abuse, and even advanced persistent threats. Our results suggest that combosquatting is a real problem that requires increased scrutiny by the security community.Comment: ACM CCS 1

    CharBot: A Simple and Effective Method for Evading DGA Classifiers

    Full text link
    Domain generation algorithms (DGAs) are commonly leveraged by malware to create lists of domain names which can be used for command and control (C&C) purposes. Approaches based on machine learning have recently been developed to automatically detect generated domain names in real-time. In this work, we present a novel DGA called CharBot which is capable of producing large numbers of unregistered domain names that are not detected by state-of-the-art classifiers for real-time detection of DGAs, including the recently published methods FANCI (a random forest based on human-engineered features) and LSTM.MI (a deep learning approach). CharBot is very simple, effective and requires no knowledge of the targeted DGA classifiers. We show that retraining the classifiers on CharBot samples is not a viable defense strategy. We believe these findings show that DGA classifiers are inherently vulnerable to adversarial attacks if they rely only on the domain name string to make a decision. Designing a robust DGA classifier may, therefore, necessitate the use of additional information besides the domain name alone. To the best of our knowledge, CharBot is the simplest and most efficient black-box adversarial attack against DGA classifiers proposed to date

    Defending Against Typosquatting Attacks in Programming Language-Based Package Repositories

    Get PDF
    Program size and complexity have dramatically increased over time. To reduce their workload, developers began to utilize package managers. These packages managers allow third-party functionality, contained in units called packages, to be quickly imported into a project. Due to their utility, packages have become remarkably popular. The largest package repository, npm, has more than 1.2 million publicly available packages and serves more than 80 billion package downloads per month. In recent years, this popularity has attracted the attention of malicious users. Attackers have the ability to upload packages which contain malware. To increase the number of victims, attackers regularly leverage a tactic called typosquatting, which involves giving the malicious package a name that is very similar to the name of a popular package. Users who make a typo when trying to install the popular package fall victim to the attack and are instead served the malicious payload. The consequences of typosquatting attacks can be catastrophic. Historical typosquatting attacks have exported passwords, stolen cryptocurrency, and opened reverse shells. This thesis focuses on typosquatting attacks in package repositories. It explores the extent to which typosquatting exists in npm and PyPI (the de facto standard package repositories for Node.js and Python, respectively), proposes a practical defense against typosquatting attacks, and quantifies the efficacy of the proposed defense. The presented solution incurs an acceptable temporal overhead of 2.5% on the standard package installation process and is expected to affect approximately 0.5% of all weekly package downloads. Furthermore, it has been used to discover a particularly high-profile typosquatting perpetrator, which was then reported and has since been deprecated by npm. Typosquatting is an important yet preventable problem. This thesis recommends package creators to protect their own packages with a technique called defensive typosquatting and repository maintainers to protect all users through augmentations to their package managers or automated monitoring of the package namespace

    CharBot : a simple and effective method for evading DGA classifiers

    No full text
    Domain generation algorithms (DGAs) are commonly leveraged by malware to create lists of domain names, which can be used for command and control (C&C) purposes. Approaches based on machine learning have recently been developed to automatically detect generated domain names in real-time. In this paper, we present a novel DGA called CharBot, which is capable of producing large numbers of unregistered domain names that are not detected by state-of-the-art classifiers for real-time detection of the DGAs, including the recently published methods FANCI (a random forest based on human-engineered features) and LSTM.MI (a deep learning approach). The CharBot is very simple, effective, and requires no knowledge of the targeted DGA classifiers. We show that retraining the classifiers on CharBot samples is not a viable defense strategy. We believe these findings show that DGA classifiers are inherently vulnerable to adversarial attacks if they rely only on the domain name string to make a decision. Designing a robust DGA classifier may, therefore, necessitate the use of additional information besides the domain name alone. To the best of our knowledge, the CharBot is the simplest and most efficient black-box adversarial attack against DGA classifiers proposed to date

    Lost in Translation: AI-based Generator of Cross-Language Sound-Squatting

    Get PDF
    Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. It is an understudied threat that gains traction with the popularity of smart-speakers and the resurgence of content consumption exclusively via audio, such as podcasts. Defending against sound-squatting is complex, and existing solutions rely on manually curated lists of homophones, which limits the search to a few (and mostly existing) words only. We introduce Sound-squatter, a multi-language AI-based system that generates sound-squatting candidates for a proactive defence that covers over 80\% of exact homophones and further generates thousands of high-quality approximated homophones. Sound-squatter relies on a state-of-art Transformer Network to learn transliteration. We search for Sound-squatter generated cross-language sound-squatting domains over hundreds of millions of emitted TLS certificates compared with other types of squatting candidates. Our finding reveals that around 6% of generated sound-squatting candidates have emitted TLS certificates, compared to 8% of other types of squatting candidates. We believe \Sound-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against sound-squatting
    corecore