
    Prediction of drive-by download attacks on Twitter

    The popularity of Twitter for information discovery, coupled with the automatic shortening of URLs to save space given the 140 character limit, provides cybercriminals with an opportunity to obfuscate the URL of a malicious Web page within a tweet. Once the URL is obfuscated, the cybercriminal can lure a user to click on it with enticing text and images before carrying out a cyber attack using a malicious Web server. This is known as a drive-by download. In a drive-by download, a user’s computer system is infected while interacting with the malicious endpoint, often without the user being aware the attack has taken place. An attacker can gain control of the system by exploiting unpatched system vulnerabilities, and this form of attack currently represents one of the most common methods employed. In this paper we build a machine learning model using machine activity data and tweet metadata to move beyond post-execution classification of such URLs as malicious, predicting that a URL will be malicious with 0.99 F-measure (using 10-fold cross-validation) and 0.833 (using an unseen test set) at 1 second into the interaction with the URL. This provides a basis from which to kill the connection to the server before an attack completes, proactively blocking and preventing the attack rather than reacting and repairing at a later date.
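    The idea of deciding at one second into the interaction can be sketched as a rule over a feature vector combining machine activity and tweet metadata. The feature names and thresholds below are illustrative assumptions, not the authors' actual model, which learned its boundaries from data.

    ```python
    # Minimal sketch: flag a URL interaction as malicious from features captured
    # 1 second in. All feature names and thresholds are hypothetical.

    def extract_features(snapshot):
        """Flatten a machine-activity snapshot plus tweet metadata into a vector."""
        return [
            snapshot["cpu_percent"],       # machine activity at t = 1s
            snapshot["bytes_received"],    # network activity at t = 1s
            snapshot["new_processes"],     # processes spawned during the visit
            snapshot["account_age_days"],  # tweet metadata
        ]

    def predict_malicious(snapshot, thresholds=(80.0, 500_000, 1, 30)):
        """Flag as malicious if at least two features cross their (illustrative)
        thresholds; a real classifier would learn these boundaries from data."""
        feats = extract_features(snapshot)
        hits = sum(
            f >= t if i != 3 else f <= t   # young accounts count as suspicious
            for i, (f, t) in enumerate(zip(feats, thresholds))
        )
        return hits >= 2

    benign = {"cpu_percent": 12.0, "bytes_received": 40_000,
              "new_processes": 0, "account_age_days": 900}
    drive_by = {"cpu_percent": 95.0, "bytes_received": 2_000_000,
                "new_processes": 3, "account_age_days": 2}
    ```

    A decision at this point lets a monitor tear down the connection before the payload finishes, which is the proactive blocking the abstract describes.
    
    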

    Emotions behind drive-by download propagation on Twitter

    Twitter has emerged as one of the most popular platforms to get updates on entertainment and current events. However, due to its 280 character restriction and automatic shortening of URLs, it is continuously targeted by cybercriminals to carry out drive-by download attacks, where a user’s system is infected by merely visiting a Web page. Popular events that attract a large number of users are used by cybercriminals to infect and propagate malware by using popular hashtags and creating misleading tweets to lure users to malicious Web pages. A drive-by download attack is carried out by obfuscating a malicious URL in an enticing tweet that is used as clickbait to lure users to a malicious Web page. In this paper we answer the following two questions: Why are certain malicious tweets retweeted more than others? Do emotions reflected in a tweet drive virality? We gathered tweets from seven different sporting events over three years and identified those tweets used to carry out a drive-by download attack. From the malicious (N=105,642) and benign (N=169,178) data sample identified, we built models to predict information flow size and survival. We define size as the number of retweets of an original tweet, and survival as the duration of the original tweet’s presence in the study window. We selected the zero-truncated negative binomial (ZTNB) regression method for our analysis based on the distribution exhibited by our dependent size measure and the comparison of results with other predictive models. We used the Cox regression technique to model the survival of information flows, as it estimates proportional hazard rates for independent measures. Our results show that both social and content factors are statistically significant for the size and survival of information flows for both malicious and benign tweets. In the benign data sample, positive emotions and positive sentiment reflected in the tweet significantly predict size and survival.
In contrast, for the malicious data sample, negative emotions, especially fear, are associated with both the size and survival of information flows.
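    The two dependent measures can be computed directly from a cascade of retweet records before any regression is fit. This is a minimal sketch assuming records of the form (tweet_id, parent_id, timestamp); the timestamps and IDs are illustrative, not the papers' event data.

    ```python
    # size = number of retweets of an original tweet within the study window;
    # survival = time from the original post to its last observed retweet.
    from collections import defaultdict

    def cascade_measures(records, study_end):
        """Return {original_id: (size, survival)} for each original tweet."""
        retweet_times = defaultdict(list)
        for tweet_id, parent_id, ts in records:
            if parent_id is not None:
                retweet_times[parent_id].append(ts)
        out = {}
        for tweet_id, parent_id, ts in records:
            if parent_id is None:  # an original tweet, not a retweet
                times = [t for t in retweet_times[tweet_id] if t <= study_end]
                size = len(times)
                survival = (max(times) - ts) if times else 0
                out[tweet_id] = (size, survival)
        return out

    records = [
        ("a", None, 0), ("a1", "a", 10), ("a2", "a", 50), ("a3", "a", 400),
        ("b", None, 5), ("b1", "b", 6),
    ]
    measures = cascade_measures(records, study_end=300)
    # tweet "a": retweet at t=400 falls outside the window, so size=2, survival=50
    ```

    Size then becomes the dependent variable for the ZTNB regression (zero-truncated because every observed cascade has at least one retweet), and survival feeds the Cox model.
    
    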

    Is Covid-19 being used to spread Malware?

    With the rising number of people using social networks during the COVID-19 pandemic, cybercriminals took advantage of (i) the increased base of possible victims and (ii) the use of a trending topic, the COVID-19 pandemic, to lure victims, attract their attention, and post malicious content to infect as many people as possible. The Twitter platform automatically shortens any URL included within a 140-character message called a “tweet”, which makes it easier for attackers to include malicious URLs within tweets. Hence the need to adopt new approaches to resolve the problem, or at least to identify and better understand it in order to find a suitable solution. One proven effective approach is to adopt Machine Learning (ML) concepts and apply different algorithms to detect, identify, and even block the propagation of malware. This study’s main objectives were therefore to collect tweets from Twitter related to the topic of COVID-19, extract features from these tweets, and import them as independent variables for the Machine Learning models to be developed later, so they would classify imported tweets as malicious or not.
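    The feature-extraction step the study describes, turning raw tweet text into independent variables, can be sketched with simple lexical features. The chosen features are illustrative assumptions, not the study's exact feature set.

    ```python
    # Sketch: map a tweet's text to a feature dict a downstream ML model
    # could consume. Feature choices here are hypothetical.
    import re

    def tweet_features(text):
        urls = re.findall(r"https?://\S+", text)
        hashtags = re.findall(r"#\w+", text)
        mentions = re.findall(r"@\w+", text)
        return {
            "n_urls": len(urls),
            "n_hashtags": len(hashtags),
            "n_mentions": len(mentions),
            "length": len(text),
            # Twitter's auto-shortener rewrites links through t.co
            "has_shortened_url": any("t.co/" in u for u in urls),
        }

    row = tweet_features("Free #COVID19 cure! Click https://t.co/abc123 now @everyone")
    ```

    Each such row becomes one observation in the design matrix; the label (malicious or not) would come from scanning the landing page behind the URL.
    
    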

    Disrupting drive-by download networks on Twitter.

    This paper tests disruption strategies in Twitter networks containing malicious URLs used in drive-by download attacks. Cybercriminals use popular events that attract a large number of Twitter users to infect and propagate malware by using trending hashtags and creating misleading tweets to lure users to malicious webpages. Due to Twitter’s 280 character restriction and automatic shortening of URLs, it is particularly susceptible to the propagation of malware involved in drive-by download attacks. Considering the number of online users and the network formed by retweeting a tweet, a cybercriminal can infect millions of users in a short period. Policymakers and researchers have struggled to develop an efficient network disruption strategy to stop malware propagation effectively. We define an efficient strategy as one that considers network topology and dependency on network resilience, where resilience is the ability of the network to continue to disseminate information even when users are removed from it. One of the challenges faced while curbing malware propagation on online social platforms is understanding the cybercriminal network spreading the malware. Combining computational modelling and social network analysis, we identify the most effective strategy for disrupting networks of malicious URLs. Our results emphasise the importance of specific network disruption parameters such as network and emotion features, which have proven to be more effective in disrupting malicious networks compared to random strategies. In conclusion, disruption strategies force cybercriminal networks to become more vulnerable by strategically removing malicious users, which causes successful network disruption to become a long-term effort.
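    The contrast between a targeted strategy and a random one can be sketched on a toy retweet graph: remove a user, then measure resilience as the size of the largest remaining connected component. The hub-and-spoke graph below is an illustrative assumption, not the papers' data, and the targeted strategy shown (highest degree) is only one of the feature-based strategies the paper considers.

    ```python
    # Compare removing the hub of a retweet network against removing an
    # arbitrary peripheral user, measuring the largest surviving component.
    from collections import deque

    def largest_component(adj, removed):
        """BFS over the undirected graph `adj`, skipping removed nodes."""
        best, seen = 0, set(removed)
        for start in adj:
            if start in seen:
                continue
            queue, comp = deque([start]), 0
            seen.add(start)
            while queue:
                node = queue.popleft()
                comp += 1
                for nb in adj[node]:
                    if nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
            best = max(best, comp)
        return best

    # Hub "h" retweeted by ten spoke accounts.
    adj = {f"s{i}": {"h"} for i in range(10)}
    adj["h"] = {f"s{i}" for i in range(10)}

    hub_removed = largest_component(adj, removed={"h"})    # targeted strategy
    spoke_removed = largest_component(adj, removed={"s3"}) # arbitrary removal
    ```

    Removing the hub shatters the network into isolated spokes, while removing a spoke leaves the cascade essentially intact, which is why topology-aware strategies outperform random ones.
    
    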

    Why is Machine Learning Security so hard?

    The increase of available data and computing power has fueled a wide application of machine learning (ML). At the same time, security concerns are raised: ML models were shown to be easily fooled by slight perturbations on their inputs. Furthermore, by querying a model and analyzing output and input pairs, an attacker can infer the training data or replicate the model, thereby harming the owner’s intellectual property. Also, altering the training data can lure the model into producing specific or generally wrong outputs at test time. So far, none of the attacks studied in the field has been satisfactorily defended. In this work, we shed light on these difficulties. We first consider classifier evasion or adversarial examples. The computation of such examples is an inherent problem, as opposed to a bug that can be fixed. We also show that adversarial examples often transfer from one model to another, different model. Afterwards, we point out that the detection of backdoors (a training-time attack) is hindered as natural backdoor-like patterns occur even in benign neural networks. The question whether a pattern is benign or malicious then turns into a question of intention, which is hard to tackle. A different kind of complexity is added with the large libraries nowadays in use to implement machine learning. We introduce an attack that alters the library, thereby decreasing the accuracy a user can achieve. In case the user is aware of the attack, however, it is straightforward to defeat. This is not the case for most classical attacks described above. Additional difficulty is added if several attacks are studied at once: we show that even if the model is configured for one attack to be less effective, another attack might perform even better. We conclude by pointing out the necessity of understanding the ML model under attack. On the one hand, as we have seen throughout the examples given here, understanding precedes defenses and attacks. 
On the other hand, an attack, even a failed one, often yields new insights and knowledge about the algorithm studied. This work was supported by the German Federal Ministry of Education and Research (BMBF) through funding for the Center for IT-Security, Privacy and Accountability (CISPA) (FKZ: 16KIS0753).
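    The evasion attack the thesis opens with is easiest to see on a linear scorer: for a score w·x + b, the gradient with respect to the input is just w, so nudging each coordinate by eps in the direction of sign(w) is the fast-gradient-sign perturbation. The weights and input below are illustrative, not taken from any trained model.

    ```python
    # Minimal adversarial-example sketch on a linear classifier:
    # a small, coordinate-wise perturbation flips the decision.

    def score(w, b, x):
        """Linear decision score; positive -> class 1, negative -> class 0."""
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    def fgsm(w, x, eps):
        """Shift each coordinate by eps in the direction that raises the
        score; for a linear model, sign(w_i) is exactly that direction."""
        return [xi + eps * (1 if wi > 0 else -1 if wi < 0 else 0)
                for wi, xi in zip(w, x)]

    w, b = [0.5, -1.0, 2.0], -0.1
    x = [1.0, 1.0, 0.2]          # score = 0.5 - 1.0 + 0.4 - 0.1 = -0.2 -> class 0
    x_adv = fgsm(w, x, eps=0.3)  # each coordinate moves by at most 0.3
    ```

    The computation is inherent to the model's geometry rather than a fixable bug, which is the thesis's point; the same perturbation also tends to transfer to other models trained on similar data.
    
    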