86 research outputs found

    Early detection of spam-related activity

    Get PDF
    Spam, the distribution of unsolicited bulk email, is a big security threat on the Internet. Recent studies show approximately 70-90% of the worldwide email traffic—about 70 billion messages a day—is spam. Spam consumes resources on the network and at mail servers, and it is also used to launch other attacks on users, such as distributing malware or phishing. Spammers have increased their virulence and resilience by sending spam from large collections of compromised machines (“botnets”). Spammers also make heavy use of URLs and domains to direct victims to point-of-sale Web sites, and miscreants register large number of domains to evade blacklisting efforts. To mitigate the threat of spam, users and network administrators need proactive techniques to distinguish spammers from legitimate senders and to take down online spam-advertised sites. In this dissertation, we focus on characterizing spam-related activities and developing systems to detect them early. Our work builds on the observation that spammers need to acquire attack agility to be profitable, which presents differences in how spammers and legitimate users interact with Internet services and exposes detectable during early period of attack. We examine several important components across the spam life cycle, including spam dissemination that aims to reach users' inboxes, the hosting process during which spammers set DNS servers and Web servers, and the naming process to acquire domain names via registration services. We first develop a new spam-detection system based on network-level features of spamming bots. These lightweight features allow the system to scale better and to be more robust. Next, we analyze DNS resource records and lookups from top-level domain servers during the initial stage after domain registrations, which provides a global view across the Internet to characterize spam hosting infrastructure. We further examine the domain registration process and present the unique registration behavior of spammers. Finally, we build an early-warning system to identify spammer domains at time-of-registration rather than later at time-of-use. We have demonstrated that our detection systems are effective by using real-world datasets. Our work has also had practical impact. Some of the network-level features that we identified have since been incorporated into spam filtering products at Yahoo! and McAfee, and our work on detecting spammer domains at time-of-registration has directly influenced new projects at Verisign to investigate domain registrations.Ph.D

    Using Botnet Technologies to Counteract Network Traffic Analysis

    Get PDF
    Botnets have been problematic for over a decade. They are used to launch malicious activities including DDoS (Distributed-Denial-of-Service), spamming, identity theft, unauthorized bitcoin mining and malware distribution. A recent nation-wide DDoS attacks caused by the Mirai botnet on 10/21/2016 involving 10s of millions of IP addresses took down Twitter, Spotify, Reddit, The New York Times, Pinterest, PayPal and other major websites. In response to take-down campaigns by security personnel, botmasters have developed technologies to evade detection. The most widely used evasion technique is DNS fast-flux, where the botmaster frequently changes the mapping between domain names and IP addresses of the C&C server so that it will be too late or too costly to trace the C&C server locations. Domain names generated with Domain Generation Algorithms (DGAs) are used as the \u27rendezvous\u27 points between botmasters and bots. This work focuses on how to apply botnet technologies (fast-flux and DGA) to counteract network traffic analysis, therefore protecting user privacy. A better understanding of botnet technologies also helps us be pro-active in defending against botnets. First, we proposed two new DGAs using hidden Markov models (HMMs) and Probabilistic Context-Free Grammars (PCFGs) which can evade current detection methods and systems. Also, we developed two HMM-based DGA detection methods that can detect the botnet DGA-generated domain names with/without training sets. This helps security personnel understand the botnet phenomenon and develop pro-active tools to detect botnets. Second, we developed a distributed proxy system using fast-flux to evade national censorship and surveillance. The goal is to help journalists, human right advocates and NGOs in West Africa to have a secure and free Internet. Then we developed a covert data transport protocol to transform arbitrary message into real DNS traffic. We encode the message into benign-looking domain names generated by an HMM, which represents the statistical features of legitimate domain names. This can be used to evade Deep Packet Inspection (DPI) and protect user privacy in a two-way communication. Both applications serve as examples of applying botnet technologies to legitimate use. Finally, we proposed a new protocol obfuscation technique by transforming arbitrary network protocol into another (Network Time Protocol and a video game protocol of Minecraft as examples) in terms of packet syntax and side-channel features (inter-packet delay and packet size). This research uses botnet technologies to help normal users have secure and private communications over the Internet. From our botnet research, we conclude that network traffic is a malleable and artificial construct. Although existing patterns are easy to detect and characterize, they are also subject to modification and mimicry. This means that we can construct transducers to make any communication pattern look like any other communication pattern. This is neither bad nor good for security. It is a fact that we need to accept and use as best we can

    Damage Detection and Mitigation in Open Collaboration Applications

    Get PDF
    Collaborative functionality is changing the way information is amassed, refined, and disseminated in online environments. A subclass of these systems characterized by open collaboration uniquely allow participants to *modify* content with low barriers-to-entry. A prominent example and our case study, English Wikipedia, exemplifies the vulnerabilities: 7%+ of its edits are blatantly unconstructive. Our measurement studies show this damage manifests in novel socio-technical forms, limiting the effectiveness of computational detection strategies from related domains. In turn this has made much mitigation the responsibility of a poorly organized and ill-routed human workforce. We aim to improve all facets of this incident response workflow. Complementing language based solutions we first develop content agnostic predictors of damage. We implicitly glean reputations for system entities and overcome sparse behavioral histories with a spatial reputation model combining evidence from multiple granularity. We also identify simple yet indicative metadata features that capture participatory dynamics and content maturation. When brought to bear over damage corpora our contributions: (1) advance benchmarks over a broad set of security issues ( vandalism ), (2) perform well in the first anti-spam specific approach, and (3) demonstrate their portability over diverse open collaboration use cases. Probabilities generated by our classifiers can also intelligently route human assets using prioritization schemes optimized for capture rate or impact minimization. Organizational primitives are introduced that improve workforce efficiency. The whole of these strategies are then implemented into a tool ( STiki ) that has been used to revert 350,000+ damaging instances from Wikipedia. These uses are analyzed to learn about human aspects of the edit review process, properties including scalability, motivation, and latency. Finally, we conclude by measuring practical impacts of work, discussing how to better integrate our solutions, and revealing outstanding vulnerabilities that speak to research challenges for open collaboration security

    On Detection of Current and Next-Generation Botnets.

    Full text link
    Botnets are one of the most serious security threats to the Internet and its end users. A botnet consists of compromised computers that are remotely coordinated by a botmaster under a Command and Control (C&C) infrastructure. Driven by financial incentives, botmasters leverage botnets to conduct various cybercrimes such as spamming, phishing, identity theft and Distributed-Denial-of-Service (DDoS) attacks. There are three main challenges facing botnet detection. First, code obfuscation is widely employed by current botnets, so signature-based detection is insufficient. Second, the C&C infrastructure of botnets has evolved rapidly. Any detection solution targeting one botnet instance can hardly keep up with this change. Third, the proliferation of powerful smartphones presents a new platform for future botnets. Defense techniques designed for existing botnets may be outsmarted when botnets invade smartphones. Recognizing these challenges, this dissertation proposes behavior-based botnet detection solutions at three different levels---the end host, the edge network and the Internet infrastructure---from a small scale to a large scale, and investigates the next-generation botnet targeting smartphones. It (1) addresses the problem of botnet seeding by devising a per-process containment scheme for end-host systems; (2) proposes a hybrid botnet detection framework for edge networks utilizing combined host- and network-level information; (3) explores the structural properties of botnet topologies and measures network components' capabilities of large-scale botnet detection at the Internet infrastructure level; and (4) presents a proof-of-concept mobile botnet employing SMS messages as the C&C and P2P as the topology to facilitate future research on countermeasures against next-generation botnets. The dissertation makes three primary contributions. First, the detection solutions proposed utilize intrinsic and fundamental behavior of botnets and are immune to malware obfuscation and traffic encryption. Second, the solutions are general enough to identify different types of botnets, not a specific botnet instance. They can also be extended to counter next-generation botnet threats. Third, the detection solutions function at multiple levels to meet various detection needs. They each take a different perspective but are highly complementary to each other, forming an integrated botnet detection framework.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/91382/1/gracez_1.pd

    Understanding Bots on Social Media - An Application in Disaster Response

    Get PDF
    abstract: Social media has become a primary platform for real-time information sharing among users. News on social media spreads faster than traditional outlets and millions of users turn to this platform to receive the latest updates on major events especially disasters. Social media bridges the gap between the people who are affected by disasters, volunteers who offer contributions, and first responders. On the other hand, social media is a fertile ground for malicious users who purposefully disturb the relief processes facilitated on social media. These malicious users take advantage of social bots to overrun social media posts with fake images, rumors, and false information. This process causes distress and prevents actionable information from reaching the affected people. Social bots are automated accounts that are controlled by a malicious user and these bots have become prevalent on social media in recent years. In spite of existing efforts towards understanding and removing bots on social media, there are at least two drawbacks associated with the current bot detection algorithms: general-purpose bot detection methods are designed to be conservative and not label a user as a bot unless the algorithm is highly confident and they overlook the effect of users who are manipulated by bots and (unintentionally) spread their content. This study is trifold. First, I design a Machine Learning model that uses content and context of social media posts to detect actionable ones among them; it specifically focuses on tweets in which people ask for help after major disasters. Second, I focus on bots who can be a facilitator of malicious content spreading during disasters. I propose two methods for detecting bots on social media with a focus on the recall of the detection. Third, I study the characteristics of users who spread the content of malicious actors. These features have the potential to improve methods that detect malicious content such as fake news.Dissertation/ThesisDoctoral Dissertation Computer Science 201

    Evaluation of Email Spam Detection Techniques

    Get PDF
    Email has become a vital form of communication among individuals and organizations in today’s world. However, simultaneously it became a threat to many users in the form of spam emails which are also referred as junk/unsolicited emails. Most of the spam emails received by the users are in the form of commercial advertising, which usually carry computer viruses without any notifications. Today, 95% of the email messages across the world are believed to be spam, therefore it is essential to develop spam detection techniques. There are different techniques to detect and filter the spam emails, but off recently all the developed techniques are being implemented successfully to minimize the threats. This paper describes how the current spam email detection approaches are determining and evaluating the problems. There are different types of techniques developed based on Reputation, Origin, Words, Multimedia, Textual, Community, Rules, Hybrid, Machine learning, Fingerprint, Social networks, Protocols, Traffic analysis, OCR techniques, Low-level features, and many other techniques. All these filtering techniques are developed to detect and evaluate spam emails. Along with classification of the email messages into spam or ham, this paper also demonstrates the effectiveness and accuracy of the spam detection techniques

    INSTANT MESSAGING SPAM DETECTION IN LONG TERM EVOLUTION NETWORKS

    Get PDF
    The lack of efficient spam detection modules for packet data communication is resulting to increased threat exposure for the telecommunication network users and the service providers. In this thesis, we propose a novel approach to classify spam at the server side by intercepting packet-data communication among instant messaging applications. Spam detection is performed using machine learning techniques on packet headers and contents (if unencrypted) in two different phases: offline training and online classification. The contribution of this study is threefold. First, it identifies the scope of deploying a spam detection module in a state-of-the-art telecommunication architecture. Secondly, it compares the usefulness of various existing machine learning algorithms in order to intercept and classify data packets in near real-time communication of the instant messengers. Finally, it evaluates the accuracy and classification time of spam detection using our approach in a simulated environment of continuous packet data communication. Our research results are mainly generated by executing instances of a peer-to-peer instant messaging application prototype within a simulated Long Term Evolution (LTE) telecommunication network environment. This prototype is modeled and executed using OPNET network modeling and simulation tools. The research produces considerable knowledge on addressing unsolicited packet monitoring in instant messaging and similar applications

    Graph Mining for Cybersecurity: A Survey

    Full text link
    The explosive growth of cyber attacks nowadays, such as malware, spam, and intrusions, caused severe consequences on society. Securing cyberspace has become an utmost concern for organizations and governments. Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities. In recent years, with the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance. It is imperative to summarize existing graph-based cybersecurity solutions to provide a guide for future studies. Therefore, as a key contribution of this paper, we provide a comprehensive review of graph mining for cybersecurity, including an overview of cybersecurity tasks, the typical graph mining techniques, and the general process of applying them to cybersecurity, as well as various solutions for different cybersecurity tasks. For each task, we probe into relevant methods and highlight the graph types, graph approaches, and task levels in their modeling. Furthermore, we collect open datasets and toolkits for graph-based cybersecurity. Finally, we outlook the potential directions of this field for future research
    • …
    corecore