8 research outputs found

    Context-Aware Network Security.

    Full text link
    The rapid growth in malicious Internet activity, due to the rise of threats like automated worms, viruses, and botnets, has driven the development of tools designed to protect host and network resources. One approach that has gained significant popularity is the use of network based security systems. These systems are deployed on the network to detect, characterize and mitigate both new and existing threats. Unfortunately, these systems are developed and deployed in production networks as generic systems and little thought has been paid to customization. Even when it is possible to customize these devices, the approaches for customization are largely manual or ad hoc. Our observation of the production networks suggest that these networks have significant diversity in end-host characteristics, threat landscape, and traffic behavior -- a collection of features that we call the security context of a network. The scale and diversity in security context of production networks make manual or ad hoc customization of security systems difficult. Our thesis is that automated adaptation to the security context can be used to significantly improve the performance and accuracy of network-based security systems. In order to evaluate our thesis, we explore a system from three broad categories of network-based security systems: known threat detection, new threat detection, and reputation-based mitigation. For known threat detection, we examine a signature-based intrusion detection system and show that the system performance improves significantly if it is aware of the signature set and the traffic characteristics of the network. Second, we explore a large collection of honeypots (or honeynet) that are used to detect new threats. We show that operating system and application configurations in the network impact honeynet accuracy and adapting to the surrounding network provides a significantly better view of the network threats. Last, we apply our context-aware approach to a reputation-based system for spam blacklist generation and show how traffic characteristics on the network can be used to significantly improve its accuracy. We conclude with the lessons learned from our experiences adapting to network security context and the future directions for adapting network-based security systems to the security context.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/64745/1/sushant_1.pd

    Design and Evaluation of a Real-Time URL Spam Filtering Service

    Full text link

    Learning from textual data streams for detecting email spam

    Get PDF
    This master thesis introduces a method for the detecting email spam through the translation problem in incremental learning of the time series. Common spam detection systems mainly use methods of supervised learning (naive Bayesian classifier, decision trees), while in the master’s thesis presents the classification by using the methods of data stream mining. For learning sets, we also choose the attributes that do not contain personal data and which are not required to obtain the consent of the sender or the recipient (attributes consist the envelope part of e-mail). With the help of algorithms for learning from data streams (VFDT, cVFDT) we used the electronic sequence of messages as text data stream. The results were compared with the traditional spam detection methods and they show that traditional spam detection methods have higher accuracy compared to algorithms for learning from data stream and therefore are not suitable for detecting email spam

    Combating Threats to the Quality of Information in Social Systems

    Get PDF
    Many large-scale social systems such as Web-based social networks, online social media sites and Web-scale crowdsourcing systems have been growing rapidly, enabling millions of human participants to generate, share and consume content on a massive scale. This reliance on users can lead to many positive effects, including large-scale growth in the size and content in the community, bottom-up discovery of “citizen-experts”, serendipitous discovery of new resources beyond the scope of the system designers, and new social-based information search and retrieval algorithms. But the relative openness and reliance on users coupled with the widespread interest and growth of these social systems carries risks and raises growing concerns over the quality of information in these systems. In this dissertation research, we focus on countering threats to the quality of information in self-managing social systems. Concretely, we identify three classes of threats to these systems: (i) content pollution by social spammers, (ii) coordinated campaigns for strategic manipulation, and (iii) threats to collective attention. To combat these threats, we propose three inter-related methods for detecting evidence of these threats, mitigating their impact, and improving the quality of information in social systems. We augment this three-fold defense with an exploration of their origins in “crowdturfing” – a sinister counterpart to the enormous positive opportunities of crowdsourcing. In particular, this dissertation research makes four unique contributions: • The first contribution of this dissertation research is a framework for detecting and filtering social spammers and content polluters in social systems. To detect and filter individual social spammers and content polluters, we propose and evaluate a novel social honeypot-based approach. • Second, we present a set of methods and algorithms for detecting coordinated campaigns in large-scale social systems. We propose and evaluate a content- driven framework for effectively linking free text posts with common “talking points” and extracting campaigns from large-scale social systems. • Third, we present a dual study of the robustness of social systems to collective attention threats through both a data-driven modeling approach and deploy- ment over a real system trace. We evaluate the effectiveness of countermeasures deployed based on the first moments of a bursting phenomenon in a real system. • Finally, we study the underlying ecosystem of crowdturfing for engaging in each of the three threat types. We present a framework for “pulling back the curtain” on crowdturfers to reveal their underlying ecosystem on both crowdsourcing sites and social media

    Securing large cellular networks via a data oriented approach: applications to SMS spam and voice fraud defenses

    Get PDF
    University of Minnesota Ph.D. dissertation. December 2013. Major: Computer Science. Advisor: Zhi-Li Zhang. 1 computer file (PDF); x, 103 pages.With widespread adoption and growing sophistication of mobile devices, fraudsters have turned their attention from landlines and wired networks to cellular networks. While security threats to wireless data channels and applications have attracted the most attention, attacks through mobile voice channels, such as Short Message Service (SMS) spam and voice-related fraud activities also represent a serious threat to mobile users. In particular, it has been reported that the number of spam messages in the US has risen 45% in 2011 to 4.5 billion messages, affecting more than 69% of mobile users globally. Meanwhile, we have seen increasing numbers of incidents where fraudsters deploy malicious apps, e.g., disguised as gaming apps to entice users to download; when invoked, these apps automatically - and without users' knowledge - dial certain (international) phone numbers which charge exorbitantly high fees. Fraudsters also frequently utilize social engineering (e.g., SMS or email spam, Facebook postings) to trick users into dialing these exorbitant fee-charging numbers. Unlike traditional attacks towards data channels, e.g., Email spam and malware, both SMS spam and voice fraud are not only annoying, but they also inflict financial loss to mobile users and cellular carriers as well as adverse impact on cellular network performance. Hence the objective of defense techniques is to restrict phone numbers initialized these activities quickly before they reach too many victims. However, due to the scalability issues and high false alarm rates, anomaly detection based approaches for securing wireless data channels, mobile devices, and applications/services cannot be readily applied here. In this thesis, we share our experience and approach in building operational defense systems against SMS spam and voice fraud in large-scale cellular networks. Our approach is data oriented, i.e., we collect real data from a large national cellular network and exert significant efforts in analyzing and making sense of the data, especially to understand the characteristics of fraudsters and the communication patterns between fraudsters and victims. On top of the data analysis results, we can identify the best predictive features that can alert us of emerging fraud activities. Usually, these features represent unwanted communication patterns which are derived from the original feature space. Using these features, we apply advanced machine learning techniques to train accurate detection models. To ensure the validity of the proposed approaches, we build and deploy the defense systems in operational cellular networks and carry out both extensive off-line evaluation and long-term online trial. To evaluate the system performance, we adopt both direct measurement using known fraudster blacklist provided by fraud agents and indirect measurement by monitoring the change of victim report rates. In both problems, the proposed approaches demonstrate promising results which outperform customer feedback based defenses that have been widely adopted by cellular carriers today.More specifically, using a year (June 2011 to May 2012) of user reported SMS spam messages together with SMS network records collected from a large US based cellular carrier, we carry out a comprehensive study of SMS spamming. Our analysis shows various characteristics of SMS spamming activities. and also reveals that spam numbers with similar content exhibit strong similarity in terms of their sending patterns, tenure, devices and geolocations. Using the insights we have learned from our analysis, we propose several novel spam defense solutions. For example, we devise a novel algorithm for detecting related spam numbers. The algorithm incorporates user spam reports and identifies additional (unreported) spam number candidates which exhibit similar sending patterns at the same network location of the reported spam number during the nearby time period. The algorithm yields a high accuracy of 99.4% on real network data. Moreover, 72% of these spam numbers are detected at least 10 hours before user reports.From a different angle, we present the design of Greystar, a defense solution against the growing SMS spam traffic in cellular networks. By exploiting the fact that most SMS spammers select targets randomly from the finite phone number space, Greystar monitors phone numbers from the gray phone space (which are associated with data only devices like data cards and modems and machine-to-machine communication devices like point-of-sale machines and electricity meters) to alert emerging spamming activities. Greystar employs a novel statistical model for detecting spam numbers based on their footprints on the gray phone space. Evaluation using five month SMS call detail records from a large US cellular carrier shows that Greystar can detect thousands of spam numbers each month with very few false alarms and 15% of the detected spam numbers have never been reported by spam recipients. Moreover, Greystar is much faster than victim spam reports. By deploying Greystar we can reduce 75% spam messages during peak hours. To defend against voice-related fraud activities, we develop a novel methodology for detecting voice-related fraud activities using only call records. More specifically, we advance the notion of voice call graphs to represent voice calls from domestic callers to foreign recipients and propose a Markov Clustering based method for isolating dominant fraud activities from these international calls. Using data collected over a two year period from one of the largest cellular networks in the US, we evaluate the efficacy of the proposed fraud detection algorithm and conduct systematic analysis of the identified fraud activities. Our work sheds light on the unique characteristics and trends of fraud activities in cellular networks, and provides guidance on improving and securing hardware/software architecture to prevent these fraud activities

    Eight Biennial Report : April 2005 – March 2007

    No full text
    corecore