12 research outputs found

    Utilizing Machine Learning Classifiers to Identify SSH Brute Force Attacks

    Get PDF
    SSH brute force attacks are a type of network attack in which an attacker tries to guess the username and password of a user on the Secure Shell protocol. This kind of attack is simple to perform, and the results from a successfully compromised system can lead to a number of destructive outcomes. Because of its simplicity and potential payout, large networks experience many instances of these attacks in their traffic, and current prevention methods rely heavily on per-machine logs that, in aggregate, take up a large amount of space. This paper explores the usage of machine learning algorithms in detecting and preventing these kinds of attacks as an alternative to the firewall techniques used today. We use three different classifiers - naĂŻve Bayes, K-nearest neighbors, and decision trees - on a publicly available dataset of labeled network flows to try and classify unknown network flows into benign and SSH brute force categories. Our results show that machine learning is very well suited for this task, with all of our classifiers having accuracy scores of over 85% in the classification of our test data

    Graph clustering and anomaly detection of access control log for forensic purposes

    Get PDF
    Attacks on operating system access control have become a significant and increasingly common problem. This type of security threat is recorded in a forensic artifact such as an authentication log. Forensic investigators will generally examine the log to analyze such incidents. An anomaly is highly correlated to an attacker's attempts to compromise the system. In this paper, we propose a novel method to automatically detect an anomaly in the access control log of an operating system. The logs will be first preprocessed and then clustered using an improved MajorClust algorithm to get a better cluster. This technique provides parameter-free clustering so that it automatically can produce an analysis report for the forensic investigators. The clustering results will be checked for anomalies based on a score that considers some factors such as the total members in a cluster, the frequency of the events in the log file, and the inter-arrival time of a specific activity. We also provide a graph-based visualization of logs to assist the investigators with easy analysis. Experimental results compiled on an open dataset of a Linux authentication log show that the proposed method achieved the accuracy of 83.14% in the authentication log dataset

    A Survey on Enterprise Network Security: Asset Behavioral Monitoring and Distributed Attack Detection

    Full text link
    Enterprise networks that host valuable assets and services are popular and frequent targets of distributed network attacks. In order to cope with the ever-increasing threats, industrial and research communities develop systems and methods to monitor the behaviors of their assets and protect them from critical attacks. In this paper, we systematically survey related research articles and industrial systems to highlight the current status of this arms race in enterprise network security. First, we discuss the taxonomy of distributed network attacks on enterprise assets, including distributed denial-of-service (DDoS) and reconnaissance attacks. Second, we review existing methods in monitoring and classifying network behavior of enterprise hosts to verify their benign activities and isolate potential anomalies. Third, state-of-the-art detection methods for distributed network attacks sourced from external attackers are elaborated, highlighting their merits and bottlenecks. Fourth, as programmable networks and machine learning (ML) techniques are increasingly becoming adopted by the community, their current applications in network security are discussed. Finally, we highlight several research gaps on enterprise network security to inspire future research.Comment: Journal paper submitted to Elseive

    The Dark Menace: Characterizing Network-based Attacks in the Cloud

    Get PDF
    ABSTRACT As the cloud computing market continues to grow, the cloud platform is becoming an attractive target for attackers to disrupt services and steal data, and to compromise resources to launch attacks. In this paper, using three months of NetFlow data in 2013 from a large cloud provider, we present the first large-scale characterization of inbound attacks towards the cloud and outbound attacks from the cloud. We investigate nine types of attacks ranging from network-level attacks such as DDoS to application-level attacks such as SQL injection and spam. Our analysis covers the complexity, intensity, duration, and distribution of these attacks, highlighting the key challenges in defending against attacks in the cloud. By characterizing the diversity of cloud attacks, we aim to motivate the research community towards developing future security solutions for cloud systems

    Detecting Stealthy, Distributed SSH Brute-Forcing

    No full text
    In this work we propose a general approach for detecting distributed malicious activity in which individual attack sources each operate in a stealthy, low-profile manner. We base our approach on observing statistically significant changes in a parameter that summarizes aggregate activity, bracketing a distributed attack in time, and then determining which sources present during that interval appear to have coordinated their activity. We apply this approach to the problem of detecting stealthy distributed SSH bruteforcing activity, showing that we can model the process of legitimate users failing to authenticate using a beta-binomial distribution, which enables us to tune a detector that trades off an expected level of false positives versus time-to-detection. Using the detector we study the prevalence of distributed bruteforcing, finding dozens of instances in an extensive 8-year dataset collected from a site with several thousand SSH users. Many of the attacks—some of which last months—would be quite difficult to detect individually. While a number of the attacks reflect indiscriminant global probing, we also find attacks that targeted only the local site, as well as occasional attacks that succeeded

    Detection and prevention of username enumeration attack on SSH protocol: machine learning approach

    Get PDF
    A Dissertation Submitted in Partial Fulfillment of the Requirement for the Degree of Master’s in Information System and Network Security of the Nelson Mandela African Institution of Science and TechnologyOver the last two decades (2000–2020), the Internet has rapidly evolved, resulting in symmetrical and asymmetrical Internet consumption patterns and billions of users worldwide. With the immense rise of the Internet, attacks and malicious behaviors pose a huge threat to our computing environment. Brute-force attack is among the most prominent and commonly used attacks, achieved out using password-attack tools, a wordlist dictionary, and a usernames list – obtained through a so – called an enumeration attack. In this study, we investigate username enumeration attack detection on SSH protocol by using machine-learning classifiers. We apply four asymmetrical classifiers on our generated dataset collected from a closed environment network to build machine-learning-based models for attack detection. The use of several machine-learners offers a wider investigation spectrum of the classifiers’ ability in attack detection. Additionally, we investigate how beneficial it is to include or exclude network ports information as features-set in the process of learning. We evaluated and compared the performances of machine-learning models for both cases. The models used are k-nearest neighbor (KNN), naïve Bayes (NB), random forest (RF) and decision tree (DT) with and without ports information. Our results show that machine-learning approaches to detect SSH username enumeration attacks were quite successful, with KNN having an accuracy of 99.93%, NB 95.70%, RF 99.92%, and DT 99.88%. Furthermore, the results improved when using ports information. The best selected model was then deployed into intrusion detection and prevention system (IDS/IPS) to automatically detect and prevent username enumeration attack. Study also recommends the use of Deep Learning in future studies

    Enhanced Prediction of Network Attacks Using Incomplete Data

    Get PDF
    For years, intrusion detection has been considered a key component of many organizations’ network defense capabilities. Although a number of approaches to intrusion detection have been tried, few have been capable of providing security personnel responsible for the protection of a network with sufficient information to make adjustments and respond to attacks in real-time. Because intrusion detection systems rarely have complete information, false negatives and false positives are extremely common, and thus valuable resources are wasted responding to irrelevant events. In order to provide better actionable information for security personnel, a mechanism for quantifying the confidence level in predictions is needed. This work presents an approach which seeks to combine a primary prediction model with a novel secondary confidence level model which provides a measurement of the confidence in a given attack prediction being made. The ability to accurately identify an attack and quantify the confidence level in the prediction could serve as the basis for a new generation of intrusion detection devices, devices that provide earlier and better alerts for administrators and allow more proactive response to events as they are occurring

    Harnessing the Power of Multi-Source Data: an Exploration of Diversity and Similarity.

    Full text link
    This dissertation studies a sequence of problems concerning the collection and utilization of data from disparate sources, e.g., that arising in a crowd-sourcing system. It aims at developing learning methods to enhance the quality of decision-making and learning task performance by exploiting a multitude of diversity, similarity and interdependency inherent in a crowd-sourcing system and among disparate data sources. We start our study with a family of problems on sequential decision-making combined with data collection in a crowd-sourcing system, where the goal is to improve the quality of data input or computational output, while reducing the cost in using such a system. In this context, the learning methods we develop are closed-loop and online, i.e., decisions made are functions of past data observations, present actions determine future observations, and the learning occurs as data inputs arrive. The similarity and disparity among different data sources help us in some cases to speed up the learning process (e.g., in a recommender system), and in some other cases to perform quality control over data input for which ground-truth may be non-existent or cannot be obtained directly (e.g., in a crowd-sourcing market using Amazon Mechanical Turks (AMTs)). We then apply our algorithms to the processing of a large set of network malicious activity data collected from diverse sources, with a goal of uncovering interconnectedness/similarity between different network entities' malicious behaviors. Specifically, we apply our online prediction algorithm presented and analyzed in earlier parts of the dissertation to this data and show its effectiveness in predicting next-day maliciousness. Furthermore, we show that data-specific properties of this set of data allow us to map networks' behavioral similarity to similarity in their topological features. This in turn enables prediction even in the absence of measurement data.PhDElectrical Engineering: SystemsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120717/1/youngliu_1.pd

    A Brave New World: Studies on the Deployment and Security of the Emerging IPv6 Internet.

    Full text link
    Recent IPv4 address exhaustion events are ushering in a new era of rapid transition to the next generation Internet protocol---IPv6. Via Internet-scale experiments and data analysis, this dissertation characterizes the adoption and security of the emerging IPv6 network. The work includes three studies, each the largest of its kind, examining various facets of the new network protocol's deployment, routing maturity, and security. The first study provides an analysis of ten years of IPv6 deployment data, including quantifying twelve metrics across ten global-scale datasets, and affording a holistic understanding of the state and recent progress of the IPv6 transition. Based on cross-dataset analysis of relative global adoption rates and across features of the protocol, we find evidence of a marked shift in the pace and nature of adoption in recent years and observe that higher-level metrics of adoption lag lower-level metrics. Next, a network telescope study covering the IPv6 address space of the majority of allocated networks provides insight into the early state of IPv6 routing. Our analyses suggest that routing of average IPv6 prefixes is less stable than that of IPv4. This instability is responsible for the majority of the captured misdirected IPv6 traffic. Observed dark (unallocated destination) IPv6 traffic shows substantial differences from the unwanted traffic seen in IPv4---in both character and scale. Finally, a third study examines the state of IPv6 network security policy. We tested a sample of 25 thousand routers and 520 thousand servers against sets of TCP and UDP ports commonly targeted by attackers. We found systemic discrepancies between intended security policy---as codified in IPv4---and deployed IPv6 policy. Such lapses in ensuring that the IPv6 network is properly managed and secured are leaving thousands of important devices more vulnerable to attack than before IPv6 was enabled. Taken together, findings from our three studies suggest that IPv6 has reached a level and pace of adoption, and shows patterns of use, that indicates serious production employment of the protocol on a broad scale. However, weaker IPv6 routing and security are evident, and these are leaving early dual-stack networks less robust than the IPv4 networks they augment.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120689/1/jczyz_1.pd
    corecore