94 research outputs found

    Using Botnet Technologies to Counteract Network Traffic Analysis

    Get PDF
    Botnets have been problematic for over a decade. They are used to launch malicious activities including DDoS (Distributed-Denial-of-Service), spamming, identity theft, unauthorized bitcoin mining and malware distribution. A recent nation-wide DDoS attacks caused by the Mirai botnet on 10/21/2016 involving 10s of millions of IP addresses took down Twitter, Spotify, Reddit, The New York Times, Pinterest, PayPal and other major websites. In response to take-down campaigns by security personnel, botmasters have developed technologies to evade detection. The most widely used evasion technique is DNS fast-flux, where the botmaster frequently changes the mapping between domain names and IP addresses of the C&C server so that it will be too late or too costly to trace the C&C server locations. Domain names generated with Domain Generation Algorithms (DGAs) are used as the \u27rendezvous\u27 points between botmasters and bots. This work focuses on how to apply botnet technologies (fast-flux and DGA) to counteract network traffic analysis, therefore protecting user privacy. A better understanding of botnet technologies also helps us be pro-active in defending against botnets. First, we proposed two new DGAs using hidden Markov models (HMMs) and Probabilistic Context-Free Grammars (PCFGs) which can evade current detection methods and systems. Also, we developed two HMM-based DGA detection methods that can detect the botnet DGA-generated domain names with/without training sets. This helps security personnel understand the botnet phenomenon and develop pro-active tools to detect botnets. Second, we developed a distributed proxy system using fast-flux to evade national censorship and surveillance. The goal is to help journalists, human right advocates and NGOs in West Africa to have a secure and free Internet. Then we developed a covert data transport protocol to transform arbitrary message into real DNS traffic. We encode the message into benign-looking domain names generated by an HMM, which represents the statistical features of legitimate domain names. This can be used to evade Deep Packet Inspection (DPI) and protect user privacy in a two-way communication. Both applications serve as examples of applying botnet technologies to legitimate use. Finally, we proposed a new protocol obfuscation technique by transforming arbitrary network protocol into another (Network Time Protocol and a video game protocol of Minecraft as examples) in terms of packet syntax and side-channel features (inter-packet delay and packet size). This research uses botnet technologies to help normal users have secure and private communications over the Internet. From our botnet research, we conclude that network traffic is a malleable and artificial construct. Although existing patterns are easy to detect and characterize, they are also subject to modification and mimicry. This means that we can construct transducers to make any communication pattern look like any other communication pattern. This is neither bad nor good for security. It is a fact that we need to accept and use as best we can

    Developing reliable anomaly detection system for critical hosts: a proactive defense paradigm

    Full text link
    Current host-based anomaly detection systems have limited accuracy and incur high processing costs. This is due to the need for processing massive audit data of the critical host(s) while detecting complex zero-day attacks which can leave minor, stealthy and dispersed artefacts. In this research study, this observation is validated using existing datasets and state-of-the-art algorithms related to the construction of the features of a host's audit data, such as the popular semantic-based extraction and decision engines, including Support Vector Machines, Extreme Learning Machines and Hidden Markov Models. There is a challenging trade-off between achieving accuracy with a minimum processing cost and processing massive amounts of audit data that can include complex attacks. Also, there is a lack of a realistic experimental dataset that reflects the normal and abnormal activities of current real-world computers. This thesis investigates the development of new methodologies for host-based anomaly detection systems with the specific aims of improving accuracy at a minimum processing cost while considering challenges such as complex attacks which, in some cases, can only be visible via a quantified computing resource, for example, the execution times of programs, the processing of massive amounts of audit data, the unavailability of a realistic experimental dataset and the automatic minimization of the false positive rate while dealing with the dynamics of normal activities. This study provides three original and significant contributions to this field of research which represent a marked advance in its body of knowledge. The first major contribution is the generation and release of a realistic intrusion detection systems dataset as well as the development of a metric based on fuzzy qualitative modeling for embedding the possible quality of realism in a dataset's design process and assessing this quality in existing or future datasets. The second key contribution is constructing and evaluating the hidden host features to identify the trivial differences between the normal and abnormal artefacts of hosts' activities at a minimum processing cost. Linux-centric features include the frequencies and ranges, frequency-domain representations and Gaussian interpretations of system call identifiers with execution times while, for Windows, a count of the distinct core Dynamic Linked Library calls is identified as a hidden host feature. The final key contribution is the development of two new anomaly-based statistical decision engines for capitalizing on the potential of some of the suggested hidden features and reliably detecting anomalies. The first engine, which has a forensic module, is based on stochastic theories including Hierarchical hidden Markov models and the second is modeled using Gaussian Mixture Modeling and Correntropy. The results demonstrate that the proposed host features and engines are competent for meeting the identified challenges

    Service Quality Assessment for Cloud-based Distributed Data Services

    Full text link
    The issue of less-than-100% reliability and trust-worthiness of third-party controlled cloud components (e.g., IaaS and SaaS components from different vendors) may lead to laxity in the QoS guarantees offered by a service-support system S to various applications. An example of S is a replicated data service to handle customer queries with fault-tolerance and performance goals. QoS laxity (i.e., SLA violations) may be inadvertent: say, due to the inability of system designers to model the impact of sub-system behaviors onto a deliverable QoS. Sometimes, QoS laxity may even be intentional: say, to reap revenue-oriented benefits by cheating on resource allocations and/or excessive statistical-sharing of system resources (e.g., VM cycles, number of servers). Our goal is to assess how well the internal mechanisms of S are geared to offer a required level of service to the applications. We use computational models of S to determine the optimal feasible resource schedules and verify how close is the actual system behavior to a model-computed \u27gold-standard\u27. Our QoS assessment methods allow comparing different service vendors (possibly with different business policies) in terms of canonical properties: such as elasticity, linearity, isolation, and fairness (analogical to a comparative rating of restaurants). Case studies of cloud-based distributed applications are described to illustrate our QoS assessment methods. Specific systems studied in the thesis are: i) replicated data services where the servers may be hosted on multiple data-centers for fault-tolerance and performance reasons; and ii) content delivery networks to geographically distributed clients where the content data caches may reside on different data-centers. The methods studied in the thesis are useful in various contexts of QoS management and self-configurations in large-scale cloud-based distributed systems that are inherently complex due to size, diversity, and environment dynamicity

    INTRUSION PREDICTION SYSTEM FOR CLOUD COMPUTING AND NETWORK BASED SYSTEMS

    Get PDF
    Cloud computing offers cost effective computational and storage services with on-demand scalable capacities according to the customers’ needs. These properties encourage organisations and individuals to migrate from classical computing to cloud computing from different disciplines. Although cloud computing is a trendy technology that opens the horizons for many businesses, it is a new paradigm that exploits already existing computing technologies in new framework rather than being a novel technology. This means that cloud computing inherited classical computing problems that are still challenging. Cloud computing security is considered one of the major problems, which require strong security systems to protect the system, and the valuable data stored and processed in it. Intrusion detection systems are one of the important security components and defence layer that detect cyber-attacks and malicious activities in cloud and non-cloud environments. However, there are some limitations such as attacks were detected at the time that the damage of the attack was already done. In recent years, cyber-attacks have increased rapidly in volume and diversity. In 2013, for example, over 552 million customers’ identities and crucial information were revealed through data breaches worldwide [3]. These growing threats are further demonstrated in the 50,000 daily attacks on the London Stock Exchange [4]. It has been predicted that the economic impact of cyber-attacks will cost the global economy $3 trillion on aggregate by 2020 [5]. This thesis focused on proposing an Intrusion Prediction System that is capable of sensing an attack before it happens in cloud or non-cloud environments. The proposed solution is based on assessing the host system vulnerabilities and monitoring the network traffic for attacks preparations. It has three main modules. The monitoring module observes the network for any intrusion preparations. This thesis proposes a new dynamic-selective statistical algorithm for detecting scan activities, which is part of reconnaissance that represents an essential step in network attack preparation. The proposed method performs a statistical selective analysis for network traffic searching for an attack or intrusion indications. This is achieved by exploring and applying different statistical and probabilistic methods that deal with scan detection. The second module of the prediction system is vulnerabilities assessment that evaluates the weaknesses and faults of the system and measures the probability of the system to fall victim to cyber-attack. Finally, the third module is the prediction module that combines the output of the two modules and performs risk assessments of the system security from intrusions prediction. The results of the conducted experiments showed that the suggested system outperforms the analogous methods in regards to performance of network scan detection, which means accordingly a significant improvement to the security of the targeted system. The scanning detection algorithm has achieved high detection accuracy with 0% false negative and 50% false positive. In term of performance, the detection algorithm consumed only 23% of the data needed for analysis compared to the best performed rival detection method

    You Shall Not Pass! Measuring, Predicting, and Detecting Malware Behavior

    Get PDF
    Researchers have been fighting malicious behavior on the Internet for several decades. The arms race is far from being close to an end, but this PhD work is intended to be another step towards the goal of making the Internet a safer place. My PhD has focused on measuring, predicting, and detecting malicious behavior on the Internet; we focused our efforts towards three different paths: establishing causality relations into malicious actions, predicting the actions taken by an attacker, and detecting malicious software. This work tried to understand the causes of malicious behavior in different scenarios (sandboxing, web browsing), by applying a novel statistical framework and statistical tests to determine what triggers malware. We also used deep learning algorithms to predict what actions an attacker would perform, with the goal of anticipating and countering the attacker’s moves. Moreover, we worked on malware detection for Android, by modeling sequences of API with Markov Chains and applying machine learning algorithms to classify benign and malicious apps. The methodology, design, and results of our research are relevant state of the art in the field; we will go through the different contributions that we worked on during my PhD to explain the design choices, the statistical methods and the takeaways characterizing them. We will show how these systems have an impact on current tools development and future research trends

    Analysis and design of security mechanisms in the context of Advanced Persistent Threats against critical infrastructures

    Get PDF
    Industry 4.0 can be defined as the digitization of all components within the industry, by combining productive processes with leading information and communication technologies. Whereas this integration has several benefits, it has also facilitated the emergence of several attack vectors. These can be leveraged to perpetrate sophisticated attacks such as an Advanced Persistent Threat (APT), that ultimately disrupts and damages critical infrastructural operations with a severe impact. This doctoral thesis aims to study and design security mechanisms capable of detecting and tracing APTs to ensure the continuity of the production line. Although the basic tools to detect individual attack vectors of an APT have already been developed, it is important to integrate holistic defense solutions in existing critical infrastructures that are capable of addressing all potential threats. Additionally, it is necessary to prospectively analyze the requirements that these systems have to satisfy after the integration of novel services in the upcoming years. To fulfill these goals, we define a framework for the detection and traceability of APTs in Industry 4.0, which is aimed to fill the gap between classic security mechanisms and APTs. The premise is to retrieve data about the production chain at all levels to correlate events in a distributed way, enabling the traceability of an APT throughout its entire life cycle. Ultimately, these mechanisms make it possible to holistically detect and anticipate attacks in a timely and autonomous way, to deter the propagation and minimize their impact. As a means to validate this framework, we propose some correlation algorithms that implement it (such as the Opinion Dynamics solution) and carry out different experiments that compare the accuracy of response techniques that take advantage of these traceability features. Similarly, we conduct a study on the feasibility of these detection systems in various Industry 4.0 scenarios

    Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods

    Get PDF
    Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are: being able to detect unknown attacks, and reducing the false positive ratio. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques. The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns. The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payments fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity significantly differs from one domain to the other. The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method, random forest. Obtained results showed that this technique can outperform other similar techniques. The fourth technique is based on an MLP neural network, and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy

    A Quantitative Research Study on Probability Risk Assessments in Critical Infrastructure and Homeland Security

    Get PDF
    This dissertation encompassed quantitative research on probabilistic risk assessment (PRA) elements in homeland security and the impact on critical infrastructure and key resources. There are 16 crucial infrastructure sectors in homeland security that represent assets, system networks, virtual and physical environments, roads and bridges, transportation, and air travel. The design included the Bayes theorem, a process used in PRAs when determining potential or probable events, causes, outcomes, and risks. The goal is to mitigate the effects of domestic terrorism and natural and man-made disasters, respond to events related to critical infrastructure that can impact the United States, and help protect and secure natural gas pipelines and electrical grid systems. This study provides data from current risk assessment trends in PRAs that can be applied and designed in elements of homeland security and the criminal justice system to help protect critical infrastructures. The dissertation will highlight the aspects of the U.S. Department of Homeland Security National Infrastructure Protection Plan (NIPP). In addition, this framework was employed to examine the criminal justice triangle, explore crime problems and emergency preparedness solutions to protect critical infrastructures, and analyze data relevant to risk assessment procedures for each critical infrastructure identified. Finally, the study addressed the drivers and gaps in research related to protecting and securing natural gas pipelines and electrical grid systems

    Blockchain-based Digital Twins:Research Trends, Issues, and Future Challenges

    Get PDF
    Industrial processes rely on sensory data for decision-making processes, risk assessment, and performance evaluation. Extracting actionable insights from the collected data calls for an infrastructure that can ensure the dissemination of trustworthy data. For the physical data to be trustworthy, it needs to be cross validated through multiple sensor sources with overlapping fields of view. Cross-validated data can then be stored on the blockchain, to maintain its integrity and trustworthiness. Once trustworthy data is recorded on the blockchain, product lifecycle events can be fed into data-driven systems for process monitoring, diagnostics, and optimized control. In this regard, digital twins (DTs) can be leveraged to draw intelligent conclusions from data by identifying the faults and recommending precautionary measures ahead of critical events. Empowering DTs with blockchain in industrial use cases targets key challenges of disparate data repositories, untrustworthy data dissemination, and the need for predictive maintenance. In this survey, while highlighting the key benefits of using blockchain-based DTs, we present a comprehensive review of the state-of-the-art research results for blockchain-based DTs. Based on the current research trends, we discuss a trustworthy blockchain-based DTs framework. We also highlight the role of artificial intelligence in blockchain-based DTs. Furthermore, we discuss the current and future research and deployment challenges of blockchain-supported DTs that require further investigation.</p
    corecore