Using Botnet Technologies to Counteract Network Traffic Analysis
Botnets have been problematic for over a decade. They are used to launch malicious activities including DDoS (Distributed Denial of Service) attacks, spamming, identity theft, unauthorized bitcoin mining and malware distribution. A recent nation-wide DDoS attack caused by the Mirai botnet on October 21, 2016, involving tens of millions of IP addresses, took down Twitter, Spotify, Reddit, The New York Times, Pinterest, PayPal and other major websites. In response to take-down campaigns by security personnel, botmasters have developed technologies to evade detection. The most widely used evasion technique is DNS fast-flux, where the botmaster frequently changes the mapping between domain names and the IP addresses of the C&C server so that tracing the C&C server locations becomes too slow or too costly. Domain names generated with Domain Generation Algorithms (DGAs) are used as the 'rendezvous' points between botmasters and bots. This work focuses on how to apply botnet technologies (fast-flux and DGAs) to counteract network traffic analysis, thereby protecting user privacy. A better understanding of botnet technologies also helps us be proactive in defending against botnets. First, we proposed two new DGAs, using hidden Markov models (HMMs) and Probabilistic Context-Free Grammars (PCFGs), which can evade current detection methods and systems. We also developed two HMM-based DGA detection methods that can detect botnet DGA-generated domain names with or without training sets. This helps security personnel understand the botnet phenomenon and develop proactive tools to detect botnets. Second, we developed a distributed proxy system using fast-flux to evade national censorship and surveillance. The goal is to help journalists, human rights advocates and NGOs in West Africa have secure and free Internet access. We then developed a covert data transport protocol that transforms arbitrary messages into real DNS traffic.
We encode the message into benign-looking domain names generated by an HMM that captures the statistical features of legitimate domain names. This can be used to evade Deep Packet Inspection (DPI) and protect user privacy in two-way communication. Both applications serve as examples of applying botnet technologies to legitimate uses. Finally, we proposed a new protocol obfuscation technique that transforms an arbitrary network protocol into another (the Network Time Protocol and Minecraft's video game protocol as examples) in terms of both packet syntax and side-channel features (inter-packet delay and packet size). This research uses botnet technologies to help ordinary users communicate securely and privately over the Internet. From our botnet research, we conclude that network traffic is a malleable, artificial construct. Although existing patterns are easy to detect and characterize, they are also subject to modification and mimicry. This means that we can construct transducers to make any communication pattern look like any other communication pattern. This is neither bad nor good for security; it is a fact that we need to accept and use as best we can.
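The core DGA idea above can be illustrated with a minimal sketch. This is not the thesis's actual HMM or PCFG generator; it is a simplified letter-bigram Markov chain trained on a tiny hypothetical corpus of legitimate domains, with the RNG seed standing in for the secret shared between botmaster and bots:

```python
import random
from collections import defaultdict

def train_bigram_model(domains):
    """Count letter-to-letter transitions in known-good domain labels."""
    model = defaultdict(list)
    for d in domains:
        label = d.split(".")[0]
        for a, b in zip(label, label[1:]):
            model[a].append(b)
    return model

def generate_domain(model, seed, length=8, tld=".com"):
    """Walk the bigram chain to emit a plausible-looking label.
    The seed plays the role of the shared DGA secret: both ends
    derive the same rendezvous domain from it."""
    rng = random.Random(seed)
    letter = rng.choice(sorted(model))
    label = letter
    for _ in range(length - 1):
        nxt = model.get(letter)
        if not nxt:          # dead end: no observed successor
            break
        letter = rng.choice(nxt)
        label += letter
    return label + tld

# hypothetical training corpus of benign domains
corpus = ["google.com", "amazon.com", "wikipedia.org", "netflix.com"]
model = train_bigram_model(corpus)
print(generate_domain(model, seed=20161021))
```

Because the generation is deterministic given the seed, both communicating parties can independently compute the same sequence of rendezvous domains, which is the property fast-flux/DGA rendezvous schemes rely on.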
Developing reliable anomaly detection system for critical hosts: a proactive defense paradigm
Current host-based anomaly detection systems have limited accuracy and incur high processing costs. This is due to the need for processing massive audit data of the critical host(s) while detecting complex zero-day attacks which can leave minor, stealthy and dispersed artefacts. In this research study, this observation is validated using existing datasets and state-of-the-art algorithms related to the construction of the features of a host's audit data, such as the popular semantic-based extraction and decision engines, including Support Vector Machines, Extreme Learning Machines and Hidden Markov Models. There is a challenging trade-off between achieving accuracy with a minimum processing cost and processing massive amounts of audit data that can include complex attacks. Also, there is a lack of a realistic experimental dataset that reflects the normal and abnormal activities of current real-world computers.
This thesis investigates the development of new methodologies for host-based anomaly detection systems, with the specific aims of improving accuracy at a minimum processing cost while addressing several challenges: complex attacks which, in some cases, are only visible via a quantified computing resource (for example, the execution times of programs); the processing of massive amounts of audit data; the unavailability of a realistic experimental dataset; and the automatic minimization of the false positive rate while dealing with the dynamics of normal activities.
This study provides three original and significant contributions to this field of research, which represent a marked advance in its body of knowledge.
The first major contribution is the generation and release of a realistic intrusion detection systems dataset, as well as the development of a metric based on fuzzy qualitative modeling for embedding the possible quality of realism in a dataset's design process and assessing this quality in existing or future datasets.
The second key contribution is constructing and evaluating hidden host features to identify the subtle differences between the normal and abnormal artefacts of hosts' activities at a minimum processing cost. Linux-centric features include the frequencies and ranges, frequency-domain representations and Gaussian interpretations of system call identifiers with execution times, while, for Windows, a count of the distinct core Dynamic Link Library calls is identified as a hidden host feature.
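The frequency-style features described above can be sketched simply. This is an illustrative reduction, not the thesis's feature pipeline: it builds a normalized histogram of (hypothetical) system-call identifiers over a trace window and compares it to a normal baseline with an L1 distance:

```python
from collections import Counter

def syscall_frequency_features(trace, vocab):
    """Normalized frequency of each system-call identifier in a trace window."""
    counts = Counter(trace)
    total = len(trace) or 1
    return [counts.get(s, 0) / total for s in vocab]

def frequency_distance(f1, f2):
    """L1 distance between two frequency profiles; large values flag anomalies."""
    return sum(abs(a - b) for a, b in zip(f1, f2))

# hypothetical syscall-identifier traces (identifiers only, no execution times)
normal  = [3, 4, 4, 5, 3, 4, 5, 3]
suspect = [3, 11, 11, 11, 4, 11, 11, 5]
vocab = sorted(set(normal) | set(suspect))
baseline = syscall_frequency_features(normal, vocab)
print(frequency_distance(baseline, syscall_frequency_features(suspect, vocab)))
```

The appeal of such features is exactly the trade-off the abstract identifies: counting identifiers is far cheaper than semantic analysis of full audit records, while still separating many normal and abnormal windows.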
The final key contribution is the development of two new anomaly-based statistical decision engines for capitalizing on the potential of some of the suggested hidden features and reliably detecting anomalies. The first engine, which has a forensic module, is based on stochastic theories including hierarchical hidden Markov models, and the second is modeled using Gaussian Mixture Modeling and Correntropy. The results demonstrate that the proposed host features and engines are competent for meeting the identified challenges.
Service Quality Assessment for Cloud-based Distributed Data Services
The issue of less-than-100% reliability and trustworthiness of third-party-controlled cloud components (e.g., IaaS and SaaS components from different vendors) may lead to laxity in the QoS guarantees offered by a service-support system S to various applications. An example of S is a replicated data service that handles customer queries with fault-tolerance and performance goals. QoS laxity (i.e., SLA violations) may be inadvertent: say, due to the inability of system designers to model the impact of sub-system behaviors on the deliverable QoS. Sometimes, QoS laxity may even be intentional: say, to reap revenue-oriented benefits by cheating on resource allocations and/or excessive statistical sharing of system resources (e.g., VM cycles, number of servers). Our goal is to assess how well the internal mechanisms of S are geared to offer a required level of service to the applications. We use computational models of S to determine the optimal feasible resource schedules and verify how close the actual system behavior is to a model-computed 'gold standard'. Our QoS assessment methods allow comparing different service vendors (possibly with different business policies) in terms of canonical properties such as elasticity, linearity, isolation, and fairness (analogous to a comparative rating of restaurants). Case studies of cloud-based distributed applications are described to illustrate our QoS assessment methods.
Specific systems studied in the thesis are: i) replicated data services, where the servers may be hosted on multiple data-centers for fault-tolerance and performance reasons; and ii) content delivery networks serving geographically distributed clients, where the content data caches may reside on different data-centers. The methods studied in the thesis are useful in various contexts of QoS management and self-configuration in large-scale cloud-based distributed systems that are inherently complex due to their size, diversity, and environment dynamicity.
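One of the canonical properties above, linearity, can be checked with a small sketch. This is an illustration under assumed data, not the thesis's assessment method: it correlates offered load with delivered throughput, where a score near 1.0 matches what an ideal model-computed 'gold standard' would predict and lower scores suggest saturation or over-shared resources:

```python
def linearity_score(loads, throughputs):
    """Pearson correlation between offered load and delivered throughput;
    values near 1.0 indicate the service scales linearly with load."""
    n = len(loads)
    mx, my = sum(loads) / n, sum(throughputs) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(loads, throughputs))
    vx = sum((x - mx) ** 2 for x in loads) ** 0.5
    vy = sum((y - my) ** 2 for y in throughputs) ** 0.5
    return cov / (vx * vy)

# hypothetical measurements from two vendors under increasing load
loads    = [100, 200, 300, 400]
vendor_a = [98, 197, 305, 401]   # tracks load closely: scales linearly
vendor_b = [95, 180, 210, 215]   # flattens out: resources over-shared
print(round(linearity_score(loads, vendor_a), 3))
print(round(linearity_score(loads, vendor_b), 3))
```

Ranking vendors by such canonical scores is what enables the restaurant-rating-style comparison the abstract describes.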
INTRUSION PREDICTION SYSTEM FOR CLOUD COMPUTING AND NETWORK BASED SYSTEMS
Cloud computing offers cost-effective computational and storage services with on-demand scalable capacities according to customers' needs. These properties encourage organisations and individuals from different disciplines to migrate from classical computing to cloud computing. Although cloud computing is a trendy technology that opens the horizons for many businesses, it is a paradigm that exploits already existing computing technologies in a new framework rather than being a novel technology. This means that cloud computing inherited classical computing problems that are still challenging. Cloud computing security is considered one of the major problems, requiring strong security systems to protect the system and the valuable data stored and processed in it. Intrusion detection systems are one of the important security components and defence layers that detect cyber-attacks and malicious activities in cloud and non-cloud environments. However, they have limitations: attacks are often detected only after their damage has already been done. In recent years, cyber-attacks have increased rapidly in volume and diversity. In 2013, for example, over 552 million customers' identities and crucial information were revealed through data breaches worldwide [3]. These growing threats are further demonstrated in the 50,000 daily attacks on the London Stock Exchange [4]. It has been predicted that the economic impact of cyber-attacks will cost the global economy $3 trillion on aggregate by 2020 [5]. This thesis focuses on proposing an Intrusion Prediction System that is capable of sensing an attack before it happens in cloud or non-cloud environments. The proposed solution is based on assessing the host system's vulnerabilities and monitoring the network traffic for attack preparations. It has three main modules. The monitoring module observes the network for any intrusion preparations.
This thesis proposes a new dynamic-selective statistical algorithm for detecting scan activities, which are part of the reconnaissance that represents an essential step in network attack preparation. The proposed method performs a statistical selective analysis of network traffic, searching for attack or intrusion indications. This is achieved by exploring and applying different statistical and probabilistic methods that deal with scan detection. The second module of the prediction system is vulnerability assessment, which evaluates the weaknesses and faults of the system and measures the probability of the system falling victim to a cyber-attack. Finally, the third module is the prediction module, which combines the output of the two modules and performs risk assessments of the system's security against predicted intrusions. The results of the conducted experiments showed that the suggested system outperforms analogous methods with regard to the performance of network scan detection, which accordingly means a significant improvement to the security of the targeted system. The scanning detection algorithm achieved high detection accuracy with a 0% false negative rate and a 50% false positive rate. In terms of performance, the detection algorithm consumed only 23% of the data needed for analysis compared to the best-performing rival detection method.
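The kind of scan indicator the monitoring module looks for can be sketched with a deliberately simple heuristic. This is not the thesis's dynamic-selective algorithm; it is the classic distinct-destination-port threshold, with all addresses and thresholds hypothetical:

```python
from collections import defaultdict

def detect_scanners(flows, port_threshold=10):
    """Flag sources contacting unusually many distinct destination ports,
    a classic indicator of reconnaissance port scanning."""
    ports_seen = defaultdict(set)
    for src, dst_port in flows:
        ports_seen[src].add(dst_port)
    return {src for src, ports in ports_seen.items() if len(ports) > port_threshold}

flows = [("10.0.0.5", p) for p in range(1, 30)]    # scanner probing ports 1-29
flows += [("10.0.0.7", 443), ("10.0.0.7", 80)]     # normal client
print(detect_scanners(flows))
```

A statistical selective approach improves on this by sampling and adaptively choosing which traffic to analyze, which is how the thesis's method reaches its reported 23% data-consumption figure.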
You Shall Not Pass! Measuring, Predicting, and Detecting Malware Behavior
Researchers have been fighting malicious behavior on the Internet for several decades. The arms race is far from over, but this PhD work is intended to be another step towards the goal of making the Internet a safer place. My PhD has focused on measuring, predicting, and detecting malicious behavior on the Internet; we directed our efforts along three different paths: establishing causality relations in malicious actions, predicting the actions taken by an attacker, and detecting malicious software. This work tried to understand the causes of malicious behavior in different scenarios (sandboxing, web browsing) by applying a novel statistical framework and statistical tests to determine what triggers malware. We also used deep learning algorithms to predict what actions an attacker would perform, with the goal of anticipating and countering the attacker's moves. Moreover, we worked on malware detection for Android by modeling sequences of API calls with Markov chains and applying machine learning algorithms to classify benign and malicious apps. The methodology, design, and results of our research represent relevant state of the art in the field; we will go through the different contributions we worked on during my PhD to explain the design choices, the statistical methods and the takeaways characterizing them. We will show how these systems have an impact on current tool development and future research trends.
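The Markov-chain modeling of API call sequences mentioned above can be sketched as follows. The trace and call names are hypothetical, and this is only the feature-extraction step, not the full classifier: transition probabilities estimated from an app's API sequence can be flattened into a feature vector for a downstream machine learning model:

```python
from collections import defaultdict

def transition_matrix(api_sequence):
    """Estimate Markov transition probabilities over an API-call sequence.
    The flattened matrix serves as the feature vector of an app."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(api_sequence, api_sequence[1:]):
        counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

# hypothetical abstracted API trace from an Android app
trace = ["open", "read", "read", "close", "open", "read", "close"]
probs = transition_matrix(trace)
print(probs["read"])
```

The intuition is that benign and malicious apps tend to traverse the API space differently, so their transition matrices separate well under standard classifiers.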
Analysis and design of security mechanisms in the context of Advanced Persistent Threats against critical infrastructures
Industry 4.0 can be defined as the digitization of all components within industry, combining productive processes with leading information and communication technologies. While this integration has several benefits, it has also facilitated the emergence of several attack vectors, which can be leveraged to perpetrate sophisticated attacks such as an Advanced Persistent Threat (APT) that ultimately disrupts and damages critical infrastructure operations with severe impact.
This doctoral thesis aims to study and design security mechanisms capable of detecting and tracing APTs to ensure the continuity of the production line. Although the basic tools to detect individual attack vectors of an APT have already been developed, it is important to integrate holistic defense solutions in existing critical infrastructures that are capable of addressing all potential threats. Additionally, it is necessary to prospectively analyze the requirements that these systems have to satisfy after the integration of novel services in the upcoming years.
To fulfill these goals, we define a framework for the detection and traceability of APTs in Industry 4.0 that aims to fill the gap between classic security mechanisms and APTs. The premise is to retrieve data about the production chain at all levels in order to correlate events in a distributed way, enabling the traceability of an APT throughout its entire life cycle. Ultimately, these mechanisms make it possible to holistically detect and anticipate attacks in a timely and autonomous way, to deter their propagation and minimize their impact. As a means to validate this framework, we propose some correlation algorithms that implement it (such as the Opinion Dynamics solution) and carry out different experiments that compare the accuracy of response techniques that take advantage of these traceability features. Similarly, we conduct a study on the feasibility of these detection systems in various Industry 4.0 scenarios.
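The distributed-correlation idea behind opinion-dynamics-style detection can be illustrated with a toy sketch. This is a generic bounded-confidence opinion update over hypothetical agents, not the thesis's Opinion Dynamics algorithm: each sensor agent averages its anomaly opinion with like-minded neighbors, so agents covering an attacked zone converge on a high opinion while the rest stay low, localizing the APT:

```python
def opinion_dynamics(opinions, neighbors, rounds=50, epsilon=0.3):
    """Bounded-confidence update: each agent averages the opinions of
    neighbors within epsilon of its own view, so distinct clusters
    (normal zone vs. attacked zone) remain separated."""
    for _ in range(rounds):
        new = []
        for i, o in enumerate(opinions):
            close = [opinions[j] for j in neighbors[i]
                     if abs(opinions[j] - o) <= epsilon] + [o]
            new.append(sum(close) / len(close))
        opinions = new
    return opinions

# 4 agents in a line; agents 2-3 observe anomalous activity (opinion near 1.0)
opinions = [0.0, 0.1, 0.9, 1.0]
neighbors = [[1], [0, 2], [1, 3], [2]]
print([round(o, 2) for o in opinion_dynamics(opinions, neighbors)])
```

The persistence of two opinion clusters, rather than a single global average, is what allows tracing which segment of the production chain the threat is moving through.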
Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods
Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are: being able to detect unknown attacks, and reducing the false positive ratio. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques.
The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns.
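The skip-gram idea transfers to intrusion detection by treating events like words. The sketch below shows only the training-pair extraction step on a hypothetical event sequence (the embedding training and scoring are omitted, and this is not the thesis's implementation): an embedding trained on such pairs places co-occurring events close together, so rare event combinations stand out as anomalous.

```python
def skipgram_pairs(events, window=2):
    """Generate (target, context) training pairs from an event sequence,
    exactly as skip-gram does for words in a sentence."""
    pairs = []
    for i, target in enumerate(events):
        for j in range(max(0, i - window), min(len(events), i + window + 1)):
            if j != i:
                pairs.append((target, events[j]))
    return pairs

# hypothetical session of abstracted log events
session = ["login", "list_dir", "read_file", "logout"]
print(skipgram_pairs(session, window=1))
```

Because the method learns from co-occurrence alone, it needs no labeled attack data, which is what makes it suitable for detecting unknown attacks.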
The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payments fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity significantly differs from one domain to the other.
The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method (random forest). The obtained results showed that this technique can outperform other similar techniques.
The fourth technique is based on an MLP neural network and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy.
A Quantitative Research Study on Probability Risk Assessments in Critical Infrastructure and Homeland Security
This dissertation encompassed quantitative research on probabilistic risk assessment (PRA) elements in homeland security and their impact on critical infrastructure and key resources. There are 16 critical infrastructure sectors in homeland security, representing assets, system networks, virtual and physical environments, roads and bridges, transportation, and air travel. The design included Bayes' theorem, a process used in PRAs when determining potential or probable events, causes, outcomes, and risks. The goal is to mitigate the effects of domestic terrorism and natural and man-made disasters, respond to events related to critical infrastructure that can impact the United States, and help protect and secure natural gas pipelines and electrical grid systems. This study provides data from current risk assessment trends in PRAs that can be applied and designed in elements of homeland security and the criminal justice system to help protect critical infrastructures. The dissertation highlights aspects of the U.S. Department of Homeland Security National Infrastructure Protection Plan (NIPP). In addition, this framework was employed to examine the criminal justice triangle, explore crime problems and emergency preparedness solutions to protect critical infrastructures, and analyze data relevant to risk assessment procedures for each critical infrastructure identified. Finally, the study addressed the drivers and gaps in research related to protecting and securing natural gas pipelines and electrical grid systems.
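The role of Bayes' theorem in a PRA can be made concrete with a small worked sketch. The numbers are entirely hypothetical, chosen only to illustrate the base-rate effect: even a sensitive detector produces a low posterior threat probability when the prior probability of attack is tiny.

```python
def posterior(prior, sensitivity, false_alarm):
    """Bayes' theorem: P(threat | alert) from the base rate P(threat),
    the detector's sensitivity P(alert | threat), and its false-alarm
    rate P(alert | no threat)."""
    p_alert = sensitivity * prior + false_alarm * (1 - prior)
    return sensitivity * prior / p_alert

# hypothetical: 1-in-10,000 base rate, 99% sensitivity, 2% false alarms
print(round(posterior(0.0001, 0.99, 0.02), 4))
```

This is why PRA frameworks chain such updates across multiple indicators: a single alert barely moves the risk estimate, but several independent ones can.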
Blockchain-based Digital Twins: Research Trends, Issues, and Future Challenges
Industrial processes rely on sensory data for decision-making, risk assessment, and performance evaluation. Extracting actionable insights from the collected data calls for an infrastructure that can ensure the dissemination of trustworthy data. For the physical data to be trustworthy, it needs to be cross-validated through multiple sensor sources with overlapping fields of view. Cross-validated data can then be stored on the blockchain to maintain its integrity and trustworthiness. Once trustworthy data is recorded on the blockchain, product lifecycle events can be fed into data-driven systems for process monitoring, diagnostics, and optimized control. In this regard, digital twins (DTs) can be leveraged to draw intelligent conclusions from data by identifying faults and recommending precautionary measures ahead of critical events. Empowering DTs with blockchain in industrial use cases targets key challenges of disparate data repositories, untrustworthy data dissemination, and the need for predictive maintenance. In this survey, while highlighting the key benefits of using blockchain-based DTs, we present a comprehensive review of the state-of-the-art research results for blockchain-based DTs. Based on the current research trends, we discuss a trustworthy blockchain-based DTs framework. We also highlight the role of artificial intelligence in blockchain-based DTs. Furthermore, we discuss the current and future research and deployment challenges of blockchain-supported DTs that require further investigation.
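The integrity property that blockchain storage provides for DT sensor data can be illustrated with a minimal hash-chain sketch. This is a toy ledger, not any production blockchain, and the sensor readings are hypothetical; it shows only the tamper-evidence mechanism that the survey's trustworthiness argument rests on:

```python
import hashlib
import json

def add_block(chain, sensor_reading):
    """Append a tamper-evident block: each block's hash covers its data
    and the previous block's hash, so altering any stored reading
    invalidates every later link."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps({"data": sensor_reading, "prev": prev_hash}, sort_keys=True)
    chain.append({"data": sensor_reading, "prev": prev_hash,
                  "hash": hashlib.sha256(payload.encode()).hexdigest()})
    return chain

def verify(chain):
    """Recompute every hash link; False means the ledger was tampered with."""
    for i, block in enumerate(chain):
        payload = json.dumps({"data": block["data"], "prev": block["prev"]},
                             sort_keys=True)
        if hashlib.sha256(payload.encode()).hexdigest() != block["hash"]:
            return False
        if i > 0 and block["prev"] != chain[i - 1]["hash"]:
            return False
    return True

chain = []
add_block(chain, {"sensor": "temp-1", "value": 71.3})
add_block(chain, {"sensor": "temp-2", "value": 70.9})
print(verify(chain))                  # True: untouched ledger
chain[0]["data"]["value"] = 99.9      # tamper with a stored reading
print(verify(chain))                  # False: hash no longer matches
```

Real deployments add consensus and distribution on top of this structure, but the chained hashes are what let a DT trust that historical lifecycle data has not been rewritten.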