4 research outputs found

    Resilience Strategies for Network Challenge Detection, Identification and Remediation

    The enormous growth of the Internet and its use in everyday life make it an attractive target for malicious users. As the network becomes more complex and sophisticated, it becomes more vulnerable to attack. There is a pressing need for the future Internet to be resilient, manageable and secure. Our research is on distributed challenge detection and is part of the EU ResumeNet project (Resilience and Survivability for Future Networking: Framework, Mechanisms and Experimental Evaluation), which aims to make networks more resilient to a wide range of challenges, including malicious attacks, misconfiguration, faults, and operational overloads. Resilience is the ability of the network to provide an acceptable level of service in the face of significant challenges; it is a superset of commonly used definitions of survivability, dependability, and fault tolerance. Our proposed resilience strategy detects a challenge situation by identifying its occurrence and impact in real time, then initiating appropriate remedial action. Action is taken autonomously to continue operations as far as possible and to mitigate the damage, allowing an acceptable level of service to be maintained. The contribution of our work is the ability to mitigate a challenge as early as possible and to rapidly detect its root cause. Our proposed multi-stage, policy-based challenge detection system also identifies both existing and unforeseen challenges; this has been studied and demonstrated with an unknown worm attack. Our multi-stage approach reduces computational complexity compared with the traditional single-stage approach, in which one particular managed object is responsible for all the functions. The approach we propose in this thesis has the flexibility, scalability, adaptability, reproducibility and extensibility needed to assist in the identification and remediation of many future network challenges.
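The staged detection idea in this abstract can be illustrated with a minimal sketch. All stage names, fields, and thresholds below are hypothetical, not taken from the thesis: a cheap first stage flags anomalous traffic, and a costlier second stage runs only on flagged samples to separate an overload from worm-like scanning.

```python
# Hypothetical multi-stage, policy-based challenge classification.
# Thresholds and field names are illustrative assumptions only.

def stage_volume_anomaly(sample, threshold=1000):
    """Stage 1: cheap check -- flag hosts whose packet rate is abnormal."""
    return sample["pkts_per_sec"] > threshold

def stage_scan_pattern(sample, max_unique_dsts=50):
    """Stage 2: costlier check -- worm-like scanning touches many hosts."""
    return sample["unique_dsts"] > max_unique_dsts

def classify(sample):
    """Run stages in order; later stages run only if earlier ones trigger."""
    if not stage_volume_anomaly(sample):
        return "normal"
    if not stage_scan_pattern(sample):
        return "overload"          # high volume but no scanning pattern
    return "suspected-worm"        # both stages triggered

print(classify({"pkts_per_sec": 2000, "unique_dsts": 300}))  # suspected-worm
print(classify({"pkts_per_sec": 2000, "unique_dsts": 5}))    # overload
```

Because most samples are rejected by the cheap first stage, the expensive checks run rarely, which is the computational saving the multi-stage design targets over a single monolithic detector.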

    Dynamic reliability and security monitoring: a virtual machine approach

    While one always works to prevent attacks and failures, they are inevitable, and situational awareness is key to taking appropriate action. Monitoring plays an integral role in ensuring the reliability and security of computing systems. Infrastructure as a Service (IaaS) clouds significantly lower the barrier for obtaining scalable computing resources and allow users to focus on what is important to them. Can a similar service be offered to provide on-demand reliability and security monitoring? Cloud computing systems are typically built using virtual machines (VMs). VM monitoring takes advantage of this and uses the hypervisor that runs VMs for robust reliability and security monitoring. The hypervisor provides an environment that is isolated from failures and attacks inside customers’ VMs. Furthermore, as a low-level manager of computing resources, the hypervisor has full access to the infrastructure running above it. Hypervisor-based VM monitoring leverages that information to observe the VMs for failures and attacks. However, existing VM monitoring techniques fall short of “as-a-service” expectations because they require a priori VM modifications and require human interaction to obtain necessary information about the underlying guest system. The research presented in this dissertation closes those gaps by providing a flexible VM monitoring framework and automated analysis to support that framework. We have developed and tested a dynamic VM monitoring framework called Hypervisor Probes (hprobes). The hprobe framework allows us to monitor the execution of both the guest OS and applications from the hypervisor. To supplement this monitoring framework, we use dynamic analysis techniques to investigate the relationship between hardware events visible to the hypervisor and OS constructs common across OS versions. We use the results of this analysis to parametrize the hprobe-based monitors without requiring any user input. Combining the dynamic VM monitoring framework and the analysis framework allows us to provide on-demand, hypervisor-based monitors for cloud VMs.
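A probe framework of the kind described can be sketched in miniature. This is not the real hprobe API: the class, the simulated dispatch call, and the guest address are all invented for illustration. The idea shown is breakpoint-style probing, where handlers registered on guest addresses fire when the (here, simulated) hypervisor traps execution at that address.

```python
# Toy probe monitor, loosely mimicking hypervisor-based VM probing.
# Names, addresses, and the vcpu_state dict are illustrative assumptions.

class ProbeMonitor:
    def __init__(self):
        self.probes = {}   # guest address -> list of handler callbacks

    def add_probe(self, address, handler):
        self.probes.setdefault(address, []).append(handler)

    def remove_probe(self, address):
        self.probes.pop(address, None)

    def on_exec(self, address, vcpu_state):
        """Called by the (simulated) hypervisor on a trapped instruction."""
        for handler in self.probes.get(address, []):
            handler(vcpu_state)

events = []
mon = ProbeMonitor()
# Suppose automated analysis located the guest's sys_open entry here:
SYS_OPEN = 0xFFFFFFFF81200000
mon.add_probe(SYS_OPEN, lambda st: events.append(("open", st["rdi"])))
mon.on_exec(SYS_OPEN, {"rdi": 42})
mon.on_exec(0xDEADBEEF, {"rdi": 0})   # no probe here; nothing fires
print(events)  # [('open', 42)]
```

The point of the automated analysis described above is to fill in constants like `SYS_OPEN` per guest kernel without asking the user, so that such monitors can be attached on demand.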

    Efficient traffic trajectory error detection

    Our recent survey on publicly reported router bugs shows that many router bugs, once triggered, can cause various traffic trajectory errors, including traffic deviating from its intended forwarding paths, traffic being mistakenly dropped, and unauthorized traffic bypassing packet filters. These traffic trajectory errors are serious problems because they may cause network applications to fail and create security loopholes for network intruders to exploit. Therefore, traffic trajectory errors must be quickly and efficiently detected so that corrective action can be performed in a timely fashion. Detecting traffic trajectory errors requires real-time tracking of the control states (e.g., forwarding tables, packet filters) of routers and scalable monitoring of the actual traffic trajectories in the network. Traffic trajectory errors can then be detected by efficiently comparing the observed traffic trajectories against the intended control states. Making such trajectory error detection efficient and practical for large-scale, high-speed networks requires us to address many challenges. First, existing traffic trajectory monitoring algorithms require the simultaneous monitoring of all network interfaces in a network for the packets of interest, which incurs a daunting monitoring overhead. To improve the efficiency of traffic trajectory monitoring, we propose the router group monitoring technique, which monitors only the periphery interfaces of a set of selected router groups. We analyze a large number of real network topologies and show that effective router groups with high trajectory error detection rates exist in all cases. We then develop an analytical model for quickly and accurately estimating the detection rates of different router groups. Based on this model, we propose an algorithm to select a set of router groups that achieves complete error detection with low monitoring overhead.
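The periphery-monitoring idea can be illustrated with a toy sketch. The topology, group, and paths below are made up, and detection is reduced to its essence: an error is visible only if the intended and actual paths cross the group's boundary links differently.

```python
# Toy router-group monitoring: watch only links crossing a group boundary.
# Topology, group choice, and paths are illustrative assumptions.

def periphery(edges, group):
    """Links with exactly one endpoint inside the router group."""
    return {e for e in edges if (e[0] in group) != (e[1] in group)}

def crossings(path, boundary):
    """Boundary links a path traverses (order-normalized)."""
    hops = [tuple(sorted(h)) for h in zip(path, path[1:])]
    return sorted(h for h in hops if h in boundary)

def detects(edges, group, intended, actual):
    """Error is visible iff the two paths differ at the group periphery."""
    b = {tuple(sorted(e)) for e in periphery(edges, group)}
    return crossings(intended, b) != crossings(actual, b)

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d"), ("a", "c")]
group = {"b", "c"}                 # monitor only links into/out of {b, c}
intended = ["a", "b", "c", "d"]
wrong    = ["a", "c", "d"]         # traffic deviates, skipping router b
print(detects(edges, group, intended, wrong))  # True
```

Choosing a group whose periphery is never crossed distinctly by erroneous paths would miss errors, which is why the thesis models per-group detection rates and selects a set of groups rather than a single one.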
    Second, maintaining the control states of all the routers in the network requires a significant amount of memory. However, there are no existing studies on how to efficiently store multiple complex packet filters. We propose to store multiple packet filters using a shared HyperCuts decision tree. To help decide which subset of packet filters should share a HyperCuts decision tree, we first identify a number of important factors that collectively impact the efficiency of the resulting shared HyperCuts decision tree. Based on the identified factors, we then propose to use machine learning techniques to predict whether any pair of packet filters should share a tree. Given the pair-wise prediction matrix, a greedy heuristic algorithm is used to classify packet filters into a number of shared HyperCuts decision trees. Our experiments using both real and synthetic packet filters show that our shared HyperCuts decision trees require considerably less memory while having the same or a slightly higher average height than separate trees. In addition, the shared HyperCuts decision trees enable concurrent lookup of multiple packet filters sharing the same tree. Finally, based on the two proposed techniques, we have implemented a complete prototype system that is compatible with Juniper's JUNOS. We show in the thesis that, to detect traffic trajectory errors, it is sufficient to selectively implement only a small set of key functions of a full-fledged router on our prototype, which makes our prototype simpler and less error-prone. We conduct both Emulab experiments and micro-benchmark experiments to show that the system can efficiently track router control states, monitor traffic trajectories and detect traffic trajectory errors.
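The greedy grouping step described above can be sketched as follows. The pairwise predictor here is hand-made, standing in for the learned model, and the exact heuristic in the thesis may differ: each filter joins the first existing group whose members all predict "share", otherwise it starts a new group.

```python
# Greedy grouping of packet filters into shared decision trees, given a
# pairwise share/no-share prediction. The predictor below is a stand-in
# for the learned model, not the real one.

def group_filters(n, should_share):
    """should_share(i, j) -> bool; returns a list of filter-index groups."""
    groups = []
    for f in range(n):
        for g in groups:
            if all(should_share(f, member) for member in g):
                g.append(f)
                break
        else:
            groups.append([f])   # no compatible group: start a new tree
    return groups

# Toy prediction: filters with the same parity are predicted to share well.
share = lambda i, j: (i % 2) == (j % 2)
print(group_filters(6, share))  # [[0, 2, 4], [1, 3, 5]]
```

Each resulting group would then be compiled into one shared HyperCuts-style tree, trading a possibly slightly taller tree for substantially less total memory.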

    A Topological Analysis of Monitor Placement
