
    Resilience Strategies for Network Challenge Detection, Identification and Remediation

    The enormous growth of the Internet and its use in everyday life make it an attractive target for malicious users. As the network becomes more complex and sophisticated, it becomes more vulnerable to attack. There is a pressing need for the future Internet to be resilient, manageable and secure. Our research is on distributed challenge detection and is part of the EU ResumeNet project (Resilience and Survivability for Future Networking: Framework, Mechanisms and Experimental Evaluation). It aims to make networks more resilient to a wide range of challenges including malicious attacks, misconfiguration, faults, and operational overloads. Resilience means the ability of the network to provide an acceptable level of service in the face of significant challenges; it is a superset of commonly used definitions for survivability, dependability, and fault tolerance. Our proposed resilience strategy can detect a challenge situation by identifying its occurrence and impact in real time, and then initiate appropriate remedial action. Action is taken autonomously to continue operations as far as possible and to mitigate the damage, allowing an acceptable level of service to be maintained. The contribution of our work is the ability to mitigate a challenge as early as possible and rapidly detect its root cause. In addition, our proposed multi-stage policy-based challenge detection system identifies both existing and unforeseen challenges. This has been studied and demonstrated with an unknown worm attack. Our multi-stage approach reduces the computational complexity compared to the traditional single-stage approach, where one particular managed object is responsible for all the functions. The approach we propose in this thesis has the flexibility, scalability, adaptability, reproducibility and extensibility needed to assist in the identification and remediation of many future network challenges.

    DISco: a Distributed Information Store for network Challenges and their Outcome

    We present DISco, a storage and communication middleware designed to enable distributed and task-centric autonomic control of networks. DISco is designed to enable multi-agent identification of anomalous situations -- so-called "challenges" -- and to assist coordinated remediation that maintains a degraded -- but acceptable -- service level, while keeping track of the challenge evolution in order to enable human-assisted diagnosis of flaws in the network. We propose to use state-of-the-art peer-to-peer publish/subscribe and distributed storage as core building blocks for the DISco service.
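
The publish/subscribe-plus-storage idea in this abstract can be illustrated with a minimal, single-process sketch. It is not DISco's actual interface: the class `ChallengeStore`, its methods, the topic string and the event fields are all hypothetical names chosen for illustration, and a real deployment would be distributed and peer-to-peer rather than an in-memory dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ChallengeStore:
    """Hypothetical in-process stand-in for a pub/sub store of challenge events."""

    def __init__(self) -> None:
        # topic -> callbacks of agents interested in that kind of challenge
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)
        # topic -> every event ever published, kept for later human diagnosis
        self._history: Dict[str, List[dict]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: dict) -> None:
        # Store first, then notify, so the record survives a failing subscriber.
        self._history[topic].append(event)
        for callback in self._subscribers[topic]:
            callback(event)

    def history(self, topic: str) -> List[dict]:
        # Return a copy so callers cannot mutate the stored record.
        return list(self._history[topic])


# Usage: one agent reports a challenge, another agent is notified.
store = ChallengeStore()
received: List[dict] = []
store.subscribe("challenge/link-overload", received.append)
store.publish("challenge/link-overload", {"node": "r1", "severity": "high"})
```

Keeping the event history separate from notification mirrors the abstract's two roles: coordinating remediation among agents and keeping track of how a challenge evolves.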

    Multi-level network resilience: traffic analysis, anomaly detection and simulation

    Traffic analysis and anomaly detection have been extensively used to characterize network utilization as well as to identify abnormal network traffic such as malicious attacks. However, so far, techniques for traffic analysis and anomaly detection have been carried out independently, relying on mechanisms and algorithms either in edge or in core networks alone. In this paper we propose the notion of multi-level network resilience, in order to provide a more robust traffic analysis and anomaly detection architecture, combining mechanisms and algorithms operating in a coordinated fashion both in the edge and in the core networks. This work is motivated by the potential complementarities between the research being developed at IIT Madras and Lancaster University. In this paper we describe the current work being developed at IIT Madras and Lancaster on traffic analysis and anomaly detection, and outline the principles of a multi-level resilience architecture.

    On power system automation: a Digital Twin-centric framework for the next generation of energy management systems

    The ubiquitous digital transformation also influences power system operation. Emerging real-time applications in information (IT) and operational technology (OT) provide new opportunities to address the increasingly demanding power system operation imposed by the progressing energy transition. This IT/OT convergence is epitomised by the novel Digital Twin (DT) concept. By integrating sensor data into analytical models and aligning the model states with the observed system, a power system DT can be created. As a result, a validated high-fidelity model is derived, which can be applied within the next generation of energy management systems (EMS) to support power system operation. By providing a consistent and maintainable data model, the modular DT-centric EMS proposed in this work addresses several key requirements of modern EMS architectures. It increases situation awareness in the control room, enables the implementation of model maintenance routines, and facilitates automation approaches, while raising confidence in operational decisions deduced from the validated model. This gain in trust contributes to the digital transformation and enables a higher degree of power system automation. By considering operational planning and power system operation processes, a direct link to practice is ensured. The feasibility of the concept is examined by numerical case studies. The electrical power system is in the process of an extensive transformation. Driven by the energy transition towards renewable energy resources, many conventional power plants in Germany have already been decommissioned or will be decommissioned within the next decade. Among other things, these changes lead to an increased utilisation of power transmission equipment, and an increasing number of complex dynamic phenomena. 
The resulting system operation closer to physical boundaries leads to an increased susceptibility to disturbances, and to a reduced time span to react to critical contingencies and perturbations. As a consequence, the task of operating the power system will become increasingly demanding. As some reactions to disturbances may be required within timeframes that exceed human capabilities, these developments are intrinsic drivers to enable a higher degree of automation in power system operation. This thesis proposes a framework to create a modular Digital Twin-centric energy management system. It enables the provision of validated and trustworthy models built from knowledge about the power system derived from physical laws, and process data. As the interaction of information and operational technologies is combined in the concept of the Digital Twin, it can serve as a framework for future energy management systems including novel applications for power system monitoring and control, which consider power system dynamics. To provide a validated high-fidelity dynamic power system model, high-resolution time-synchronised phasor measurements are applied for validation and parameter estimation. This increases trust in the underlying power system model as well as confidence in operational decisions derived from advanced analytic applications such as online dynamic security assessment. By providing an appropriate, consistent, and maintainable data model, the framework addresses several key requirements of modern energy management system architectures, while enabling the implementation of advanced automation routines and control approaches. Future energy management systems can provide increased observability based on the proposed architecture, whereby the situational awareness of human operators in the control room can be improved. 
In further development stages, cognitive systems can be applied that are able to learn from the data provided, e.g., machine learning based analytical functions. Thus, the framework enables a higher degree of power system automation, as well as the deployment of assistance and decision support functions for power system operation. The framework represents a contribution to the digital transformation of power system operation and facilitates a successful energy transition. The feasibility of the concept is examined by case studies in the form of numerical simulations to provide a proof of concept.
The electrical power system is undergoing an extensive transformation process. Owing to the advancing energy transition and the increasing use of renewable energy sources, many conventional power plants in Germany have already been decommissioned or will be decommissioned in the coming years. Among other things, these changes lead to higher equipment utilisation and to reduced system inertia, and thus to a growing number of complex dynamic phenomena in the electrical power system. Operating the system closer to its physical limits furthermore leads to an increased susceptibility to disturbances and to a shortened time span for reacting to critical events and disturbances. As a consequence, the task of operating the power grid becomes more demanding. Particularly where reaction times are required that exceed human capabilities, these changes are intrinsic drivers towards a higher degree of automation in grid operation and system control. Emerging real-time applications in information and operational technologies and a growing volume of high-resolution sensor data enable new approaches to the design and operation of cyber-physical systems. 
A promising approach recently discussed in this context is the concept of the so-called Digital Twin. Because the interplay of information and operational technologies is united in the Digital Twin concept, it can serve as the basis for a future control system architecture and for novel control applications. This thesis develops a framework that makes a Digital Twin usable, within a novel modular control system architecture, for the task of monitoring and controlling future energy systems. In addition to the existing functions of modern network control systems, the concept supports the representation of network dynamics on the basis of a dynamic network model. To enable a realistic representation of network dynamics, time-synchronised phasor measurements are used for model validation and model parameter estimation. This increases the informative value of security analyses as well as trust in the models used to generate operational decisions. By providing a validated, consistent, and maintainable data model based on physical laws and on process data gathered during operation, the presented architecture design addresses several key requirements of modern network control systems. The framework thus enables a higher degree of automation in power grid operation as well as the use of decision support functions, up to trustworthy assistance systems based on cognitive systems. These functions can increase operational security and represent an important contribution to implementing the digital transformation of power grid operation and to the successful realisation of the energy transition. 
The presented concept is examined on the basis of numerical simulations, with its fundamental feasibility demonstrated by case studies.

    The Role of a Microservice Architecture on cybersecurity and operational resilience in critical systems

    Critical systems are characterized by their high degree of intolerance to threats, in other words, their high level of resilience, because depending on the context in which the system is inserted, the slightest failure could imply significant damage, whether in economic terms, or loss of reputation, of information, of infrastructure, of the environment, or human life. The security of such systems is traditionally associated with legacy infrastructures and data centers that are monolithic, which translates into increasingly high evolution and protection challenges. In the current context of rapid transformation where the variety of threats to systems has been consistently increasing, this dissertation aims to carry out a compatibility study of the microservice architecture, which is denoted by its characteristics such as resilience, scalability, modifiability and technological heterogeneity, being flexible in structural adaptations, and in rapidly evolving and highly complex settings, making it suited for agile environments. It also explores what response artificial intelligence, more specifically machine learning, can provide in a context of security and monitorability when combined with a simple banking system that adopts the microservice architecture.
Critical systems are characterised by their high degree of intolerance to threats, in other words their high level of resilience, since, depending on the context in which the system operates, the slightest failure may cause significant damage, whether economic or in terms of loss of reputation, information, infrastructure, the environment, or human life. The IT security of such systems is traditionally associated with legacy infrastructures and data centers, that is, ones of a monolithic nature, which translates into ever greater evolution and protection challenges. In the current context of rapid transformation, where the variety of threats to systems has been consistently increasing, this dissertation aims to carry out a compatibility study of the microservice architecture, which is distinguished by characteristics such as resilience, scalability, modifiability and technological heterogeneity, being flexible in structural adaptations and in rapidly evolving, highly complex scenarios, which makes it suitable for agile environments. It also explores the response that artificial intelligence, more specifically machine learning, can give in a security and monitorability context when combined with a simple banking system that adopts a microservice architecture.

    Anomaly detection for resilience in cloud computing infrastructures

    Cloud computing is a relatively recent model where scalable and elastic resources are provided as optimized, cost-effective and on-demand utility-like services to customers. As one of the major trends in the IT industry in recent years, cloud computing has gained momentum and started to revolutionise the way enterprises create and deliver IT solutions. Motivated primarily by cost reduction, these cloud environments are also being used by Information and Communication Technologies (ICT) operating Critical Infrastructures (CI). However, due to the complex nature of underlying infrastructures, these environments are subject to a large number of challenges, including mis-configurations, cyber attacks and malware instances, which manifest themselves as anomalies. These challenges clearly reduce the overall reliability and availability of the cloud, i.e., they make it less resilient. Resilience is intended to be a fundamental property of cloud service provisioning platforms. However, a number of significant challenges in the past demonstrated that cloud environments are not as resilient as one would hope. There is also limited understanding about how to provide resilience in the cloud that can address such challenges. This implies that it is of utmost importance to clearly understand and define what constitutes the correct, normal behaviour so that deviation from it can be detected as anomalies and consequently higher resilience can be achieved. Anomaly detection techniques can also be used to characterise and identify challenges, because the statistical models embodied in these techniques allow the robust characterisation of normal behaviour, taking into account various monitoring metrics to detect known and unknown patterns. These anomaly detection techniques can also be applied within a resilience framework in order to promptly provide indications and warnings about adverse events or conditions that may occur. 
However, due to the scale and complexity of cloud, detection based on continuous real time infrastructure monitoring becomes challenging. Because monitoring leads to an overwhelming volume of data, this adversely affects the ability of the underlying detection mechanisms to analyse the data. The increasing volume of metrics, compounded with complexity of infrastructure, may also cause low detection accuracy. In this thesis, a comprehensive evaluation of anomaly detection techniques in cloud infrastructures is presented under typical elastic behaviour. More specifically, an investigation of the impact of live virtual machine migration on state of the art anomaly detection techniques is carried out, by evaluating live migration under various attack types and intensities. An initial comparison concludes that, whilst many detection techniques have been proposed, none of them is suited to work within a cloud operational context. The results suggest that in some configurations anomalies are missed and some configuration anomalies are wrongly classified. Moreover, some of these approaches have been shown to be sensitive to parameters of the datasets such as the level of traffic aggregation, and they suffer from other robustness problems. In general, anomaly detection techniques are founded on specific assumptions about the data, for example the statistical distributions of events. If these assumptions do not hold, an outcome can be high false positive rates. Based on this initial study, the objective of this work is to establish a light-weight real time anomaly detection technique which is more suited to a cloud operational context by keeping low false positive rates without the need for prior knowledge and thus enabling the administrator to respond to threats effectively. 
Furthermore, a technique is needed which is robust to the properties of cloud infrastructures, such as elasticity and limited knowledge of the services, and which can support other resilience mechanisms. From this formulation, a cloud resilience management framework is proposed which incorporates the anomaly detection and other supporting mechanisms that collectively address challenges that manifest themselves as anomalies. The framework is a holistic end-to-end framework for resilience that considers both networking and system issues, and spans the various stages of an existing resilience strategy called D²R² + DR. With regard to the operational applicability of detection mechanisms, a novel Anomaly Detection-as-a-Service (ADaaS) architecture has been modelled as the means to implement the detection technique. A series of experiments was conducted to assess the effectiveness of the proposed technique for ADaaS. These aimed to improve the viability of implementing the system in an operational context. Finally, the proposed model is deployed in a European Critical Infrastructure provider’s network running various critical services, and the results are validated in real-time scenarios using various test cases, demonstrating the advantages of such a model in an operational context. The obtained results show that anomalies are detectable with high accuracy with no prior knowledge, and it can be concluded that ADaaS is applicable to cloud scenarios as a flexible multi-tenant detection system, clearly establishing its effectiveness for cloud infrastructure resilience.
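
To make the idea of light-weight, real-time detection without prior knowledge concrete, the following is a minimal sketch of a streaming detector. It is not the technique developed in the thesis, which the abstract does not specify. It assumes a single numeric metric and uses an exponentially weighted moving average (EWMA) of mean and variance; the class name `EwmaDetector` and the parameters `alpha`, `threshold` and `warmup` are hypothetical choices for illustration.

```python
import math
from typing import List


class EwmaDetector:
    """Illustrative streaming detector: flags samples far from a running EWMA baseline."""

    def __init__(self, alpha: float = 0.1, threshold: float = 3.0, warmup: int = 5) -> None:
        self.alpha = alpha          # weight given to the newest sample
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.warmup = warmup        # samples observed before any flagging
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, x: float) -> bool:
        """Consume one sample; return True if it is flagged as anomalous."""
        self.count += 1
        if self.count == 1:
            self.mean = x  # first sample only initialises the baseline
            return False
        deviation = x - self.mean
        std = math.sqrt(self.var)
        is_anomaly = (
            self.count > self.warmup
            and std > 0
            and abs(deviation) > self.threshold * std
        )
        if not is_anomaly:
            # Learn only from samples considered normal, so an outlier
            # does not inflate the baseline it is judged against.
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return is_anomaly


# Usage: a steady metric with one spike at index 7.
samples: List[float] = [10, 11, 9, 10, 12, 10, 11, 50, 10]
detector = EwmaDetector()
flags = [detector.update(s) for s in samples]
anomalous_indices = [i for i, flagged in enumerate(flags) if flagged]
```

The detector keeps constant state per metric, which is what makes it cheap at scale. The trade-off of learning only from normal samples is that a legitimate, persistent level shift (for example after a virtual machine migration) would keep being flagged until the baseline is reset, which is the kind of elasticity problem the abstract describes.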

    Digitalisation For Sustainable Infrastructure: The Road Ahead

    In today’s tumultuous and fast-changing times, digitalisation and technology are game changers in a wide range of sectors and have a tremendous impact on infrastructure. Roads, railways, electricity grids, aviation, and maritime transport are deeply affected by the digital and technological transition, with gains in terms of competitiveness, cost-reduction, and safety. Digitalisation is also a key tool for fostering global commitment towards sustainability, but the race for digital infrastructure is also a geopolitical one. As the world’s largest economies are starting to adopt competitive strategies, a level playing field appears far from being agreed upon. Why are digitalisation and technology the core domains of global geopolitical competition? How are they changing the way infrastructure is built, operated, and maintained? To what extent will road, rail, air, and maritime transport change by virtue of digitalisation, artificial intelligence, and the Internet of Things? How to enhance cyber protection for critical infrastructure? What are the EU’s, US’ and China’s digital strategies?