4,903 research outputs found
A Survey on Forensics and Compliance Auditing for Critical Infrastructure Protection
The broadening dependency and reliance that modern societies have on essential services
provided by Critical Infrastructures is increasing the relevance of their trustworthiness. However, Critical
Infrastructures are attractive targets for cyberattacks, due to the potential for considerable impact, not just
at the economic level but also in terms of physical damage and even loss of human life. Complementing
traditional security mechanisms, forensics and compliance audit processes play an important role in ensuring
Critical Infrastructure trustworthiness. Compliance auditing contributes to checking if security measures are
in place and compliant with standards and internal policies. Forensics assist the investigation of past security
incidents. Since these two areas significantly overlap, in terms of data sources, tools and techniques, they can
be merged into unified Forensics and Compliance Auditing (FCA) frameworks. In this paper, we survey the
latest developments, methodologies, challenges, and solutions addressing forensics and compliance auditing
in the scope of Critical Infrastructure Protection. This survey focuses on relevant contributions, capable of
tackling the requirements imposed by massively distributed and complex Industrial Automation and Control
Systems, in terms of handling large volumes of heterogeneous data (that can be noisy, ambiguous, and
redundant) for analytic purposes, with adequate performance and reliability. The achieved results produced
a taxonomy in the field of FCA whose key categories denote the relevant topics in the literature. Also, the
collected knowledge resulted in the establishment of a reference FCA architecture, proposed as a generic
template for a converged platform. These results are intended to guide future research on forensics and
compliance auditing for Critical Infrastructure Protection.info:eu-repo/semantics/publishedVersio
Deep generative models for network data synthesis and monitoring
Measurement and monitoring are fundamental tasks in all networks, enabling the down-stream management and optimization of the network.
Although networks inherently
have abundant amounts of monitoring data, its access and effective measurement is
another story. The challenges exist in many aspects. First, the inaccessibility of network monitoring data for external users, and it is hard to provide a high-fidelity dataset
without leaking commercial sensitive information. Second, it could be very expensive
to carry out effective data collection to cover a large-scale network system, considering the size of network growing, i.e., cell number of radio network and the number of
flows in the Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the available resources
in the network element that can be applied to support the measurement function are
too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex
structure. Various emerging optimization-based solutions (e.g., compressive sensing)
or data-driven solutions (e.g. deep learning) have been proposed for the aforementioned challenges. However, the fidelity and efficiency of existing methods cannot yet
meet the current network requirements.
The contributions made in this thesis significantly advance the state of the art in
the domain of network measurement and monitoring techniques. Overall, we leverage
cutting-edge machine learning technology, deep generative modeling, throughout the
entire thesis. First, we design and realize APPSHOT , an efficient city-scale network
traffic sharing with a conditional generative model, which only requires open-source
contextual data during inference (e.g., land use information and population distribution). Second, we develop an efficient drive testing system — GENDT, based on generative model, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we
design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time
network telemetry system with latent GANs and spectral-temporal networks. Finally,
we propose SPOTLIGHT , an accurate, explainable, and efficient anomaly detection system of the Open RAN (Radio Access Network) system. The lessons learned through
this research are summarized, and interesting topics are discussed for future work in
this domain. All proposed solutions have been evaluated with real-world datasets and
applied to support different applications in real systems
Authentication enhancement in command and control networks: (a study in Vehicular Ad-Hoc Networks)
Intelligent transportation systems contribute to improved traffic safety by facilitating real time communication between vehicles. By using wireless channels for communication, vehicular networks are susceptible to a wide range of attacks, such as impersonation, modification, and replay. In this context, securing data exchange between intercommunicating terminals, e.g., vehicle-to-everything (V2X) communication, constitutes a technological challenge that needs to be addressed. Hence, message authentication is crucial to safeguard vehicular ad-hoc networks (VANETs) from malicious attacks. The current state-of-the-art for authentication in VANETs relies on conventional cryptographic primitives, introducing significant computation and communication overheads. In this challenging scenario, physical (PHY)-layer authentication has gained popularity, which involves leveraging the inherent characteristics of wireless channels and the hardware imperfections to discriminate between wireless devices. However, PHY-layerbased authentication cannot be an alternative to crypto-based methods as the initial legitimacy detection must be conducted using cryptographic methods to extract the communicating terminal secret features. Nevertheless, it can be a promising complementary solution for the reauthentication problem in VANETs, introducing what is known as “cross-layer authentication.” This thesis focuses on designing efficient cross-layer authentication schemes for VANETs, reducing the communication and computation overheads associated with transmitting and verifying a crypto-based signature for each transmission. The following provides an overview of the proposed methodologies employed in various contributions presented in this thesis.
1. The first cross-layer authentication scheme: A four-step process represents this approach: initial crypto-based authentication, shared key extraction, re-authentication via a PHY challenge-response algorithm, and adaptive adjustments based on channel conditions. Simulation results validate its efficacy, especially in low signal-to-noise ratio (SNR) scenarios while proving its resilience against active and passive attacks.
2. The second cross-layer authentication scheme: Leveraging the spatially and temporally correlated wireless channel features, this scheme extracts high entropy shared keys that can be used to create dynamic PHY-layer signatures for authentication. A 3-Dimensional (3D) scattering Doppler emulator is designed to investigate the scheme’s performance at different speeds of a moving vehicle and SNRs. Theoretical and hardware implementation analyses prove the scheme’s capability to support high detection probability for an acceptable false alarm value ≤ 0.1 at SNR ≥ 0 dB and speed ≤ 45 m/s.
3. The third proposal: Reconfigurable intelligent surfaces (RIS) integration for improved authentication: Focusing on enhancing PHY-layer re-authentication, this proposal explores integrating RIS technology to improve SNR directed at designated vehicles. Theoretical analysis and practical implementation of the proposed scheme are conducted using a 1-bit RIS, consisting of 64 × 64 reflective units. Experimental results show a significant improvement in the Pd, increasing from 0.82 to 0.96 at SNR = − 6 dB for multicarrier communications.
4. The fourth proposal: RIS-enhanced vehicular communication security: Tailored for challenging SNR in non-line-of-sight (NLoS) scenarios, this proposal optimises key extraction and defends against denial-of-service (DoS) attacks through selective signal strengthening. Hardware implementation studies prove its effectiveness, showcasing improved key extraction performance and resilience against potential threats.
5. The fifth cross-layer authentication scheme: Integrating PKI-based initial legitimacy detection and blockchain-based reconciliation techniques, this scheme ensures secure data exchange. Rigorous security analyses and performance evaluations using network simulators and computation metrics showcase its effectiveness, ensuring its resistance against common attacks and time efficiency in message verification.
6. The final proposal: Group key distribution: Employing smart contract-based blockchain technology alongside PKI-based authentication, this proposal distributes group session keys securely. Its lightweight symmetric key cryptography-based method maintains privacy in VANETs, validated via Ethereum’s main network (MainNet) and comprehensive computation and communication evaluations.
The analysis shows that the proposed methods yield a noteworthy reduction, approximately ranging from 70% to 99%, in both computation and communication overheads, as compared to the conventional approaches. This reduction pertains to the verification and transmission of 1000 messages in total
Monitoring tools for DevOps and microservices: A systematic grey literature review
Microservice-based systems are usually developed according to agile practices like DevOps, which enables rapid and frequent releases to promptly react and adapt to changes. Monitoring is a key enabler for these systems, as they allow to continuously get feedback from the field and support timely and tailored decisions for a quality-driven evolution. In the realm of monitoring tools available for microservices in the DevOps-driven development practice, each with different features, assumptions, and performance, selecting a suitable tool is an as much difficult as impactful task.
This article presents the results of a systematic study of the grey literature we performed to identify, classify and analyze the available monitoring tools for DevOps and microservices. We selected and examined a list of 71 monitoring tools, drawing a map of their characteristics, limitations, assumptions, and open challenges, meant to be useful to both researchers and practitioners working in this area. Results are publicly available and replicable
Online semi-supervised learning in non-stationary environments
Existing Data Stream Mining (DSM) algorithms assume the availability of labelled and
balanced data, immediately or after some delay, to extract worthwhile knowledge from the
continuous and rapid data streams. However, in many real-world applications such as
Robotics, Weather Monitoring, Fraud Detection Systems, Cyber Security, and Computer
Network Traffic Flow, an enormous amount of high-speed data is generated by Internet of
Things sensors and real-time data on the Internet. Manual labelling of these data streams
is not practical due to time consumption and the need for domain expertise. Another
challenge is learning under Non-Stationary Environments (NSEs), which occurs due to
changes in the data distributions in a set of input variables and/or class labels. The problem
of Extreme Verification Latency (EVL) under NSEs is referred to as Initially Labelled Non-Stationary Environment (ILNSE). This is a challenging task because the learning algorithms
have no access to the true class labels directly when the concept evolves. Several approaches
exist that deal with NSE and EVL in isolation. However, few algorithms address both issues
simultaneously. This research directly responds to ILNSE’s challenge in proposing two
novel algorithms “Predictor for Streaming Data with Scarce Labels” (PSDSL) and
Heterogeneous Dynamic Weighted Majority (HDWM) classifier. PSDSL is an Online Semi-Supervised Learning (OSSL) method for real-time DSM and is closely related to label
scarcity issues in online machine learning.
The key capabilities of PSDSL include learning from a small amount of labelled data in an
incremental or online manner and being available to predict at any time. To achieve this,
PSDSL utilises both labelled and unlabelled data to train the prediction models, meaning it
continuously learns from incoming data and updates the model as new labelled or
unlabelled data becomes available over time. Furthermore, it can predict under NSE
conditions under the scarcity of class labels. PSDSL is built on top of the HDWM classifier,
which preserves the diversity of the classifiers. PSDSL and HDWM can intelligently switch
and adapt to the conditions. The PSDSL adapts to learning states between self-learning,
micro-clustering and CGC, whichever approach is beneficial, based on the characteristics of
the data stream. HDWM makes use of “seed” learners of different types in an ensemble to
maintain its diversity. The ensembles are simply the combination of predictive models
grouped to improve the predictive performance of a single classifier.
PSDSL is empirically evaluated against COMPOSE, LEVELIW, SCARGC and MClassification
on benchmarks, NSE datasets as well as Massive Online Analysis (MOA) data streams and real-world datasets. The results showed that PSDSL performed significantly better than
existing approaches on most real-time data streams including randomised data instances.
PSDSL performed significantly better than ‘Static’ i.e. the classifier is not updated after it is
trained with the first examples in the data streams. When applied to MOA-generated data
streams, PSDSL ranked highest (1.5) and thus performed significantly better than SCARGC,
while SCARGC performed the same as the Static. PSDSL achieved better average prediction
accuracies in a short time than SCARGC.
The HDWM algorithm is evaluated on artificial and real-world data streams against existing
well-known approaches such as the heterogeneous WMA and the homogeneous Dynamic
DWM algorithm. The results showed that HDWM performed significantly better than WMA
and DWM. Also, when recurring concept drifts were present, the predictive performance of
HDWM showed an improvement over DWM. In both drift and real-world streams,
significance tests and post hoc comparisons found significant differences between
algorithms, HDWM performed significantly better than DWM and WMA when applied to
MOA data streams and 4 real-world datasets Electric, Spam, Sensor and Forest cover. The
seeding mechanism and dynamic inclusion of new base learners in the HDWM algorithms
benefit from the use of both forgetting and retaining the models. The algorithm also
provides the independence of selecting the optimal base classifier in its ensemble depending
on the problem.
A new approach, Envelope-Clustering is introduced to resolve the cluster overlap conflicts
during the cluster labelling process. In this process, PSDSL transforms the centroids’
information of micro-clusters into micro-instances and generates new clusters called
Envelopes. The nearest envelope clusters assist the conflicted micro-clusters and
successfully guide the cluster labelling process after the concept drifts in the absence of true
class labels. PSDSL has been evaluated on real-world problem ‘keystroke dynamics’, and
the results show that PSDSL achieved higher prediction accuracy (85.3%) and SCARGC
(81.6%), while the Static (49.0%) significantly degrades the performance due to changes in
the users typing pattern. Furthermore, the predictive accuracies of SCARGC are found
highly fluctuated between (41.1% to 81.6%) based on different values of parameter ‘k’
(number of clusters), while PSDSL automatically determine the best values for this
parameter
AI Lifecycle Zero-Touch Orchestration within the Edge-to-Cloud Continuum for Industry 5.0
The advancements in human-centered artificial intelligence (HCAI) systems for Industry 5.0 is a new phase of industrialization that places the worker at the center of the production process and uses new technologies to increase prosperity beyond jobs and growth. HCAI presents new objectives that were unreachable by either humans or machines alone, but this also comes with a new set of challenges. Our proposed method accomplishes this through the knowlEdge architecture, which enables human operators to implement AI solutions using a zero-touch framework. It relies on containerized AI model training and execution, supported by a robust data pipeline and rounded off with human feedback and evaluation interfaces. The result is a platform built from a number of components, spanning all major areas of the AI lifecycle. We outline both the architectural concepts and implementation guidelines and explain how they advance HCAI systems and Industry 5.0. In this article, we address the problems we encountered while implementing the ideas within the edge-to-cloud continuum. Further improvements to our approach may enhance the use of AI in Industry 5.0 and strengthen trust in AI systems
Recommended from our members
Methane Production by Methanogens In Simulated Subsurface Martian Environments
Methane has a typical atmospheric photochemical lifetime of ∼300 years on Mars, making contemporary reported detections (and non-detections) of methane a fiercely debated topic, due to the potential need for a present-day source. On Earth, most methane is produced by methanogenic microbes present in, e.g., ruminants, wetlands, lakes and permafrost. Of the four metabolic pathways on Earth, the hydrogenotrophic pathway is the most common, utilising CO2 and H2 as substrates. Both gases are present on Mars, plus liquid water and essential elements (CHNOPS) that are requirements for life, and organics. Surface conditions on Mars are sterilising, however, the temperature and pressure of the subsurface are potentially favourable to life and provide a shield to sterilising surface conditions, and are thus a possible habitat for methanogens. A meta-analysis was
conducted, motivated by these subsurface parameters, that redefined the statistical representation for several growth parameters for all type-strains of methanogens and analysed multiple parameters
simultaneously across multiple categories (e.g. metabolism), showing that the optimal average conditions in which to grow methanogens would be a meso-temperate (20 to 39◦C), hypersaline and slightly acidic environment. Two martian subsurface environments were simulated to determine whether environmental or chemical factors are inhibitory to methanogenesis. (1) Methanoculleus marisnigri was grown in a custom-built, high-pressure manifold at 60 bar and 25◦C to simulate the subsurface of Mars, although no methane was produced, due to a technical issue resulting in oxygenated medium. However, some cells survived five weeks of oxygenation. (2) Methanothermococcus okinawensis was grown in a simulated chemical environment at 1 bar and 60◦C, that included a regolith simulant, a proposed martian brine and a Mars-relevant organic source (carbonaceous chondrite). The simulated parameters of the chemical environment of subsurface Mars were not inhibitory to hydrogenotrophic methanogenesis, suggesting it is feasible (from a metabolic perspective) that subsurface methanogens could be producing contemporary methane on Mars
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
ARGUS: Context-Based Detection of Stealthy IoT Infiltration Attacks
IoT application domains, device diversity and connectivity are rapidly
growing. IoT devices control various functions in smart homes and buildings,
smart cities, and smart factories, making these devices an attractive target
for attackers. On the other hand, the large variability of different
application scenarios and inherent heterogeneity of devices make it very
challenging to reliably detect abnormal IoT device behaviors and distinguish
these from benign behaviors. Existing approaches for detecting attacks are
mostly limited to attacks directly compromising individual IoT devices, or,
require predefined detection policies. They cannot detect attacks that utilize
the control plane of the IoT system to trigger actions in an
unintended/malicious context, e.g., opening a smart lock while the smart home
residents are absent.
In this paper, we tackle this problem and propose ARGUS, the first
self-learning intrusion detection system for detecting contextual attacks on
IoT environments, in which the attacker maliciously invokes IoT device actions
to reach its goals. ARGUS monitors the contextual setting based on the state
and actions of IoT devices in the environment. An unsupervised Deep Neural
Network (DNN) is used for modeling the typical contextual device behavior and
detecting actions taking place in abnormal contextual settings. This
unsupervised approach ensures that ARGUS is not restricted to detecting
previously known attacks but is also able to detect new attacks. We evaluated
ARGUS on heterogeneous real-world smart-home settings and achieve at least an
F1-Score of 99.64% for each setup, with a false positive rate (FPR) of at most
0.03%.Comment: To appear in the 32nd USENIX Security Symposium, August 2022, Anaheim
CA, US
The Application of Data Analytics Technologies for the Predictive Maintenance of Industrial Facilities in Internet of Things (IoT) Environments
In industrial production environments, the maintenance of equipment has a decisive influence on costs and on the plannability of production capacities. In particular, unplanned failures during production times cause high costs, unplanned downtimes and possibly additional collateral damage. Predictive Maintenance starts here and tries to predict a possible failure and its cause so early that its prevention can be prepared and carried out in time. In order to be able to predict malfunctions and failures, the industrial plant with its characteristics, as well as wear and ageing processes, must be modelled. Such modelling can be done by replicating its physical properties. However, this is very complex and requires enormous expert knowledge about the plant and about wear and ageing processes of each individual component. Neural networks and machine learning make it possible to train such models using data and offer an alternative, especially when very complex and non-linear behaviour is evident.
In order for models to make predictions, as much data as possible about the condition of a plant and its environment and production planning data is needed. In Industrial Internet of Things (IIoT) environments, the amount of available data is constantly increasing. Intelligent sensors and highly interconnected production facilities produce a steady stream of data. The sheer volume of data, but also the steady stream in which data is transmitted, place high demands on the data processing systems. If a participating system wants to perform live analyses on the incoming data streams, it must be able to process the incoming data at least as fast as the continuous data stream delivers it. If this is not the case, the system falls further and further behind in processing and thus in its analyses. This also applies to Predictive Maintenance systems, especially if they use complex and computationally intensive machine learning models. If sufficiently scalable hardware resources are available, this may not be a problem at first. However, if this is not the case or if the processing takes place on decentralised units with limited hardware resources (e.g. edge devices), the runtime behaviour and resource requirements of the type of neural network used can become an important criterion.
This thesis addresses Predictive Maintenance systems in IIoT environments using neural networks and Deep Learning, where the runtime behaviour and the resource requirements are relevant. The question is whether it is possible to achieve better runtimes with similarly result quality using a new type of neural network. The focus is on reducing the complexity of the network and improving its parallelisability. Inspired by projects in which complexity was distributed to less complex neural subnetworks by upstream measures, two hypotheses presented in this thesis emerged: a) the distribution of complexity into simpler subnetworks leads to faster processing overall, despite the overhead this creates, and b) if a neural cell has a deeper internal structure, this leads to a less complex network. Within the framework of a qualitative study, an overall impression of Predictive Maintenance applications in IIoT environments using neural networks was developed. Based on the findings, a novel model layout was developed named Sliced Long Short-Term Memory Neural Network (SlicedLSTM). The SlicedLSTM implements the assumptions made in the aforementioned hypotheses in its inner model architecture.
Within the framework of a quantitative study, the runtime behaviour of the SlicedLSTM was compared with that of a reference model in the form of laboratory tests. The study uses synthetically generated data from a NASA project to predict failures of modules of aircraft gas turbines. The dataset contains 1,414 multivariate time series with 104,897 samples of test data and 160,360 samples of training data.
As a result, it could be proven for the specific application and the data used that the SlicedLSTM delivers faster processing times with similar result accuracy and thus clearly outperforms the reference model in this respect. The hypotheses about the influence of complexity in the internal structure of the neuronal cells were confirmed by the study carried out in the context of this thesis
- …