7,057 research outputs found

    A Survey on Forensics and Compliance Auditing for Critical Infrastructure Protection

    Get PDF
    The broadening dependency and reliance that modern societies have on essential services provided by Critical Infrastructures is increasing the relevance of their trustworthiness. However, Critical Infrastructures are attractive targets for cyberattacks, due to the potential for considerable impact, not just at the economic level but also in terms of physical damage and even loss of human life. Complementing traditional security mechanisms, forensics and compliance audit processes play an important role in ensuring Critical Infrastructure trustworthiness. Compliance auditing contributes to checking if security measures are in place and compliant with standards and internal policies. Forensics assist the investigation of past security incidents. Since these two areas significantly overlap, in terms of data sources, tools and techniques, they can be merged into unified Forensics and Compliance Auditing (FCA) frameworks. In this paper, we survey the latest developments, methodologies, challenges, and solutions addressing forensics and compliance auditing in the scope of Critical Infrastructure Protection. This survey focuses on relevant contributions, capable of tackling the requirements imposed by massively distributed and complex Industrial Automation and Control Systems, in terms of handling large volumes of heterogeneous data (that can be noisy, ambiguous, and redundant) for analytic purposes, with adequate performance and reliability. The achieved results produced a taxonomy in the field of FCA whose key categories denote the relevant topics in the literature. Also, the collected knowledge resulted in the establishment of a reference FCA architecture, proposed as a generic template for a converged platform. These results are intended to guide future research on forensics and compliance auditing for Critical Infrastructure Protection.info:eu-repo/semantics/publishedVersio

    Single-cell time-series analysis of metabolic rhythms in yeast

    Get PDF
    The yeast metabolic cycle (YMC) is a biological rhythm in budding yeast (Saccharomyces cerevisiae). It entails oscillations in the concentrations and redox states of intracellular metabolites, oscillations in transcript levels, temporal partitioning of biosynthesis, and, in chemostats, oscillations in oxygen consumption. Most studies on the YMC have been based on chemostat experiments, and it is unclear whether YMCs arise from interactions between cells or are generated independently by each cell. This thesis aims at characterising the YMC in single cells and its response to nutrient and genetic perturbations. Specifically, I use microfluidics to trap and separate yeast cells, then record the time-dependent intensity of flavin autofluorescence, which is a component of the YMC. Single-cell microfluidics produces a large amount of time series data. Noisy and short time series produced from biological experiments restrict the computational tools that are useful for analysis. I developed a method to filter time series, a machine learning model to classify whether time series are oscillatory, and an autocorrelation method to examine the periodicity of time series data. My experimental results show that yeast cells show oscillations in the fluorescence of flavins. Specifically, I show that in high glucose conditions, cells generate flavin oscillations asynchronously within a population, and these flavin oscillations couple with the cell division cycle. I show that cells can individually reset the phase of their flavin oscillations in response to abrupt nutrient changes, independently of the cell division cycle. I also show that deletion strains generate flavin oscillations that exhibit different behaviour from dissolved oxygen oscillations from chemostat conditions. Finally, I use flux balance analysis to address whether proteomic constraints in cellular metabolism mean that temporal partitioning of biosynthesis is advantageous for the yeast cell, and whether such partitioning explains the timing of the metabolic cycle. My results show that under proteomic constraints, it is advantageous for the cell to sequentially synthesise biomass components because doing so shortens the timescale of biomass synthesis. However, the degree of advantage of sequential over parallel biosynthesis is lower when both carbon and nitrogen sources are limiting. This thesis thus confirms autonomous generation of flavin oscillations, and suggests a model in which the YMC responds to nutrient conditions and subsequently entrains the cell division cycle. It also emphasises the possibility that subpopulations in the culture explain chemostat-based observations of the YMC. Furthermore, this thesis paves the way for using computational methods to analyse large datasets of oscillatory time series, which is useful for various fields of study beyond the YMC

    Multidisciplinary perspectives on Artificial Intelligence and the law

    Get PDF
    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.info:eu-repo/semantics/publishedVersio

    Fairness-aware Machine Learning in Educational Data Mining

    Get PDF
    Fairness is an essential requirement of every educational system, which is reflected in a variety of educational activities. With the extensive use of Artificial Intelligence (AI) and Machine Learning (ML) techniques in education, researchers and educators can analyze educational (big) data and propose new (technical) methods in order to support teachers, students, or administrators of (online) learning systems in the organization of teaching and learning. Educational data mining (EDM) is the result of the application and development of data mining (DM), and ML techniques to deal with educational problems, such as student performance prediction and student grouping. However, ML-based decisions in education can be based on protected attributes, such as race or gender, leading to discrimination of individual students or subgroups of students. Therefore, ensuring fairness in ML models also contributes to equity in educational systems. On the other hand, bias can also appear in the data obtained from learning environments. Hence, bias-aware exploratory educational data analysis is important to support unbiased decision-making in EDM. In this thesis, we address the aforementioned issues and propose methods that mitigate discriminatory outcomes of ML algorithms in EDM tasks. Specifically, we make the following contributions: We perform bias-aware exploratory analysis of educational datasets using Bayesian networks to identify the relationships among attributes in order to understand bias in the datasets. We focus the exploratory data analysis on features having a direct or indirect relationship with the protected attributes w.r.t. prediction outcomes. We perform a comprehensive evaluation of the sufficiency of various group fairness measures in predictive models for student performance prediction problems. A variety of experiments on various educational datasets with different fairness measures are performed to provide users with a broad view of unfairness from diverse aspects. We deal with the student grouping problem in collaborative learning. We introduce the fair-capacitated clustering problem that takes into account cluster fairness and cluster cardinalities. We propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain fair-capacitated clustering. We introduce the multi-fair capacitated (MFC) students-topics grouping problem that satisfies students' preferences while ensuring balanced group cardinalities and maximizing the diversity of members regarding the protected attribute. We propose three approaches: a greedy heuristic approach, a knapsack-based approach using vanilla maximal 0-1 knapsack formulation, and an MFC knapsack approach based on group fairness knapsack formulation. In short, the findings described in this thesis demonstrate the importance of fairness-aware ML in educational settings. We show that bias-aware data analysis, fairness measures, and fairness-aware ML models are essential aspects to ensure fairness in EDM and the educational environment.Ministry of Science and Culture of Lower Saxony/LernMINT/51410078/E

    Natural and Technological Hazards in Urban Areas

    Get PDF
    Natural hazard events and technological accidents are separate causes of environmental impacts. Natural hazards are physical phenomena active in geological times, whereas technological hazards result from actions or facilities created by humans. In our time, combined natural and man-made hazards have been induced. Overpopulation and urban development in areas prone to natural hazards increase the impact of natural disasters worldwide. Additionally, urban areas are frequently characterized by intense industrial activity and rapid, poorly planned growth that threatens the environment and degrades the quality of life. Therefore, proper urban planning is crucial to minimize fatalities and reduce the environmental and economic impacts that accompany both natural and technological hazardous events

    Linking Datasets on Organizations Using Half A Billion Open Collaborated Records

    Full text link
    Scholars studying organizations often work with multiple datasets lacking shared unique identifiers or covariates. In such situations, researchers may turn to approximate string matching methods to combine datasets. String matching, although useful, faces fundamental challenges. Even when two strings appear similar to humans, fuzzy matching often does not work because it fails to adapt to the informativeness of the character combinations presented. Worse, many entities have multiple names that are dissimilar (e.g., "Fannie Mae" and "Federal National Mortgage Association"), a case where string matching has little hope of succeeding. This paper introduces data from a prominent employment-related networking site (LinkedIn) as a tool to address these problems. We propose interconnected approaches to leveraging the massive amount of information from LinkedIn regarding organizational name-to-name links. The first approach builds a machine learning model for predicting matches from character strings, treating the trillions of user-contributed organizational name pairs as a training corpus: this approach constructs a string matching metric that explicitly maximizes match probabilities. A second approach identifies relationships between organization names using network representations of the LinkedIn data. A third approach combines the first and second. We document substantial improvements over fuzzy matching in applications, making all methods accessible in open-source software ("LinkOrgs")

    A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges

    Full text link
    Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation, duplicate code, plagiarism, malware, and smell detection. This paper proposes a systematic literature review and meta-analysis on code similarity measurement and evaluation techniques to shed light on the existing approaches and their characteristics in different applications. We initially found over 10000 articles by querying four digital libraries and ended up with 136 primary studies in the field. The studies were classified according to their methodology, programming languages, datasets, tools, and applications. A deep investigation reveals 80 software tools, working with eight different techniques on five application domains. Nearly 49% of the tools work on Java programs and 37% support C and C++, while there is no support for many programming languages. A noteworthy point was the existence of 12 datasets related to source code similarity measurement and duplicate codes, of which only eight datasets were publicly accessible. The lack of reliable datasets, empirical evaluations, hybrid methods, and focuses on multi-paradigm languages are the main challenges in the field. Emerging applications of code similarity measurement concentrate on the development phase in addition to the maintenance.Comment: 49 pages, 10 figures, 6 table

    Technology for Low Resolution Space Based RSO Detection and Characterisation

    Get PDF
    Space Situational Awareness (SSA) refers to all activities to detect, identify and track objects in Earth orbit. SSA is critical to all current and future space activities and protect space assets by providing access control, conjunction warnings, and monitoring status of active satellites. Currently SSA methods and infrastructure are not sufficient to account for the proliferations of space debris. In response to the need for better SSA there has been many different areas of research looking to improve SSA most of the requiring dedicated ground or space-based infrastructure. In this thesis, a novel approach for the characterisation of RSO’s (Resident Space Objects) from passive low-resolution space-based sensors is presented with all the background work performed to enable this novel method. Low resolution space-based sensors are common on current satellites, with many of these sensors being in space using them passively to detect RSO’s can greatly augment SSA with out expensive infrastructure or long lead times. One of the largest hurtles to overcome with research in the area has to do with the lack of publicly available labelled data to test and confirm results with. To overcome this hurtle a simulation software, ORBITALS, was created. To verify and validate the ORBITALS simulator it was compared with the Fast Auroral Imager images, which is one of the only publicly available low-resolution space-based images found with auxiliary data. During the development of the ORBITALS simulator it was found that the generation of these simulated images are computationally intensive when propagating the entire space catalog. To overcome this an upgrade of the currently used propagation method, Specialised General Perturbation Method 4th order (SGP4), was performed to allow the algorithm to run in parallel reducing the computational time required to propagate entire catalogs of RSO’s. From the results it was found that the standard facet model with a particle swarm optimisation performed the best estimating an RSO’s attitude with a 0.66 degree RMSE accuracy across a sequence, and ~1% MAPE accuracy for the optical properties. This accomplished this thesis goal of demonstrating the feasibility of low-resolution passive RSO characterisation from space-based platforms in a simulated environment

    Fault diagnosis in aircraft fuel system components with machine learning algorithms

    Get PDF
    There is a high demand and interest in considering the social and environmental effects of the component’s lifespan. Aircraft are one of the most high-priced businesses that require the highest reliability and safety constraints. The complexity of aircraft systems designs also has advanced rapidly in the last decade. Consequently, fault detection, diagnosis and modification/ repair procedures are becoming more challenging. The presence of a fault within an aircraft system can result in changes to system performances and cause operational downtime or accidents in a worst-case scenario. The CBM method that predicts the state of the equipment based on data collected is widely used in aircraft MROs. CBM uses diagnostics and prognostics models to make decisions on appropriate maintenance actions based on the Remaining Useful Life (RUL) of the components. The aircraft fuel system is a crucial system of aircraft, even a minor failure in the fuel system can affect the aircraft's safety greatly. A failure in the fuel system that impacts the ability to deliver fuel to the engine will have an immediate effect on system performance and safety. There are very few diagnostic systems that monitor the health of the fuel system and even fewer that can contain detected faults. The fuel system is crucial for the operation of the aircraft, in case of failure, the fuel in the aircraft will become unusable/unavailable to reach the destination. It is necessary to develop fault detection of the aircraft fuel system. The future aircraft fuel system must have the function of fault detection. Through the information of sensors and Machine Learning Techniques, the aircraft fuel system’s fault type can be detected in a timely manner. This thesis discusses the application of a Data-driven technique to analyse the healthy and faulty data collected using the aircraft fuel system model, which is similar to Boeing-777. The data is collected is processed through Machine learning Techniques and the results are comparedPhD in Manufacturin
    • …
    corecore