    Detector Design Considerations in High-Dimensional Artificial Immune Systems

    This research lays the groundwork for a network intrusion detection system that can operate with knowledge of normal network traffic only, using a process known as anomaly detection. Real-valued negative selection (RNS) is a specific anomaly detection algorithm that can be used to perform two-class classification when only one class is available for training. Researchers have shown fundamental problems with the most common detector shape, the hypersphere, in high-dimensional space. The research contained herein shows that the second most common detector shape, the hypercube, can also cause problems because it biases certain features in high dimensions. To address these problems, a new detector shape, the hypersteinmetz solid, is proposed, with the goal of providing a tradeoff between the problems plaguing hyperspheres and hypercubes. To investigate the potential benefits of the hypersteinmetz solid, an effective RNS detector size range is first determined. Then, the relationship between content coverage of a dataset and classification accuracy is investigated. Subsequently, this research shows the tradeoffs that take place in high-dimensional data when hypersteinmetzes are chosen over hyperspheres or hypercubes. The experimental results show that detector shape is the dominant factor in classification accuracy in high-dimensional RNS.
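
    The effect of detector shape described above can be illustrated with a minimal Python sketch. It assumes that a detector is a fixed-radius ball under an L_p norm, so that p = 2 gives a hypersphere and p = infinity gives a hypercube; the function names, the unit-cube feature space, and the single fixed radius are illustrative assumptions and are not taken from the thesis. The hypersteinmetz solid is not modelled here.

```python
import math
import random


def minkowski(a, b, p):
    """Distance between a and b under the L_p norm (p = math.inf is Chebyshev)."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def train_detectors(self_points, n_detectors, radius, p, dim, rng,
                    max_tries=100_000):
    """Negative selection: keep random candidates that match no self sample."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        # Candidate detector centre drawn uniformly from the unit hypercube.
        candidate = [rng.random() for _ in range(dim)]
        # Discard the candidate if any normal (self) sample falls inside it.
        if all(minkowski(candidate, s, p) > radius for s in self_points):
            detectors.append(candidate)
    return detectors


def is_anomalous(x, detectors, radius, p):
    """A sample is flagged as non-self if it lies inside at least one detector."""
    return any(minkowski(x, d, p) <= radius for d in detectors)
```

    For a detector at the centre of a 16-dimensional unit cube and a probe offset by 0.3 in every coordinate, the Chebyshev distance is 0.3 while the Euclidean distance is 0.3 x sqrt(16) = 1.2. A hypercube detector of half-width 0.5 therefore matches the probe, whereas a hypersphere of radius 0.5 does not.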

    Kernel Extended Real-Valued Negative Selection Algorithm (KERNSA)

    Artificial Immune Systems (AISs) are a type of statistical Machine Learning (ML) algorithm based on the Biological Immune System (BIS) and applied to classification problems. Inspired by the increased performance of other ML algorithms when combined with kernel methods, this research explores using kernel methods as the distance measure for a specific AIS algorithm, the Real-valued Negative Selection Algorithm (RNSA). This research also demonstrates that the hard binary decision of the traditional RNSA can be relaxed to a continuous output, while maintaining the ability to map back to the original RNSA decision boundary if necessary. Continuous output is used in this research to generate Receiver Operating Characteristic (ROC) curves and calculate Area Under the Curve (AUC) values, but it can also be used as a basis for classification confidence or probability. The resulting Kernel Extended Real-valued Negative Selection Algorithm (KERNSA) offers performance improvements over a comparable RNSA implementation. Using the Sigmoid kernel in KERNSA seems particularly well suited (in terms of performance) to four out of the eighteen domains tested.
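
    The two ideas in this abstract, a kernel used as the distance measure and a binary decision relaxed to a continuous score, can be sketched in Python as follows. The sketch assumes the standard kernel-induced distance d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y) and a score defined as the largest margin by which a sample falls inside any detector. Both choices, along with all function names, are assumptions for illustration; the abstract does not state how KERNSA defines them. An RBF kernel is used because the Sigmoid kernel is not positive semi-definite in general, which can make the squared distance negative.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an illustrative default."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def kernel_distance(x, y, kernel):
    """Distance induced by a kernel in its implicit feature space."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    # Clamp tiny negative values caused by rounding (or by indefinite kernels).
    return math.sqrt(max(d2, 0.0))


def anomaly_score(x, detectors, radius, kernel):
    """Continuous output: largest margin by which x lies inside any detector."""
    return max(radius - kernel_distance(x, d, kernel) for d in detectors)


def classify(x, detectors, radius, kernel):
    """Thresholding the score at zero recovers the hard inside/outside decision."""
    return anomaly_score(x, detectors, radius, kernel) >= 0.0
```

    Sweeping the threshold over the score, instead of fixing it at zero, is one way such a continuous output could be turned into an ROC curve.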

    D-AREdevil: a novel approach for discovering disease-associated rare cell populations in mass cytometry data

    Background: Advances in single-cell technologies such as mass cytometry provide increasing resolution of the complexity of cellular samples, allowing researchers to investigate cellular heterogeneity in greater depth and possibly to detect previously undetectable rare cell populations. The identification of rare cell populations is of paramount importance for understanding the onset, progression and pathogenesis of many diseases. However, their identification remains challenging because of the ever-increasing dimensionality and throughput of the data generated. Aim: This study aimed to implement a straightforward approach that efficiently supports a data analyst in identifying disease-associated rare cell populations in large and complex biological samples, within reasonable limits of time and computational infrastructure. Methods: We proposed a novel computational framework called D-AREdevil (disease-associated rare cells detection) for cytometry datasets. The main characteristic of our framework is the combination of an anomaly detection algorithm (LOF or FiRE) that provides a continuous score for individual cells with one of the best-performing and fastest unsupervised clustering methods (FlowSOM). In our approach, the LOF score serves to select a set of candidate cells belonging to one or more subgroups of similar rare cell populations. We then tested these subgroups of rare cells for association with a patient group, disease type, clinical outcome or other characteristic of interest. Results: We reported the properties and implementation of D-AREdevil and presented an evaluation of its performance and applications on three different testing datasets based on mass cytometry data.
We generated data mixed with one or more known rare cell populations at varying frequencies (below 1%) and tested the ability of our approach to identify those cells in order to bring them to the attention of the data analyst. This is a key step in finding cell subgroups that are associated with a disease or outcome of interest when their existence is not previously known. Conclusions: We proposed a novel computational framework with demonstrated good sensitivity and precision in detecting target rare cell populations present at very low frequencies in the total datasets (<1%).
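
    The candidate-selection step, in which a continuous LOF score is used to pick out likely rare cells, can be sketched in Python as below. This is a brute-force Local Outlier Factor written for readability: it assumes distinct points, ignores distance ties, and uses hypothetical function names. It is not the D-AREdevil implementation, and the subsequent FlowSOM clustering and association testing are not shown.

```python
import math


def lof_scores(points, k):
    """Local Outlier Factor of every point (brute force, O(n^2) distances)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point, excluding itself.
    neighbours = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [dist[i][neighbours[i][-1]] for i in range(n)]

    def local_reachability_density(i):
        # Reachability distance from i to j is at least j's k-distance.
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        return len(reach) / sum(reach)

    lrd = [local_reachability_density(i) for i in range(n)]
    # LOF: average neighbour density relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


def select_candidates(scores, top_n):
    """Indices of the top_n highest-scoring points, most anomalous first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_n]
```

    Scores close to 1 indicate a point whose local density matches that of its neighbours, while markedly larger scores indicate isolated points. In a real analysis a library implementation would normally be preferred for speed.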

    Development and evaluation of a fault detection and identification scheme for the WVU YF-22 UAV using the artificial immune system approach

    A failure detection and identification (FDI) scheme is developed for a small remotely controlled jet aircraft based on the Artificial Immune System (AIS) paradigm. Pilot-in-the-loop flight data are used to develop and test a scheme capable of identifying known and unknown aircraft actuator and sensor failures. Negative selection is used as the main mechanism for self/non-self definition; however, an alternative approach using positive selection to enhance performance is also presented. Tested failures include aileron and stabilator locked at trim and angular rate sensor bias. Hyper-spheres are chosen to represent detectors. Different definitions of distance for the matching rules are applied and their effect on the behavior of hyper-bodies is discussed. All the steps involved in the creation of the scheme are presented, including the design selections embedded in the different algorithms applied to generate the detector set. The scheme is evaluated in terms of detection rate, false alarms, and detection time for normal conditions and upset conditions. The proposed detection scheme achieves good detection performance for all flight conditions considered. This approach shows promising potential to cope with the multidimensional characteristics of integrated/comprehensive detection for aircraft sub-system failures. A preliminary performance comparison between an AIS-based FDI scheme and a Neural Network and Floating Threshold based one is presented, including groundwork on assessing possible improvements in pilot situational awareness aided by FDI schemes. Initial results favor the AIS approach to FDI due to its rather undemanding adaptation capabilities to new environments. The presence of the FDI scheme suggests benefits for the interaction between the pilot and the upset conditions by improving the accuracy of the identification of each particular failure and decreasing the detection delays.

    An Online Adaptive Machine Learning Framework for Autonomous Fault Detection

    The increasing complexity and autonomy of modern systems, particularly in the aerospace industry, demand robust and adaptive fault detection and health management solutions. The development of a data-driven fault detection system that can adapt to varying conditions and system changes is critical to the performance, safety, and reliability of these systems. This dissertation presents a novel fault detection approach based on the integration of the artificial immune system (AIS) paradigm and Online Support Vector Machines (OSVM). Together, these algorithms create the Artificial Immune System augmented Online Support Vector Machine (AISOSVM). The AISOSVM framework combines the strengths of the AIS and OSVM to create a fault detection system that can effectively identify faults in complex systems while maintaining adaptability. The framework is designed using Model-Based Systems Engineering (MBSE) principles, employing the Capella tool and the Arcadia methodology to develop a structured, integrated approach for the design and deployment of the data-driven fault detection system. A key contribution of this research is the development of a Clonal Selection Algorithm that optimizes the OSVM hyperparameters and the V-Detector algorithm parameters, resulting in a more effective fault detection solution. The integration of the AIS in the training process enables the generation of synthetic abnormal data, mitigating the need for engineers to gather large amounts of failure data, which can be impractical. The AISOSVM also incorporates incremental learning and decremental unlearning for the Online Support Vector Machine, allowing the system to adapt online using lightweight computational processes. This capability significantly improves the efficiency of fault detection systems, eliminating the need for offline retraining and redeployment. 
Reinforcement Learning (RL) is proposed as a promising future direction for the AISOSVM, as it can help autonomously adapt the system performance in near real-time, further mitigating the need for acquiring large amounts of system data for training, and improving the efficiency of the adaptation process by intelligently selecting the best samples to learn from. The AISOSVM framework was applied to real-world scenarios and platform models, demonstrating its effectiveness and adaptability in various use cases. The combination of the AIS and OSVM, along with the online learning and RL integration, provides a robust and adaptive solution for fault detection and health management in complex autonomous systems. This dissertation presents a significant contribution to the field of fault detection and health management by integrating the artificial immune system paradigm with Online Support Vector Machines, developing a structured, integrated approach for designing and deploying data-driven fault detection systems, and implementing reinforcement learning for online, autonomous adaptation of fault management systems. The AISOSVM framework offers a promising solution to address the challenges of fault detection in complex, autonomous systems, with potential applications in a wide range of industries beyond aerospace.
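
    The use of a Clonal Selection Algorithm to tune hyperparameters can be illustrated with a generic, CLONALG-style minimiser in Python. Everything here is an assumption made for illustration: the rank-based cloning and mutation schedule, the parameter defaults, and the toy objective, which merely stands in for a validation error over two hypothetical parameters. It is not the algorithm developed in the dissertation.

```python
import random


def clonal_selection(objective, bounds, rng, pop_size=20, n_select=5,
                     clone_factor=4, n_random=3, generations=60):
    """Minimise `objective` over the box `bounds` with a clonal selection loop."""

    def random_antibody():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(antibody, scale):
        # Gaussian hypermutation, clipped to the search box.
        return [min(hi, max(lo, v + rng.gauss(0.0, scale * (hi - lo))))
                for v, (lo, hi) in zip(antibody, bounds)]

    population = [random_antibody() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=objective)  # best (lowest cost) first
        clones = []
        for rank, antibody in enumerate(population[:n_select]):
            n_clones = clone_factor * (n_select - rank)  # better rank, more clones
            scale = 0.05 * (rank + 1)                    # better rank, smaller mutations
            clones.extend(mutate(antibody, scale) for _ in range(n_clones))
        # Keep the best of parents and clones, then inject random newcomers.
        survivors = sorted(population + clones, key=objective)[:pop_size - n_random]
        population = survivors + [random_antibody() for _ in range(n_random)]
    return min(population, key=objective)


def toy_validation_error(params):
    """Hypothetical stand-in for a validation error; minimum at (3.0, 0.5)."""
    c, gamma = params
    return (c - 3.0) ** 2 + (gamma - 0.5) ** 2
```

    A call such as `clonal_selection(toy_validation_error, [(0.0, 10.0), (0.0, 1.0)], random.Random(0))` returns a parameter pair inside the given bounds. In practice the objective would train and validate the detector for each candidate, which makes every evaluation far more expensive than in this toy case.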

    Development and application of deep learning and spatial statistics within 3D bone marrow imaging

    The bone marrow is a highly specialised organ, responsible for the formation of blood cells. Despite 50 years of research, the spatial organisation of the bone marrow remains an area full of controversy and contradiction. One reason for this is that imaging of bone marrow tissue is notoriously difficult. Another is that efficient methodologies to fully extract and analyse large datasets remain the Achilles heel of imaging-based research. In this thesis I present a pipeline for generating 3D bone marrow images followed by large-scale data extraction and spatial statistical analysis of the resulting data. Using these techniques, in the context of 3D imaging, I am able to identify and classify the location of hundreds of thousands of cells within various bone marrow samples. I then introduce a series of statistical techniques tailored to work with spatial data, resulting in a 3D statistical map of the tissue from which multi-cellular interactions can be clearly understood. As an illustration of the power of this new approach, I apply this pipeline to diseased samples of bone marrow with a particular focus on leukaemia and its interactions with CD8+ T cells. In so doing I show that this novel pipeline can be used to unravel complex multi-cellular interactions and assist researchers in understanding the processes taking place within the bone marrow.

    Forecasting using immune algorithms

    This thesis contains 111 pages, 6 tables, 24 figures, 2 appendices, and 42 sources. Topic: Forecasting using immune algorithms. The work addresses the problem of forecasting time series using immune algorithms. Object of research: modern methods of forecasting time series using immune algorithms. Subject of research: means of modelling and forecasting with the use of immune algorithms. Objective: to investigate existing immune models for solving the time series forecasting problem. Research methods: the mathematical apparatus of immune algorithms is used for forecasting.

    Artificial immune system for the Internet

    We investigate the usability of the Artificial Immune Systems (AIS) approach for solving selected problems in computer networks. Artificial immune systems are created by using the concepts and algorithms inspired by the theory of how the Human Immune System (HIS) works. We consider two applications: detection of routing misbehavior in mobile ad hoc networks, and email spam filtering. In mobile ad hoc networks the multi-hop connectivity is provided by the collaboration of independent nodes. The nodes follow a common protocol in order to build their routing tables and forward the packets of other nodes. As there is no central control, some nodes may fail to follow the common protocol, which would have a negative impact on the overall connectivity in the network. We build an AIS for the detection of routing misbehavior by directly mapping the standard concepts and algorithms used for explaining how the HIS works. The implementation and evaluation in a simulator show that the AIS mimics well most of the effects observed in the HIS, e.g. the faster secondary reaction to already encountered misbehavior. However, its effectiveness and practical usability are very constrained, because some particularities of the problem cannot be accounted for by the approach, and because of the computational constraints (also reported in the AIS literature) of the negative selection algorithm used. For the spam filtering problem, we apply the AIS concepts and algorithms much more selectively and in a less standard way, and we obtain much better results. We build the AIS for antispam on top of a standard technique for digest-based collaborative email spam filtering. We notice an advantageous and underemphasized technological difference between AISs and the HIS, and we exploit this difference to incorporate negative selection in an innovative and computationally efficient way. 
We also improve the representation of the email digests used by the standard collaborative spam filtering scheme. We show that this new representation and the negative selection, when used together, improve significantly the filtering performance of the standard scheme on top of which we build our AIS. Our complete AIS for antispam integrates various innate and adaptive AIS mechanisms, including the mentioned specific use of the negative selection and the use of innate signalling mechanisms (PAMP and danger signals). In this way the AIS takes into account users' profiles, implicit or explicit feedback from the users, and the bulkiness of spam. We show by simulations that the overall AIS is very good both in detecting spam and in avoiding misdetection of good emails. Interestingly, both the innate and adaptive mechanisms prove to be crucial for achieving the good overall performance. We develop and test (within a simulator) our AIS for collaborative spam filtering in the case of email communications. The solution however seems to be well applicable to other types of Internet communications: Internet telephony, chat/sms, forum, news, blog, or web. In all these cases, the aim is to allow the wanted communications (content) and prevent those unwanted from reaching the end users and occupying their time and communication resources. 
The filtering problems faced, or likely to be faced in the near future, by these applications have, or are likely to have, settings similar to those of the email case: the need for openness to unknown senders (creators of content, initiators of the communication), bulkiness in receiving spam (many recipients are usually affected by the same spam content), tolerance of the system to a small amount of damage (small amounts of unfiltered spam), the possibility of obtaining feedback from the recipients about the damage (about the spam that they receive) implicitly or explicitly and cheaply, and the need for strong tolerance to wanted (non-spam) content. Our experiments with email spam filtering show that our AIS, i.e. the way we build it, is well fitted to such problem settings.

    Adaptive Search and Constraint Optimisation in Engineering Design

    The dissertation presents the investigation and development of novel adaptive computational techniques that provide a high level of performance when searching complex high-dimensional design spaces characterised by heavy non-linear constraint requirements. The objective is to develop a set of adaptive search engines that will allow the successful negotiation of such spaces to provide the design engineer with feasible high performance solutions. Constraint optimisation currently presents a major problem to the engineering designer, and many attempts to utilise adaptive search techniques whilst overcoming these problems are in evidence. The most widely used method (which is also the most general) is to incorporate the constraints in the objective function and then use methods for unconstrained search. The engineer must develop and adjust an appropriate penalty function. There is no general solution to this problem, either in classical numerical optimisation or in evolutionary computation. Some recent theoretical evidence suggests that the problem can only be solved by incorporating a priori knowledge into the search engine. There is therefore a need to classify constrained optimisation problems according to the degree of available or utilised knowledge and to develop search techniques applicable at each stage. The contribution of this thesis is to provide such a view of constrained optimisation, starting from problems that handle the constraints on the representation level, going through problems that have explicitly defined constraints (i.e., an easily computed closed form like a solvable equation), and ending with heavily constrained problems with implicitly defined constraints (incorporated into a single simulation model). At each stage we develop applicable adaptive search techniques that optimally exploit the degree of available a priori knowledge, thus providing excellent quality of results and high performance. 
The proposed techniques are tested using both well known test beds and real world engineering design problems provided by industry. British Aerospace, Rolls Royce and Associate

    Integrated Immunity-based Methodology for UAV Monitoring and Control

    A general integrated and comprehensive health management framework based on the artificial immune system (AIS) paradigm is formulated and an automated system is developed and tested through simulation for the detection, identification, evaluation, and accommodation (DIEA) of abnormal conditions (ACs) on an unmanned aerial vehicle (UAV). The proposed methodology involves the establishment of a body of data to represent the function of the vehicle under nominal conditions, called the self, and differentiating this operation from that of the vehicle under an abnormal condition, referred to as the non-self. Data collected from simulations of the selected UAV autonomously flying a set of prescribed trajectories were used to develop and test novel schemes that are capable of addressing the AC-DIEA of sensor and actuator faults on a UAV. While the specific dynamic system used here is a UAV, the proposed framework and methodology are general enough to be adapted and applied to any complex dynamic system. The ACs considered within this effort included aerodynamic control surface locks and damage and angular rate sensor biases. The general framework for the comprehensive health management system comprises a novel complete integration of the AC-DIEA process with focus on the transition between the four different phases. The hierarchical multiself (HMS) strategy is used in conjunction with several biomimetic mechanisms to address the various steps in each phase. The partition of the universe approach is used as the basis of the AIS generation and the binary detection phase. The HMS approach is augmented by a mechanism inspired by the antigen presenting cells of the adaptive immune system for performing AC identification. The evaluation and accommodation phases are the most challenging phases of the AC-DIEA process due to the complexity and diversity of the ACs and the multidimensionality of the AIS. 
Therefore, the evaluation phase is divided into three separate steps: the qualitative evaluation, the direct quantitative evaluation, and the indirect quantitative evaluation, in which the type, severity, and effects of the AC are determined, respectively. The integration of the accommodation phase is based on a modular process consisting of the strategic decision making, tactical decision making, and execution modules. These modules are designed by testing several approaches for integrating the accommodation phase, which are specialized based on the type of AC being addressed. These approaches include redefining the mission, adjusting or shifting the control laws, and adjusting the sensor outputs. Adjustments of the mission include redefining the trajectory to remove maneuvers that are no longer possible, while adjustments of the control laws include modifying the gains involved in determining the commanded control surface deflections. Analysis of the transition between phases includes a discussion of results for integrated example cases where the proposed AC-DIEA process is applied. The cases considered show the validity of the integrated AC-DIEA system and of the specific accommodation approaches through an improvement in flight performance, measured by metrics that capture trajectory tracking errors and control activity differences between nominal, abnormal, and accommodated cases.