STORM - a novel information fusion and cluster interpretation technique
Data without labels is commonly analysed with unsupervised machine learning techniques. Such techniques provide representations that are more meaningful for understanding the problem at hand than inspection of the raw data alone. Although abundant expert knowledge exists in many areas where unlabelled data is examined, such knowledge is rarely incorporated into automatic analysis. Incorporating expert knowledge frequently amounts to combining multiple data sources from disparate hypothetical spaces, and when those spaces belong to different data types the task becomes even more challenging. In this paper we present a novel immune-inspired method that enables the fusion of such disparate types of data for a specific set of problems. We show that our method provides a better visual understanding of one hypothetical space with the help of data from another hypothetical space. We believe that our model has implications for the fields of exploratory data analysis and knowledge discovery.
ToLeRating UR-STD
A new emerging paradigm of Uncertain Risk of Suspicion, Threat and Danger, observed across the field of information security, is described. Based on this paradigm a novel approach to anomaly detection is presented. Our approach is based on a simple yet powerful analogy from
the innate part of the human immune system: the Toll-Like Receptors. We argue that such receptors, incorporated as part of an anomaly detector, enhance the detector's ability to distinguish between normal and anomalous behaviour. In addition, we propose that Toll-Like Receptors enable the classification of detected anomalies according to the types of attack that perpetrate the anomalous behaviour. Such classification is either missing from the existing literature or unfit for the purpose of reducing the burden on the administrator of an intrusion detection system. For our model to work, we propose the creation of a taxonomy of the digital Acytota, based on which our receptors are created.
Wavelet feature extraction and genetic algorithm for biomarker detection in colorectal cancer data
Biomarkers which predict a patient's survival can play an important role in medical diagnosis and treatment. How to select the significant biomarkers from hundreds of protein markers is a key step in survival analysis. In this paper a novel method is proposed to detect prognostic biomarkers of survival in colorectal cancer patients using wavelet analysis, a genetic algorithm, and a Bayes classifier. The one-dimensional discrete wavelet transform (DWT) is normally used to reduce the dimensionality of biomedical data. In this study the one-dimensional continuous wavelet transform (CWT) was proposed to extract features from the colorectal cancer data. The one-dimensional CWT cannot reduce the dimensionality of the data, but it captures features that the DWT misses and is therefore a complementary part of the DWT. A genetic algorithm was applied to the extracted wavelet coefficients to select the optimal features, with a Bayes classifier used to build its fitness function. The corresponding protein markers were located based on the positions of the optimized features. Kaplan-Meier curves and the Cox regression model were used to evaluate the performance of the selected biomarkers. Experiments were conducted on a colorectal cancer dataset and several significant biomarkers were detected. A new protein biomarker, CD46, was found to be significantly associated with survival time.
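The abstract's pipeline (CWT feature extraction, then a genetic algorithm whose fitness is a Bayes classifier's accuracy on the selected coefficients) can be sketched roughly as below. This is an illustrative assumption, not the paper's actual configuration: the Ricker wavelet, the scale choices, the Gaussian naive Bayes fitness, the GA parameters, and the synthetic "protein marker" data are all stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def ricker(points, a):
    # Ricker ("Mexican hat") wavelet, a common choice for 1-D CWT
    t = np.arange(points) - (points - 1) / 2.0
    return (1 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def cwt_features(x, scales=(2, 4, 8)):
    # 1-D CWT sketch: convolve the spectrum with the wavelet at several
    # scales and concatenate the coefficients (dimensionality grows,
    # matching the abstract's remark that CWT does not reduce it)
    return np.concatenate(
        [np.convolve(x, ricker(9, a), mode="same") for a in scales])

def gnb_fitness(X, y, mask):
    # Fitness = training accuracy of a Gaussian naive Bayes classifier
    # restricted to the selected coefficients (equal priors assumed)
    if mask.sum() == 0:
        return 0.0
    Xs = X[:, mask]
    scores = []
    for c in (0, 1):
        mu, sd = Xs[y == c].mean(0), Xs[y == c].std(0) + 1e-9
        scores.append(-0.5 * (((Xs - mu) / sd) ** 2 + 2 * np.log(sd)).sum(1))
    return float(((scores[1] > scores[0]).astype(int) == y).mean())

def ga_select(X, y, pop=20, gens=30, p_mut=0.05):
    # Genetic algorithm over binary feature masks: elitism, uniform
    # crossover among the top half, then bit-flip mutation
    n = X.shape[1]
    P = rng.random((pop, n)) < 0.2
    for _ in range(gens):
        fit = np.array([gnb_fitness(X, y, m) for m in P])
        P = P[np.argsort(fit)[::-1]]
        elite = P[: pop // 2]
        parents = elite[rng.integers(0, len(elite), (pop - len(elite), 2))]
        cross = np.where(rng.random((pop - len(elite), n)) < 0.5,
                         parents[:, 0], parents[:, 1])
        cross ^= rng.random(cross.shape) < p_mut
        P = np.vstack([elite, cross])
    fit = np.array([gnb_fitness(X, y, m) for m in P])
    return P[np.argmax(fit)]

# Synthetic spectra: marker 3 carries the (binary) outcome signal
X_raw = rng.normal(size=(60, 10))
y = (X_raw[:, 3] > 0).astype(int)
X = np.vstack([cwt_features(x) for x in X_raw])
best = ga_select(X, y)   # boolean mask over the 30 CWT coefficients
```

In this sketch the selected coefficient positions would then be mapped back to marker indices (here, positions near index 3 in each scale block), mirroring how the paper locates protein markers from optimized features.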
A Simple Guard for Learned Optimizers
If the trend of learned components eventually outperforming their hand-crafted counterparts continues, learned optimizers will eventually outperform hand-crafted optimizers like SGD or Adam. Yet even if learned optimizers (L2Os) outpace hand-crafted ones in practice, they are still not provably convergent and may fail out of distribution. These are the problems addressed here. Currently, learned optimizers frequently outperform generic hand-crafted optimizers (such as gradient descent) at the beginning of learning, but they generally plateau after some time, while the generic algorithms continue to make progress and often overtake the learned algorithm, like Aesop's tortoise overtaking the hare. L2Os also still have a difficult time
generalizing out of distribution. Heaton et al. proposed Safeguarded L2O (GL2O)
which can take a learned optimizer and safeguard it with a generic learning
algorithm so that by conditionally switching between the two, the resulting
algorithm is provably convergent. We propose a new class of Safeguarded L2O,
called Loss-Guarded L2O (LGL2O), which is both conceptually simpler and
computationally less expensive. The guarding mechanism decides solely based on
the expected future loss value of both optimizers. Furthermore, we show
theoretical proof of LGL2O's convergence guarantee and empirical results
comparing to GL2O and other baselines showing that it combines the best of both
L2O and SGD and that in practice it converges much better than GL2O. (ICML 2022)
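The guarding mechanism described above can be sketched in a few lines. This is a toy illustration under loud assumptions: the quadratic objective, the misbehaving "learned" update, and the one-step loss lookahead (standing in for the paper's expected future loss) are all invented for the sketch, not taken from the LGL2O paper.

```python
import numpy as np

def loss(w):
    # Toy convex objective standing in for the training loss
    return float(np.sum((w - 3.0) ** 2))

def grad(w):
    return 2.0 * (w - 3.0)

def sgd_step(w, lr=0.1):
    # The provably convergent fallback optimizer
    return w - lr * grad(w)

def learned_step(w, step):
    # Stand-in for an L2O update: aggressive early on, then it stops
    # making progress, mimicking the plateau/failure mode described above
    if step < 5:
        return w - 0.4 * grad(w)
    return w + 0.05 * np.sign(grad(w))   # misbehaving later in training

def lgl2o(w0, steps=50):
    # Loss-guarded switching: compute both candidate updates, keep
    # whichever reaches the lower loss; since the SGD step is always an
    # available fallback, the combined scheme inherits its convergence
    w = np.array(w0, dtype=float)
    for t in range(steps):
        cand_l2o, cand_sgd = learned_step(w, t), sgd_step(w)
        w = cand_l2o if loss(cand_l2o) <= loss(cand_sgd) else cand_sgd
    return w

w_final = lgl2o([0.0, 10.0])
```

The sketch shows why the guard is conceptually simpler than a convergence-condition check: the switch depends only on comparing loss values, at the cost of one extra loss evaluation per step.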
The DCA:SOMe comparison: a comparative study between two biologically-inspired algorithms
The dendritic cell algorithm (DCA) is an immune-inspired algorithm, developed for the purpose of anomaly detection. The algorithm performs multi-sensor data fusion and correlation which results in a ‘context aware’ detection system. Previous applications of the DCA have included the detection of potentially malicious port scanning activity, where it has produced high rates of true positives and low rates of false positives. In this work we aim to compare the performance of the DCA and of a self-organizing map (SOM) when applied to the detection of SYN port scans, through experimental analysis. A SOM is an ideal candidate for comparison as it shares similarities with the DCA in terms of the data fusion method employed. It is shown that the results of the two systems are comparable, and both produce false positives for the same processes. This shows that the DCA can produce anomaly detection results to the same standard as an established technique.
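The DCA's multi-sensor fusion step can be illustrated with a minimal sketch: input signals are combined through a weight matrix into semi-mature and mature outputs, and the accumulated context decides whether the behaviour is anomalous. The weight values and signal readings below are hypothetical stand-ins, not the published DCA weight matrix or the port-scan data from the paper.

```python
import numpy as np

# Hypothetical fusion weights: rows = output signals (semi-mature,
# mature), columns = input signals (PAMP, danger, safe)
W = np.array([[0.0, 0.0,  1.0],    # semi-mature: driven by the safe signal
              [2.0, 1.0, -1.5]])   # mature: PAMP/danger raise it, safe lowers it

def dca_classify(signals):
    # Multi-sensor fusion: each row of `signals` is one sampled triple
    # (PAMP, danger, safe); the cell accumulates both fused outputs over
    # its lifetime and the larger one decides the context
    semi, mature = (signals @ W.T).sum(axis=0)
    return "anomalous" if mature > semi else "normal"

# A SYN-scan-like episode: sustained PAMP/danger, little safe signal
scan = np.array([[0.8, 0.6, 0.1]] * 10)
# Normal activity: mostly safe signal
idle = np.array([[0.0, 0.1, 0.9]] * 10)
```

Because the SOM also fuses multi-dimensional input into a single decision surface, the two algorithms lend themselves to the head-to-head comparison the paper performs.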