    Adaptive rule-based malware detection employing learning classifier systems

    Efficient and accurate malware detection is increasingly becoming a necessity for society to operate. Existing malware detection systems have excellent performance in identifying known malware for which signatures are available, but poor performance in anomaly detection for zero-day exploits, for which signatures have not yet been made available, or targeted attacks against a specific entity. The primary goal of this thesis is to provide evidence for the potential of learning classifier systems to improve the accuracy of malware detection. A customized system based on a state-of-the-art learning classifier system is presented for adaptive rule-based malware detection, which combines a rule-based expert system with evolutionary-algorithm-based reinforcement learning, thus creating a self-training adaptive malware detection system which dynamically evolves detection rules. This system is analyzed on a benchmark of malicious and non-malicious files. Experimental results show that the system can outperform C4.5, a well-known non-adaptive machine learning algorithm, under certain conditions. The results demonstrate the system's ability to learn effective rules from repeated presentations of a tagged training set and show the degree of generalization achieved on an independent test set. This thesis is an extension and expansion of the work published in the Security, Trust, and Privacy for Software Applications workshop at COMPSAC 2011, the 35th Annual IEEE Signature Conference on Computer Software and Applications --Abstract, page iii
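
    As a hedged illustration of the approach described above, the sketch below implements a toy Michigan-style learning classifier system for binary malware detection over short binary feature vectors: ternary-condition rules, covering, reinforcement of rule fitness, and a simple genetic step. All features, data and parameters here are invented for illustration; the thesis system and its benchmark are considerably richer.

# Minimal sketch of a Michigan-style LCS: ternary condition rules, covering,
# reinforcement of rule fitness, and a simple genetic step. Toy example only.
import random

random.seed(1)
N_FEATURES = 6
BENIGN, MALICIOUS = 0, 1

class Rule:
    def __init__(self, condition, action):
        self.condition = condition   # string over '0', '1', '#' ('#' = don't care)
        self.action = action         # predicted class
        self.fitness = 1.0           # updated by reinforcement

    def matches(self, x):
        return all(c == '#' or int(c) == b for c, b in zip(self.condition, x))

def cover(x):
    """Create a rule matching x, generalising some bits at random."""
    cond = ''.join('#' if random.random() < 0.4 else str(b) for b in x)
    return Rule(cond, random.choice((BENIGN, MALICIOUS)))

def predict(rules, x):
    match_set = [r for r in rules if r.matches(x)]
    if not match_set:                # covering when nothing matches
        match_set = [cover(x)]
        rules.extend(match_set)
    votes = {a: sum(r.fitness for r in match_set if r.action == a)
             for a in (BENIGN, MALICIOUS)}
    return max(votes, key=votes.get), match_set

def train(rules, data, epochs=50, beta=0.2, mutation=0.05):
    for _ in range(epochs):
        for x, y in data:
            _, match_set = predict(rules, x)
            for r in match_set:      # reinforcement: push fitness towards reward
                reward = 1.0 if r.action == y else 0.0
                r.fitness += beta * (reward - r.fitness)
        best = max(rules, key=lambda r: r.fitness)   # genetic step: mutate the fittest rule
        child = ''.join(random.choice('01#') if random.random() < mutation else c
                        for c in best.condition)
        rules.append(Rule(child, best.action))

# Hypothetical training data: bit 0 stands in for a "suspicious" static feature.
data = [((1, 0, 1, 0, 1, 0), MALICIOUS), ((0, 1, 0, 0, 1, 1), BENIGN),
        ((1, 1, 1, 0, 0, 0), MALICIOUS), ((0, 0, 0, 1, 1, 0), BENIGN)]
rules = []
train(rules, data)
print(predict(rules, (1, 0, 0, 0, 1, 1))[0])   # classify an unseen feature vector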

    MILCS: A mutual information learning classifier system

    This paper introduces a new variety of learning classifier system (LCS), called MILCS, which utilizes mutual information as fitness feedback. Unlike most LCSs, MILCS is specifically designed for supervised learning. MILCS's design draws on an analogy to the structural learning approach of cascade correlation networks. We present preliminary results, and contrast them to results from XCS. We discuss the explanatory power of the resulting rule sets, and introduce a new technique for visualizing explanatory power. Final comments include future directions for this research, including investigations in neural networks and other systems. Copyright 2007 ACM
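
    Since MILCS is described as using mutual information as fitness feedback, the sketch below shows one simple way such a fitness signal can be computed for a single rule: the mutual information between the event "rule matches" and the class label, estimated from a labelled sample. This is a hedged illustration of the general idea, not the exact formulation used in the paper.

# Hedged sketch: mutual information between a rule's match indicator and the
# class labels, used as a fitness score. Plug-in histogram estimate, in bits.
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def rule_fitness(rule_matches, labels):
    """rule_matches: 0/1 per example; labels: class per example."""
    return mutual_information(rule_matches, labels)

# toy check: a rule that matches exactly the positive class carries 1 bit,
# a rule that matches independently of the class carries 0 bits
print(rule_fitness([1, 1, 0, 0], ['pos', 'pos', 'neg', 'neg']))  # 1.0
print(rule_fitness([1, 0, 1, 0], ['pos', 'pos', 'neg', 'neg']))  # 0.0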

    The XMM Cluster Survey: a new cluster candidate sample and detailed selection function

    In this thesis we present the XCS DR3 cluster candidate list. This represents the first major update of the XMM Cluster Survey since 2005. The candidate list comprises 1365 entries with more than 300 detected counts, distributed over 229 deg². We note that a larger area (523 deg²) is available for the study of X-ray point sources and that the new XCS point source sample has more than 130,000 entries. After redshift follow-up and X-ray spectral analysis, these 1365 clusters will comprise the largest homogeneous sample of medium- to high-redshift X-ray clusters ever compiled. The future science applications of the XCS DR3 clusters include the study of the evolution of X-ray scaling relations and a measurement of cosmological parameters. In support of these science applications, we also present in this thesis detailed selection functions for the XCS. These selection functions allow us to quantify the number of clusters we did not detect in our survey regions. We have taken two approaches to the determination of the selection function: the use of simple (circular and isothermal) β models and the use of ‘observations’ of synthetic clusters from the CLEF N-body simulation. The β model work has allowed us to explore how the selection function depends on key cluster parameters such as luminosity, temperature, redshift, core size and profile shape. We have further explored how the selection function depends on the underlying cosmological model and applied our results to XCS cosmology forecasting (Sahlen et al. 2009). The CLEF work has allowed us to explore more complex cluster properties, such as core temperature, core shape, substructure and ellipticity. In summary, the combination of the cluster catalogues and selection functions presented herein will facilitate field-leading science applications for many years to come.
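
    As background for the first of the two approaches mentioned above, the sketch below evaluates the circular isothermal β-model surface-brightness profile and integrates it over a circular aperture to obtain a crude count estimate. The survey's actual selection-function machinery (exposure maps, PSF modelling, the detection pipeline) is not reproduced, and all numerical values are placeholders.

# Hedged sketch of the circular isothermal beta-model surface-brightness profile,
# S(r) = S0 * [1 + (r/r_c)^2]^(1/2 - 3*beta), and a crude count estimate obtained
# by integrating it over a circular aperture. Units and numbers are placeholders.
import numpy as np

def beta_model_sb(r, s0, r_core, beta):
    """Projected surface brightness of a circular isothermal beta model."""
    return s0 * (1.0 + (r / r_core) ** 2) ** (0.5 - 3.0 * beta)

def aperture_counts(radius, s0, r_core, beta, n=100_000):
    """Integrate 2*pi*r*S(r) from 0 to `radius` with a simple Riemann sum."""
    r = np.linspace(0.0, radius, n)
    dr = r[1] - r[0]
    return float(np.sum(2.0 * np.pi * r * beta_model_sb(r, s0, r_core, beta)) * dr)

# e.g. ask whether a toy cluster would clear the >300-count threshold quoted above
counts = aperture_counts(radius=300.0, s0=0.05, r_core=20.0, beta=2.0 / 3.0)
print(counts, counts > 300)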

    Controlled self-organisation using learning classifier systems

    As the complexity of technical systems increases, breakdowns occur more and more often. The mission of organic computing is to tame these challenges by providing degrees of freedom for self-organised behaviour. To achieve these goals, new methods have to be developed. The proposed observer/controller architecture constitutes one way to achieve controlled self-organisation. To improve its design, multi-agent scenarios are investigated. In particular, learning with learning classifier systems is addressed.
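
    As a hedged illustration of the observer/controller pattern mentioned above, the sketch below closes a minimal observe/decide/act loop around a generic "system under observation and control". The organic-computing architecture investigated in the work is considerably richer, and a learning classifier system would learn and evolve the controller's rules online rather than having them hard-coded as here.

# Minimal sketch of an observer/controller loop; all classes and values are
# illustrative placeholders, not the architecture from the dissertation.
class Observer:
    def observe(self, system):
        # aggregate raw state into situation parameters (here: just the mean load)
        return {"mean_load": sum(system.loads) / len(system.loads)}

class Controller:
    def __init__(self, target):
        self.target = target
    def decide(self, situation):
        # rule-like decision; an LCS would learn and evolve such rules online
        return "shed_load" if situation["mean_load"] > self.target else "no_action"

class System:
    def __init__(self, loads):
        self.loads = list(loads)
    def apply(self, action):
        if action == "shed_load":
            self.loads = [l * 0.9 for l in self.loads]

system = System([0.7, 0.9, 0.8])
observer, controller = Observer(), Controller(target=0.75)
for _ in range(3):                       # closed observe/decide/act loop
    situation = observer.observe(system)
    system.apply(controller.decide(situation))
print(observer.observe(system))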

    Model-free reconstruction of neuronal network connectivity from calcium imaging signals

    A systematic assessment of global neural network connectivity through direct electrophysiological assays has remained technically unfeasible even in dissociated neuronal cultures. We introduce an improved algorithmic approach based on Transfer Entropy to reconstruct approximations to network structural connectivities from network activity monitored through calcium fluorescence imaging. Based on information theory, our method requires no prior assumptions on the statistics of neuronal firing and neuronal connections. The performance of our algorithm is benchmarked on surrogate time series of calcium fluorescence generated by the simulated dynamics of a network with known ground-truth topology. We find that the effective network topology revealed by Transfer Entropy depends qualitatively on the time-dependent dynamic state of the network (e.g., bursting or non-bursting). We thus demonstrate how conditioning with respect to the global mean activity improves the performance of our method. [...] Compared to other reconstruction strategies such as cross-correlation or Granger causality methods, our method based on improved Transfer Entropy is remarkably more accurate. In particular, it provides a good reconstruction of the network clustering coefficient, allowing us to discriminate between weakly and strongly clustered topologies, whereas an approach based on cross-correlations would invariably detect artificially high levels of clustering. Finally, we demonstrate the applicability of our method to real recordings of in vitro cortical cultures. We show that these networks are characterized by an elevated level of clustering compared to a random graph (although not extreme) and by a markedly non-local connectivity. Comment: 54 pages, 8 figures (+9 supplementary figures), 1 table; submitted for publication
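
    As a hedged illustration of the core quantity involved, the sketch below estimates transfer entropy between two discretised time series with plug-in histogram probabilities. The paper's estimator operates on calcium fluorescence signals and adds refinements (e.g. conditioning on the global mean activity) that are not reproduced here.

# Hedged sketch of transfer entropy between two discretised time series:
# TE(Y->X) = sum p(x_{t+1}, x_t, y_t) * log[ p(x_{t+1}|x_t,y_t) / p(x_{t+1}|x_t) ],
# with plug-in histogram probabilities and history length 1.
from collections import Counter
from math import log2

def transfer_entropy(x, y):
    """TE from y to x (in bits) for equal-length discrete sequences."""
    triples = list(zip(x[1:], x[:-1], y[:-1]))       # (x_{t+1}, x_t, y_t)
    n = len(triples)
    p_xxy = Counter(triples)
    p_xy  = Counter((xt, yt) for _, xt, yt in triples)
    p_xx  = Counter((x1, xt) for x1, xt, _ in triples)
    p_x   = Counter(xt for _, xt, _ in triples)
    te = 0.0
    for (x1, xt, yt), c in p_xxy.items():
        p_joint = c / n
        p_cond_xy = c / p_xy[(xt, yt)]               # p(x_{t+1} | x_t, y_t)
        p_cond_x  = p_xx[(x1, xt)] / p_x[xt]         # p(x_{t+1} | x_t)
        te += p_joint * log2(p_cond_xy / p_cond_x)
    return te

# toy example: x copies y with a one-step lag, so the y->x direction carries
# the driving information; print both directions for comparison
y = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0]
x = [0] + y[:-1]
print(transfer_entropy(x, y), transfer_entropy(y, x))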

    Contributions to comprehensible classification

    The doctoral thesis described in this dissertation has contributed to the improvement of two types of comprehensible classification algorithms: consolidated decision-tree algorithms and PART-style rule-induction algorithms. Regarding the contributions to the consolidation of decision-tree algorithms, a new resampling strategy has been proposed that adjusts the number of subsamples so that the class distribution of the subsamples can be changed without losing information. Using this strategy, the consolidated version of C4.5 (CTC) obtains better results than a broad set of comprehensible algorithms based on genetic and classical algorithms. Three new algorithms have been consolidated: a variant of CHAID (CHAID*) and the Probability Estimation Tree versions of C4.5 and CHAID* (C4.4 and CHAIC). All the consolidated algorithms obtain better results than their base decision-tree algorithms, with three consolidated algorithms ranking among the best four in a comparison. Finally, the effect of pruning on simple and consolidated decision-tree algorithms has been analysed, and it is concluded that the pruning strategy proposed in this thesis obtains the best results. Regarding the contributions to PART-style rule-induction algorithms, a first proposal changes several aspects of how PART builds partial trees and extracts rules from them, which results in classifiers with better generalisation capability and lower structural complexity than those generated by PART. A second proposal uses fully developed trees, instead of partially developed ones, and generates rule sets that achieve even better classification results and lower structural complexity. These two new proposals and the original PART algorithm have been complemented with CHAID*-based variants to see whether these benefits can be transferred to other decision-tree algorithms, and it has indeed been observed that the CHAID*-based PART-style algorithms also create classifiers that are simpler and classify better than CHAID
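
    As a hedged illustration of the resampling idea summarised above, the sketch below builds balanced subsamples whose number is chosen so that every majority-class example is used, i.e. the class distribution is changed without discarding information. The consolidation step itself (agreeing on a single C4.5/CHAID* split across the subsamples at each node, as in CTC) is not reproduced, and all names and data here are illustrative.

# Hedged sketch: choose the number of balanced subsamples so every
# majority-class example is used at least once (no information loss).
import math, random

def balanced_subsamples(examples, labels, seed=0):
    rng = random.Random(seed)
    by_class = {}
    for ex, y in zip(examples, labels):
        by_class.setdefault(y, []).append(ex)
    minority = min(by_class.values(), key=len)
    majority = max(by_class.values(), key=len)
    k = len(minority)
    n_subsamples = math.ceil(len(majority) / k)   # enough to cover the majority class
    rng.shuffle(majority)
    subsamples = []
    for i in range(n_subsamples):
        maj_part = majority[i * k:(i + 1) * k]
        if len(maj_part) < k:                     # top up the last, shorter slice
            maj_part = maj_part + rng.sample(majority[:i * k], k - len(maj_part))
        subsamples.append(list(minority) + maj_part)
    return subsamples

X = list(range(10))
y = ['neg'] * 7 + ['pos'] * 3
for s in balanced_subsamples(X, y):
    print(s)   # 3 subsamples, each with the 3 'pos' examples and 3 'neg' examples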

    Self-similar scaling and evolution in the galaxy cluster X-ray Luminosity-Temperature relation

    We investigate the form and evolution of the X-ray luminosity-temperature (LT) relation of a sample of 114 galaxy clusters observed with Chandra at 0.1<z<1.3. The clusters were divided into subsamples based on their X-ray morphology or whether they host strong cool cores. We find that when the core regions are excluded, the most relaxed clusters (or those with the strongest cool cores) follow an LT relation with a slope that agrees well with simple self-similar expectations. This is supported by an analysis of the gas density profiles of the systems, which shows self-similar behaviour of the gas profiles of the relaxed clusters outside the core regions. By comparing our data with clusters in the REXCESS sample, which extends to lower masses, we find evidence that the self-similar behaviour of even the most relaxed clusters breaks at around 3.5 keV. By contrast, the LT slopes of the subsamples of unrelaxed systems (or those without strong cool cores) are significantly steeper than the self-similar model, with lower-mass systems appearing less luminous and higher-mass systems appearing more luminous than the self-similar relation. We argue that these results are consistent with a model of non-gravitational energy input in clusters that combines central heating with entropy enhancements from merger shocks. Such enhancements could extend the impact of central energy input to larger radii in unrelaxed clusters, as suggested by our data. We also examine the evolution of the LT relation, and find that while the data appear inconsistent with simple self-similar evolution, the differences can be plausibly explained by selection bias, and thus we find no reason to rule out self-similar evolution. We show that the fraction of cool-core clusters in our (non-representative) sample decreases at z>0.5 and discuss the effect of this on measurements of the evolution in the LT relation. Comment: 21 pages, 15 figures. Submitted to MNRAS. Comments welcome
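
    For context, the "simple self-similar expectations" referred to above correspond to the standard scalings for a bremsstrahlung-dominated intracluster medium, quoted here as textbook background with E(z) = H(z)/H_0; the exact convention adopted in the paper (e.g. bolometric versus band-limited luminosity) may differ:

% standard self-similar scalings (background material, not taken from the paper)
\begin{align}
  M_\Delta &\propto T^{3/2}\, E(z)^{-1}, \\
  L_X      &\propto T^{2}\, E(z),
\end{align}

    so a relaxed, core-excised sample that evolves self-similarly should show an LT slope close to 2.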

    MINES: Mutual Information Neuro-Evolutionary System

    Mutual information neuro-evolutionary system (MINES) presents a novel self-governing approach to determining the optimal number of hidden nodes and the connectivity of the hidden layer of a three-layer feed-forward neural network, founded on a theoretical and practical basis. The system is a combination of a feed-forward neural network, the back-propagation algorithm, a genetic algorithm, mutual information and clustering. Back-propagation is used for parameter learning to reduce the system's error, while mutual information aids back-propagation in following an effective path in the weight space. A genetic algorithm changes the incoming synaptic connections of the hidden nodes, based on the fitness provided by the mutual information from the error space to the hidden layer, to perform structural learning. Mutual information determines the appropriate synapses connecting the hidden nodes to the input layer; in effect, it also links the back-propagation to the genetic algorithm. Weight clustering is applied to reduce the number of hidden nodes with similar functionality, i.e. those possessing the same connectivity patterns and a close Euclidean angle in the weight space. Finally, the performance of the system is assessed on two theoretical problems and one empirical problem. A nonlinear polynomial regression problem and the well-known two-spiral classification task are used to evaluate the theoretical performance of the system, and forecasting of daily crude oil prices is conducted to observe the performance of MINES on a real-world application.
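
    As a hedged, simplified reading of the description above, the sketch below computes a per-hidden-node fitness as the mutual information between that node's discretised activation and the network error; in MINES this kind of signal is described as guiding the genetic algorithm that rewires the hidden nodes' incoming connections. Back-propagation and weight clustering are omitted, and all variable names and data are illustrative.

# Hedged sketch: per-hidden-node fitness as the mutual information between
# the node's discretised activation and the network error.
from collections import Counter
from math import log2
import numpy as np

def discretise(v, bins=8):
    """Map a continuous 1-D array onto integer bins via its empirical quantiles."""
    edges = np.quantile(v, np.linspace(0.0, 1.0, bins + 1)[1:-1])
    return np.digitize(v, edges)

def mutual_information(a, b):
    """Plug-in estimate of I(A;B) in bits for two equal-length discrete sequences."""
    a, b = list(map(int, a)), list(map(int, b))
    n = len(a)
    joint, pa, pb = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum((c / n) * log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in joint.items())

def hidden_node_fitness(hidden_activations, errors):
    """hidden_activations: (n_samples, n_hidden) array; errors: (n_samples,) array."""
    e = discretise(errors)
    return [mutual_information(discretise(h), e) for h in hidden_activations.T]

rng = np.random.default_rng(0)
H = rng.normal(size=(200, 3))                        # toy hidden-layer activations
err = 0.8 * H[:, 0] + 0.1 * rng.normal(size=200)     # error mostly explained by node 0
print(hidden_node_fitness(H, err))                   # node 0 should score highest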