58,279 research outputs found

    AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments

    Get PDF
    This report considers the application of Articial Intelligence (AI) techniques to the problem of misuse detection and misuse localisation within telecommunications environments. A broad survey of techniques is provided, that covers inter alia rule based systems, model-based systems, case based reasoning, pattern matching, clustering and feature extraction, articial neural networks, genetic algorithms, arti cial immune systems, agent based systems, data mining and a variety of hybrid approaches. The report then considers the central issue of event correlation, that is at the heart of many misuse detection and localisation systems. The notion of being able to infer misuse by the correlation of individual temporally distributed events within a multiple data stream environment is explored, and a range of techniques, covering model based approaches, `programmed' AI and machine learning paradigms. It is found that, in general, correlation is best achieved via rule based approaches, but that these suffer from a number of drawbacks, such as the difculty of developing and maintaining an appropriate knowledge base, and the lack of ability to generalise from known misuses to new unseen misuses. Two distinct approaches are evident. One attempts to encode knowledge of known misuses, typically within rules, and use this to screen events. This approach cannot generally detect misuses for which it has not been programmed, i.e. it is prone to issuing false negatives. The other attempts to `learn' the features of event patterns that constitute normal behaviour, and, by observing patterns that do not match expected behaviour, detect when a misuse has occurred. This approach is prone to issuing false positives, i.e. inferring misuse from innocent patterns of behaviour that the system was not trained to recognise. Contemporary approaches are seen to favour hybridisation, often combining detection or localisation mechanisms for both abnormal and normal behaviour, the former to capture known cases of misuse, the latter to capture unknown cases. In some systems, these mechanisms even work together to update each other to increase detection rates and lower false positive rates. It is concluded that hybridisation offers the most promising future direction, but that a rule or state based component is likely to remain, being the most natural approach to the correlation of complex events. The challenge, then, is to mitigate the weaknesses of canonical programmed systems such that learning, generalisation and adaptation are more readily facilitated

    Weka: A machine learning workbench for data mining

    Get PDF
    The Weka workbench is an organized collection of state-of-the-art machine learning algorithms and data preprocessing tools. The basic way of interacting with these methods is by invoking them from the command line. However, convenient interactive graphical user interfaces are provided for data exploration, for setting up large-scale experiments on distributed computing platforms, and for designing configurations for streamed data processing. These interfaces constitute an advanced environment for experimental data mining. The system is written in Java and distributed under the terms of the GNU General Public License

    Testing Interestingness Measures in Practice: A Large-Scale Analysis of Buying Patterns

    Full text link
    Understanding customer buying patterns is of great interest to the retail industry and has shown to benefit a wide variety of goals ranging from managing stocks to implementing loyalty programs. Association rule mining is a common technique for extracting correlations such as "people in the South of France buy ros\'e wine" or "customers who buy pat\'e also buy salted butter and sour bread." Unfortunately, sifting through a high number of buying patterns is not useful in practice, because of the predominance of popular products in the top rules. As a result, a number of "interestingness" measures (over 30) have been proposed to rank rules. However, there is no agreement on which measures are more appropriate for retail data. Moreover, since pattern mining algorithms output thousands of association rules for each product, the ability for an analyst to rely on ranking measures to identify the most interesting ones is crucial. In this paper, we develop CAPA (Comparative Analysis of PAtterns), a framework that provides analysts with the ability to compare the outcome of interestingness measures applied to buying patterns in the retail industry. We report on how we used CAPA to compare 34 measures applied to over 1,800 stores of Intermarch\'e, one of the largest food retailers in France
    corecore