5,030 research outputs found

    Discovering unbounded episodes in sequential data

    Get PDF
    One basic goal in the analysis of time-series data is to find frequent interesting episodes, i.e, collections of events occurring frequently together in the input sequence. Most widely-known work decide the interestingness of an episode from a fixed user-specified window width or interval, that bounds the subsequent sequential association rules. We present in this paper, a more intuitive definition that allows, in turn, interesting episodes to grow during the mining without any user-specified help. A convenient algorithm to efficiently discover the proposed unbounded episodes is also implemented. Experimental results confirm that our approach results useful and advantageous.Postprint (published version

    Dynamic Analysis of Executables to Detect and Characterize Malware

    Full text link
    It is needed to ensure the integrity of systems that process sensitive information and control many aspects of everyday life. We examine the use of machine learning algorithms to detect malware using the system calls generated by executables-alleviating attempts at obfuscation as the behavior is monitored rather than the bytes of an executable. We examine several machine learning techniques for detecting malware including random forests, deep learning techniques, and liquid state machines. The experiments examine the effects of concept drift on each algorithm to understand how well the algorithms generalize to novel malware samples by testing them on data that was collected after the training data. The results suggest that each of the examined machine learning algorithms is a viable solution to detect malware-achieving between 90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the performance evaluation on an operational network may not match the performance achieved in training. Namely, the CAA may be about the same, but the values for precision and recall over the malware can change significantly. We structure experiments to highlight these caveats and offer insights into expected performance in operational environments. In addition, we use the induced models to gain a better understanding about what differentiates the malware samples from the goodware, which can further be used as a forensics tool to understand what the malware (or goodware) was doing to provide directions for investigation and remediation.Comment: 9 pages, 6 Tables, 4 Figure

    Hybrid self-organizing feature map (SOM) for anomaly detection in cloud infrastructures using granular clustering based upon value-difference metrics

    Get PDF
    We have witnessed an increase in the availability of data from diverse sources over the past few years. Cloud computing, big data and Internet-of-Things (IoT) are distinctive cases of such an increase which demand novel approaches for data analytics in order to process and analyze huge volumes of data for security and business use. Cloud computing has been becoming popular for critical structure IT mainly due to cost savings and dynamic scalability. Current offerings, however, are not mature enough with respect to stringent security and resilience requirements. Mechanisms such as anomaly detection hybrid systems are required in order to protect against various challenges that include network based attacks, performance issues and operational anomalies. Such hybrid AI systems include Neural Networks, blackboard systems, belief (Bayesian) networks, case-based reasoning and rule-based systems and can be implemented in a variety of ways. Traffic in the cloud comes from multiple heterogeneous domains and changes rapidly due to the variety of operational characteristics of the tenants using the cloud and the elasticity of the provided services. The underlying detection mechanisms rely upon measurements drawn from multiple sources. However, the characteristics of the distribution of measurements within specific subspaces might be unknown. We argue in this paper that there is a need to cluster the observed data during normal network operation into multiple subspaces each one of them featuring specific local attributes, i.e. granules of information. Clustering is implemented by the inference engine of a model hybrid NN system. Several variations of the so-called value-difference metric (VDM) are investigated like local histograms and the Canberra distance for scalar attributes, the Jaccard distance for binary word attributes, rough sets as well as local histograms over an aggregate ordering distance and the Canberra measure for vectorial attributes. Low-dimensional subspace representations of each group of points (measurements) in the context of anomaly detection in critical cloud implementations is based upon VD metrics and can be either parametric or non-parametric. A novel application of a Self-Organizing-Feature Map (SOFM) of reduced/aggregate ordered sets of objects featuring VD metrics (as obtained from distributed network measurements) is proposed. Each node of the SOFM stands for a structured local distribution of such objects within the input space. The so-called Neighborhood-based Outlier Factor (NOOF) is defined for such reduced/aggregate ordered sets of objects as a value-difference metric of histogrammes. Measurements that do not belong to local distributions are detected as anomalies, i.e. outliers of the trained SOFM. Several methods of subspace clustering using Expectation-Maximization Gaussian Mixture Models (a parametric approach) as well as local data densities (a non-parametric approach) are outlined and compared against the proposed method using data that are obtained from our cloud testbed in emulated anomalous traffic conditions. The results—which are obtained from a model NN system—indicate that the proposed method performs well in comparison with conventional techniques

    A Neural System for Automated CCTV Surveillance

    Get PDF
    This paper overviews a new system, the “Owens Tracker,” for automated identification of suspicious pedestrian activity in a car-park. Centralized CCTV systems relay multiple video streams to a central point for monitoring by an operator. The operator receives a continuous stream of information, mostly related to normal activity, making it difficult to maintain concentration at a sufficiently high level. While it is difficult to place quantitative boundaries on the number of scenes and time period over which effective monitoring can be performed, Wallace and Diffley [1] give some guidance, based on empirical and anecdotal evidence, suggesting that the number of cameras monitored by an operator be no greater than 16, and that the period of effective monitoring may be as low as 30 minutes before recuperation is required. An intelligent video surveillance system should therefore act as a filter, censuring inactive scenes and scenes showing normal activity. By presenting the operator only with unusual activity his/her attention is effectively focussed, and the ratio of cameras to operators can be increased. The Owens Tracker learns to recognize environmentspecific normal behaviour, and refers sequences of unusual behaviour for operator attention. The system was developed using standard low-resolution CCTV cameras operating in the car-parks of Doxford Park Industrial Estate (Sunderland, Tyne and Wear), and targets unusual pedestrian behaviour. The modus operandi of the system is to highlight excursions from a learned model of normal behaviour in the monitored scene. The system tracks objects and extracts their centroids; behaviour is defined as the trajectory traced by an object centroid; normality as the trajectories typically encountered in the scene. The essential stages in the system are: segmentation of objects of interest; disambiguation and tracking of multiple contacts, including the handling of occlusion and noise, and successful tracking of objects that “merge” during motion; identification of unusual trajectories. These three stages are discussed in more detail in the following sections, and the system performance is then evaluated

    A Review of Rule Learning Based Intrusion Detection Systems and Their Prospects in Smart Grids

    Get PDF
    • …
    corecore