
    Multivariate correlation analysis technique based on Euclidean distance map for network traffic characterization

    The quality of features has a significant impact on the performance of detection techniques used against Denial-of-Service (DoS) attacks. Features that fail to characterize network traffic records accurately leave these techniques with low detection accuracy. Although research has attempted to overcome this problem, existing works suffer from several constraints. In this paper, we propose a technique based on a Euclidean Distance Map (EDM) for optimal feature extraction. The proposed technique analyses the original feature space (first-order statistics) and extracts the multivariate correlations between the first-order statistics. These extracted multivariate correlations, namely second-order statistics, preserve significant discriminative information for accurate characterization of network traffic records, and they can serve as high-quality features for DoS attack detection. The effectiveness of the proposed technique is evaluated using the KDD CUP 99 dataset, and experimental analysis shows encouraging results. © 2011 Springer-Verlag
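The core idea of an EDM can be illustrated in a few lines. The following is a minimal sketch, assuming each traffic record is a vector of first-order feature statistics; the map of pairwise Euclidean distances between those statistics then yields the second-order features (the paper's exact construction may differ in detail):

```python
import numpy as np

def euclidean_distance_map(record):
    """Map a record of first-order statistics to second-order
    statistics: the pairwise Euclidean distances between features.
    The upper triangle of the distance map is returned as a flat
    vector and can serve as the correlation-based feature set."""
    x = np.asarray(record, dtype=float)
    # Distance map: |x_i - x_j| for every feature pair (i, j).
    edm = np.abs(x[:, None] - x[None, :])
    iu = np.triu_indices(len(x), k=1)
    return edm[iu]

# Example: a record with 4 first-order statistics yields
# 4 * 3 / 2 = 6 second-order features.
features = euclidean_distance_map([0.2, 0.8, 0.5, 0.1])
```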

    An intrusion detection system based on polynomial feature correlation analysis

    © 2017 IEEE. This paper proposes an anomaly-based Intrusion Detection System (IDS), which flags anomalous network traffic with a distance-based classifier. A polynomial approach was designed and applied in this work to extract hidden correlations from traffic-related statistics in order to provide distinguishing features for detection. The proposed IDS was evaluated using the well-known KDD Cup 99 data set. Evaluation results show that the proposed system achieved better detection rates on the KDD Cup 99 data set than two other state-of-the-art detection schemes. Moreover, the computational complexity of the system is analysed in this paper and shown to be similar to that of the two state-of-the-art schemes.
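The combination described above can be sketched as follows. This is an illustrative variant, not the paper's implementation: degree-2 polynomial terms stand in for the polynomial correlation features, and a nearest-centroid rule stands in for the distance-based classifier:

```python
import numpy as np

def polynomial_features(X, degree=2):
    """Augment each record with pairwise products so correlations
    between raw traffic statistics become explicit features.
    (Only degree 2 is sketched here.)"""
    X = np.asarray(X, dtype=float)
    i, j = np.triu_indices(X.shape[1])
    return np.hstack([X, X[:, i] * X[:, j]])

def nearest_centroid_predict(X_train, y_train, X_test):
    """Distance-based classifier: label each test record with the
    class whose training centroid is closest in feature space."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0)
                          for c in classes])
    d = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]

# Toy traffic statistics: two "normal" and two "attack" records.
X = polynomial_features(np.array([[1.0, 0.1], [0.9, 0.2],
                                  [5.0, 3.0], [5.2, 2.8]]))
y = np.array([0, 0, 1, 1])
pred = nearest_centroid_predict(X, y, X)
```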

    Data Imputation through the Identification of Local Anomalies

    We introduce a comprehensive statistical framework in a model-free setting for the complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this framework, we propose (i) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions in a given suspicious data instance and (ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As a generalization of the Euclidean distance, we also propose a novel distance measure, which is based on the ranked deviations among the data attributes and is empirically shown to be superior in separating corruptions. Our algorithm first splits the suspicious instance into parts through a binary partitioning tree in the space of data attributes and iteratively tests those parts to detect local anomalies, using nominal statistics extracted from an uncorrupted (clean) reference data set. Once each part is labeled as anomalous vs. normal, the binary patterns over this tree that characterize corruptions are identified and the affected attributes are imputed. Under a certain conditional independence structure assumed for the binary patterns, we analytically show that the false alarm rate of the introduced algorithm in detecting the corruptions is independent of the data and can be set directly without any parameter tuning. The proposed framework is tested over several well-known machine learning data sets with synthetically generated corruptions and is experimentally shown to produce remarkable improvements for classification purposes, with strong corruption-separation capabilities. Our experiments also indicate that the proposed algorithms outperform typical approaches and are robust to varying training-phase conditions.
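Why a ranked-deviation distance helps with localized corruption can be shown with a small sketch. This is an illustrative variant under an assumed weighting (keep the smallest half of the sorted per-attribute deviations); the paper's exact measure may weight the ranked deviations differently:

```python
import numpy as np

def ranked_deviation_distance(x, y, keep=0.5):
    """Distance from the sorted per-attribute deviations: keep only
    the smallest fraction `keep`, so a few grossly corrupted
    attributes cannot dominate the comparison."""
    dev = np.sort(np.abs(np.asarray(x, float) - np.asarray(y, float)))
    k = max(1, int(keep * dev.size))
    return dev[:k].sum()

clean = np.zeros(10)
corrupted = np.zeros(10)
corrupted[:2] = 100.0          # two locally corrupted attributes
d_rank = ranked_deviation_distance(clean, corrupted)
d_eucl = float(np.linalg.norm(clean - corrupted))
```

Here the Euclidean distance explodes because of the two corrupted attributes, while the ranked variant still reports the two instances as close, which is what makes it useful for separating corruptions from genuine differences.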

    Recurrence networks - A novel paradigm for nonlinear time series analysis

    This paper presents a new approach for analysing the structural properties of time series from complex systems. Starting from the concept of recurrences in phase space, the recurrence matrix of a time series is interpreted as the adjacency matrix of an associated complex network that links different points in time if the evolution of the considered states is very similar. A critical comparison of these recurrence networks with similar existing techniques is presented, revealing strong conceptual benefits of the new approach, which can be considered a unifying framework for transforming time series into complex networks that also includes other methods as special cases. It is demonstrated that there are fundamental relationships between the topological properties of recurrence networks and the statistical properties of the phase space density of the underlying dynamical system. Hence, the network description yields new quantitative characteristics of the dynamical complexity of a time series, which substantially complement existing measures of recurrence quantification analysis.
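The recurrence-matrix-as-adjacency-matrix construction is simple enough to sketch directly. A minimal illustration for a scalar series (a full treatment would embed the series in phase space first; the threshold value is arbitrary here):

```python
import numpy as np

def recurrence_network(series, eps):
    """Adjacency matrix of a recurrence network: two points in time
    are linked if their states are closer than `eps`. Self-loops
    (the main diagonal) are removed."""
    x = np.asarray(series, dtype=float)
    R = (np.abs(x[:, None] - x[None, :]) < eps).astype(int)
    np.fill_diagonal(R, 0)
    return R

# A short periodic series: recurring states become linked nodes.
A = recurrence_network([0.0, 1.0, 0.0, 1.0, 0.0], eps=0.5)
degree = A.sum(axis=1)   # a first topological characteristic
```

Node degree here already carries dynamical information: states visited often have high degree, which is the kind of link between network topology and phase-space density the abstract refers to.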

    Applying Multivariate Geostatistics for Transit Ridership Modeling at the Bus Stop Level

    Travel demand models have been developed and refined over the years to account for a characteristic normally found in travel data: spatial autocorrelation. Another important feature of travel demand data is its multivariate nature. However, for public transportation demand there is a lack of multivariate spatial models that consider the scarce nature of travel data, which is generally expensive to collect and also needs an appropriate level of detail. Thus, the main aim of this study was to estimate the boarding variable along a bus line in the city of São Paulo, Brazil, by means of multivariate geostatistical modeling at the bus stop level. As a specific objective, we conducted a comparative analysis applying Universal Kriging, Ordinary Kriging and Ordinary Least Squares regression to the same travel demand variable. Based on goodness-of-fit measures, the results indicated that geostatistics is a competitive tool compared to classical modeling, with the multivariate interpolator Universal Kriging standing out. Three main contributions can therefore be highlighted: (1) the methodological advance of using a multivariate geostatistical approach, at the bus stop level, for public transportation demand modeling; (2) the benefits the models provide for land use and bus network planning; and (3) savings in field-survey resources for collecting travel data.
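The kriging step can be sketched in a few lines. This is a minimal ordinary-kriging predictor with an assumed linear variogram, using made-up stop coordinates and boarding counts; the study's actual variogram model and multivariate (Universal Kriging) setup are more involved:

```python
import numpy as np

def ordinary_kriging(coords, values, target, nugget=0.0, slope=1.0):
    """Ordinary kriging with a linear variogram gamma(h) = nugget +
    slope * h. Solves the standard kriging system with a Lagrange
    multiplier enforcing that the weights sum to one."""
    coords = np.asarray(coords, dtype=float)
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    n = len(values)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = nugget + slope * h       # variogram between samples
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = nugget + slope * np.linalg.norm(
        coords - np.asarray(target, dtype=float), axis=1)
    w = np.linalg.solve(A, b)[:n]        # kriging weights
    return float(w @ np.asarray(values, dtype=float))

# Hypothetical boardings at three bus stops (x, y in km);
# predict demand at a point near the first stop.
stops = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
boardings = [10.0, 20.0, 30.0]
estimate = ordinary_kriging(stops, boardings, target=(0.1, 0.1))
```

As expected, the estimate is pulled strongly toward the nearest stop, which is exactly the spatial-autocorrelation behaviour the abstract motivates.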

    Bayesian statistical analysis of ground-clutter for the relative calibration of dual polarization weather radars

    A new data processing methodology, based on the statistical analysis of ground-clutter echoes and aimed at investigating the stability of the weather radar's relative calibration, is presented. A Bayesian classification scheme is used to identify meteorological and/or ground-clutter echoes. The outcome is evaluated on a training dataset using statistical score indexes, through comparison with a deterministic clutter map. After discriminating the ground-clutter areas, we focus on the spatial analysis of robust and stable returns using an automated region-merging algorithm. The temporal series of the ground-clutter statistical parameters, extracted from the spatial analysis and expressed in terms of percentile and mean values, are used to estimate the relative clutter calibration and its uncertainty for both co-polar and differential reflectivity. The proposed methodology is applied to a dataset collected by a C-band weather radar in southern Italy.
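The final calibration step, tracking a percentile of stable clutter returns over time, can be illustrated with a toy sketch. The percentile choice, baseline value and data here are assumptions for illustration, not the paper's actual processing chain:

```python
import numpy as np

def relative_calibration(clutter_series, baseline, q=95):
    """Relative calibration drift estimated as the difference between
    the mean daily percentile of stable ground-clutter reflectivity
    and a baseline value; the spread of the daily percentiles gives
    the uncertainty."""
    daily = np.percentile(clutter_series, q, axis=1)  # one value/day
    return daily.mean() - baseline, daily.std(ddof=1)

# Toy data: five days of clutter reflectivity (dBZ) with a uniform
# +1 dB offset relative to the baseline percentile of 59 dBZ.
clutter = np.tile(np.linspace(40.0, 60.0, 101) + 1.0, (5, 1))
drift, sigma = relative_calibration(clutter, baseline=59.0)
```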

    Time is of the Essence: Machine Learning-based Intrusion Detection in Industrial Time Series Data

    The Industrial Internet of Things drastically increases the connectivity of devices in industrial applications. In addition to the benefits in efficiency, scalability and ease of use, this creates novel attack surfaces. Historically, industrial networks and protocols do not contain security mechanisms, such as authentication and encryption, that are made necessary by this development. Thus, industrial IT security is needed. In this work, emulated industrial network data is transformed into a time series and analysed with three different algorithms. The data contains labeled attacks, so the performance can be evaluated. Matrix Profiles perform well with almost no parameterisation needed. Seasonal Autoregressive Integrated Moving Average performs well in the presence of noise but requires parameterisation effort. Long Short-Term Memory-based neural networks deliver mediocre performance while requiring high training and parameterisation effort. Comment: Extended version of a publication in the 2018 IEEE International Conference on Data Mining Workshops (ICDMW).
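Why Matrix Profiles need almost no parameterisation can be seen from a naive implementation: the only parameter is the window length. The following is a brute-force sketch (production implementations such as STOMP/SCRIMP are far faster); the example series and window length are made up:

```python
import numpy as np

def matrix_profile(series, m):
    """Naive matrix profile: for each window of length m, the
    z-normalised Euclidean distance to its nearest other window.
    Windows containing an anomaly have no close match, so the
    profile peaks there."""
    x = np.asarray(series, dtype=float)
    W = np.lib.stride_tricks.sliding_window_view(x, m).astype(float)
    W = (W - W.mean(axis=1, keepdims=True)) / \
        (W.std(axis=1, keepdims=True) + 1e-12)
    D = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)   # exclude the trivial self-match
    return D.min(axis=1)

# A repeating pattern with one injected anomaly (the 9).
sig = [0, 1, 0, 1, 0, 1, 9, 1, 0, 1, 0, 1]
mp = matrix_profile(sig, m=4)
anomaly_window = int(np.argmax(mp))   # window overlapping the 9
```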