
    Techniques to Improve Stable Distribution Modeling of Network Traffic

    The stable distribution has been shown to model some aspects of network traffic more accurately than alternative distributions. In this work, we quantitatively examine the modeling performance of the stable distribution as envisioned in a statistical network cyber event detection system. We examine the flexibility and robustness of the stable distribution, extending previous work by comparing its performance against alternatives using three different public network traffic data sets with a mix of traffic rates and cyber events. After showing the stable distribution to be the most accurate overall for the examined scenarios, we use the Hellinger metric to investigate its ability to reduce modeling error when using small data windows and counting periods. For the selected case and metric, the stable model is compared to a Gaussian model and is shown to produce the best overall fit as well as the best (or at worst, equivalent) fit for all counting periods. Additionally, the best stable fit occurs at a counting period that is five times shorter than the best Gaussian case. These results imply that the stable distribution can provide a more robust and accurate model than Gaussian-based alternatives in statistical network anomaly detection implementations, while also facilitating faster system detection and response.
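
The abstract above uses the Hellinger metric to score how well a fitted model matches observed traffic counts. The following is a minimal sketch of that kind of comparison, not the authors' implementation: it bins samples, computes the probability mass a Gaussian model assigns to each bin, and reports the Hellinger distance between the two. The function names, the bin edges, and the choice of a Gaussian as the candidate model are all assumptions made for illustration; a stable-distribution fit would supply its own per-bin probabilities in place of `gaussian_bin_probs`.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of non-negative weights; each is
    normalised to sum to 1 before comparison. The result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    sp, sq = sum(p), sum(q)
    if sp <= 0 or sq <= 0:
        raise ValueError("each distribution needs positive total weight")
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))
    # Clamp to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - bc))


def histogram(samples, edges):
    """Count samples falling in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def gaussian_bin_probs(edges, mu, sigma):
    """Probability mass a Gaussian(mu, sigma) assigns to each bin."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return [cdf(b) - cdf(a) for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Hypothetical packet counts per counting period (illustrative only).
    counts_per_period = [12, 15, 14, 13, 40, 16, 15, 14, 13, 12, 15, 90]
    edges = [0, 10, 20, 30, 40, 50, 100]
    mu = sum(counts_per_period) / len(counts_per_period)
    var = sum((x - mu) ** 2 for x in counts_per_period) / len(counts_per_period)
    observed = histogram(counts_per_period, edges)
    model = gaussian_bin_probs(edges, mu, math.sqrt(var))
    print("Hellinger distance to Gaussian fit:", hellinger(observed, model))
```

A smaller distance indicates a closer fit, so repeating the calculation for each candidate model and counting period gives a comparison of the kind the abstract describes.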

    Traffic matrix estimation on a large IP backbone: a comparison on real data

    This paper considers the problem of estimating the point-to-point traffic matrix in an operational IP backbone. Unlike previous studies, which have used a partial traffic matrix or demands estimated from aggregated Netflow traces, we use a unique data set of complete traffic matrices from a global IP network measured over five-minute intervals. This allows us to analyze the data accurately on the time-scale of typical link-load measurements and to make a balanced evaluation of different traffic matrix estimation techniques. We describe the data collection infrastructure, present spatial and temporal demand distributions, investigate the stability of fan-out factors, and analyze the mean-variance relationships between demands. We perform a critical evaluation of existing and novel methods for traffic matrix estimation, including recursive fanout estimation, worst-case bounds, regularized estimation techniques, and methods that rely on mean-variance relationships. We discuss the weaknesses and strengths of the various methods, and highlight differences in the results for the European and American subnetworks.
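
The fan-out factors mentioned in the abstract above can be illustrated with a small sketch. It assumes the usual definition, in which the fan-out of a source is the fraction of its total outgoing traffic sent to each destination, and it shows why stable fan-outs are useful: fan-outs taken from one fully measured traffic matrix can be scaled by later per-source totals to approximate later demands. This is a deliberately simplified illustration, not the recursive fanout estimation method the paper evaluates, and the function names and example numbers are hypothetical.

```python
def fanouts(traffic_matrix):
    """Return per-source fan-out factors for a square demand matrix.

    traffic_matrix[i][j] is the demand from node i to node j. Each row of
    the result sums to 1 (or is all zeros if the source sends nothing).
    """
    result = []
    for row in traffic_matrix:
        total = sum(row)
        result.append([d / total if total > 0 else 0.0 for d in row])
    return result


def estimate_from_fanouts(fanout_matrix, source_totals):
    """Approximate demands by scaling each source's fan-outs by its total."""
    return [
        [f * total for f in row]
        for row, total in zip(fanout_matrix, source_totals)
    ]


if __name__ == "__main__":
    # Hypothetical demands for a three-node network (arbitrary units).
    measured = [
        [0, 30, 10],
        [20, 0, 20],
        [5, 15, 0],
    ]
    f = fanouts(measured)
    # Later interval: only the total traffic leaving each node is known.
    later_totals = [80, 10, 40]
    print(estimate_from_fanouts(f, later_totals))
```

The estimate is only as good as the assumption that fan-outs stay roughly constant between the two intervals, which is the property the abstract says the authors investigate.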

    More "normal" than normal: scaling distributions and complex systems

    One feature of many naturally occurring or engineered complex systems is tremendous variability in event sizes. To account for it, the behavior of these systems is often described using power law relationships or scaling distributions, which tend to be viewed as "exotic" because of their unusual properties (e.g., infinite moments). An alternative view is based on mathematical, statistical, and data-analytic arguments and suggests that scaling distributions should be viewed as "more normal than normal". In support of this latter view, which Mandelbrot has advocated for the last 40 years, we review in this paper some relevant results from probability theory and illustrate a powerful statistical approach for deciding whether the variability associated with observed event sizes is consistent with an underlying Gaussian-type (finite variance) or scaling-type (infinite variance) distribution. We contrast this approach with traditional model fitting techniques and discuss its implications for future modeling of complex systems.

    Towards machine learning applied to time series based network traffic forecasting

    This TFG (bachelor's thesis) explores some specific use cases of applying Machine Learning techniques to Software-Defined Networks, in particular to overlay protocols such as LISP, VXLAN, etc. The aim of this project is to implement a network traffic forecasting model using time series and to improve its performance with machine learning techniques, offering a better prediction based on outlier correction. The project was developed in the Computer Architecture Department (DAC) at the Universitat Politècnica de Catalunya (UPC). Time series modeling methodology is able to shape a trend and handle any existing outliers; however, it does not cover the impact of outliers on forecasting. To achieve more precision and better confidence intervals, the model combines outlier detection methodology with Artificial Neural Networks to quantify and predict outliers. A study is carried out on external data to find out whether there is an improvement and what its effect on the predictions is. Machine learning techniques such as Artificial Neural Networks have proven to improve on the current methodology for forecasting with time series modeling. Future work will be oriented toward creating an improved standard of this system, focused on generalizing the model.