
    Bayesian Dynamic Modeling and Monitoring of Network Flows

    In the context of a motivating study of dynamic network flow data on a large-scale e-commerce web site, we develop Bayesian models for on-line/sequential analysis for monitoring and adapting to changes reflected in node-node traffic. For large-scale networks, we customize core Bayesian time series analysis methods using dynamic generalized linear models (DGLMs). These are integrated into the context of multivariate networks using the concept of decouple/recouple that was recently introduced in multivariate time series. This method enables flexible dynamic modeling of flows on large-scale networks and exploitation of partial parallelization of analysis while maintaining coherence with an over-arching multivariate dynamic flow model. The approach is anchored in a case study of internet data, with flows of visitors to a commercial news web site defining a long time series of node-node counts on over 56,000 node pairs. Central questions include characterizing inherent stochasticity in traffic patterns, understanding node-node interactions, adapting to dynamic changes in flows, and allowing for sensitive monitoring to flag anomalies. The methodology of dynamic network DGLMs applies to many dynamic network flow studies. (34 pages, 24 figures)
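
    A minimal sketch of the kind of updating step the abstract describes: a local-level Poisson DGLM for a single node-pair count series, with conjugate gamma updating and a discount factor controlling adaptation. The discount value, priors, and 3-sigma alert rule are illustrative assumptions, not the paper's specification.

        import numpy as np

        def poisson_dglm_filter(counts, delta=0.95, a0=1.0, b0=1.0):
            """Forward filter for a local-level Poisson DGLM with
            conjugate gamma updating; smaller delta adapts faster."""
            a, b = a0, b0
            rates, alerts = [], []
            for y in counts:
                a, b = delta * a, delta * b          # discount step: widen the prior
                mean = a / b                         # negative-binomial forecast mean
                var = mean * (1.0 + 1.0 / b)         # and variance
                alerts.append(abs(y - mean) > 3.0 * np.sqrt(var))  # flag anomalies
                a, b = a + y, b + 1.0                # conjugate update with the count
                rates.append(a / b)
            return np.array(rates), np.array(alerts)

        # Monitor one simulated node-pair flow containing a level shift.
        rng = np.random.default_rng(0)
        y = rng.poisson(20, 200)
        y[150:] += 30
        rates, alerts = poisson_dglm_filter(y, delta=0.9)

    Decoupled filters like this one can run in parallel across node pairs, which is the high-level point of the decouple/recouple strategy; the recoupling through an over-arching multivariate flow model is beyond this sketch.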

    Packet-loss prediction model based on historical symbolic time-series forecasting

    University of Technology, Sydney. Faculty of Engineering and Information Technology. Rapid growth of Internet users and services has prompted researchers to contemplate smart models for supporting applications with the required Quality of Service (QoS). By prioritising Internet traffic and the core network more efficiently, QoS and Traffic Engineering (TE) functions can address performance issues related to emerging Internet applications. Consequently, software agents are expected to become key tools for the development of future software in distributed telecommunication environments. A major problem with current routing mechanisms is that they generate routing tables that do not reflect the real-time state of the network and ignore factors like local congestion. The uncertainty in making routing decisions may be reduced by using information extracted from a knowledge base of packet transmissions. Many parameters have an impact on routing decision-making, such as link transmission rate, data throughput, number of hops between two communicating peer end nodes, and time of day. There are also other performance parameters, such as delay, jitter and packet loss, which are decision factors for online QoS traffic routing. This thesis addresses the issue of defining a Data Mining (DM) model for packet switching in communications networks. In particular, the focus is on decision-making for smart routing management based on the knowledge provided by DM-informed agents. The main idea behind this work and related research projects is that time series of network performance parameters with periodical patterns can be used as anomaly and failure detectors in the network. This project finds frequent patterns in delay and jitter time series, which are useful for real-time packet-loss prediction. The thesis proposes two models for approximation of delay and jitter time series and prediction of packet-loss time series – namely the Historical Symbolic Delay Approximation Model (HDAX) and the Data Mining Model for Smart Routing in Communications Networks (NARGES). The models are evaluated using two kinds of datasets, generated with (i) the Distributed Internet Traffic Generator (D-ITG) and (ii) the OPNET Modeller (OPNET). The HDAX forecasting module approximates current delay and jitter values based on the previous values and trends of the corresponding delay and jitter time series. The prediction module, a Multilayer Perceptron (MLP), within the NARGES model uses the inputs obtained from HDAX; that is, the HDAX-forecasted delay and jitter values are used by NARGES to estimate the future packet-loss value. The contributions of this thesis are (i) a real-time Data Mining (DM) model called HDAX; (ii) a hybrid DM model called NARGES; (iii) model evaluation with D-ITG datasets; and (iv) model evaluation with OPNET datasets. NARGES and HDAX are evaluated with offline heterogeneous QoS traces, and the results are compared to an Autoregressive Moving Average (ARMA) model. HDAX shows better speed and accuracy than ARMA, and its forecasts are more strongly correlated with target values. NARGES demonstrates better correlation with target values and more accurate results than ARMA, but it is slower.
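
    A rough sketch of the two-stage idea under stated assumptions: an HDAX-like module that forecasts the next delay or jitter value by matching the latest trend-symbol pattern against history, feeding a small MLP (standing in for the NARGES prediction stage) that maps delay and jitter to packet loss. The three-symbol trend alphabet, the toy traces, and the linear loss target are all hypothetical, not the thesis's definitions.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        def symbolize(x, tol=0.05):
            """Encode step-to-step changes as trend symbols: 0=down, 1=flat, 2=up."""
            d = np.diff(x)
            return np.where(d > tol, 2, np.where(d < -tol, 0, 1))

        def hdax_like_forecast(x, w=4):
            """Match the latest w-symbol trend pattern against history and
            reuse the value that followed its most recent occurrence."""
            sym = symbolize(x)
            pattern = tuple(sym[-w:])
            for t in range(len(sym) - w, w - 1, -1):
                if tuple(sym[t - w:t]) == pattern:
                    return x[t + 1]
            return x[-1]                              # persistence fallback

        # Toy traces; a synthetic linear relation stands in for real loss.
        rng = np.random.default_rng(0)
        delay = rng.gamma(2.0, 10.0, 500)
        jitter = np.abs(np.diff(delay, prepend=delay[0]))
        loss = 0.01 * delay + 0.05 * jitter + rng.normal(0, 0.2, 500)

        X = np.column_stack([delay, jitter])
        mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
        mlp.fit(X[:-1], loss[1:])                     # next-step loss from current features

        d_hat = hdax_like_forecast(delay)
        j_hat = hdax_like_forecast(jitter)
        print(mlp.predict(np.array([[d_hat, j_hat]])))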

    Users' traffic on two-sided Internet platforms. Qualitative dynamics

    Internet platforms' traffic defines important characteristics of the platforms, such as pricing of services, advertisements, and speed of operations. One can estimate the traffic with traditional time series models like ARIMA, Holt-Winters, and functional and kernel regressions. When using these methods, we usually smooth out noise and various external effects in the data and obtain short-term predictions of the processes. However, these models do not necessarily help us understand the underlying mechanism and the tendencies of the processes. In this article, we discuss a dynamical-systems approach to the modeling, which is designed to discover the underlying mechanism and the qualitative properties of the system's phase portrait. We show how to reconstruct the governing differential equations from data. External effects are modeled as the system's parameters (initial conditions). Utilizing this approach, we construct models for the volume of users interacting through Internet platforms such as "Amazon.com", "Homes.mil" or "Wikipedia.org". We then perform qualitative analysis of the system's phase portrait and discuss the main characteristics of the platforms.
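
    A minimal sketch of reconstructing a governing equation from a traffic series, under simple assumptions: estimate dx/dt by finite differences and fit the right-hand side as a polynomial in x by least squares (a SINDy-flavored stand-in for the article's reconstruction procedure). The logistic trajectory is a toy example, not the article's data.

        import numpy as np

        def fit_ode_rhs(x, dt, degree=3):
            """Fit dx/dt ~ sum_k coef[k] * x**k from a sampled trajectory."""
            dxdt = np.gradient(x, dt)                        # numerical derivative
            library = np.column_stack([x ** k for k in range(degree + 1)])
            coef, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
            return coef

        # Toy user-volume dynamics: logistic growth x' = r x (1 - x/K).
        dt, r, K = 0.1, 0.8, 100.0
        t = np.arange(0, 30, dt)
        x = K / (1 + (K - 1) * np.exp(-r * t))               # x(0) = 1
        print(fit_ode_rhs(x, dt))   # expect ~ [0, r, -r/K, 0] up to error

    Once the right-hand side is recovered, fixed points and their stability give the qualitative phase-portrait analysis the abstract refers to.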

    Dependent SiZer: Goodness-of-Fit Tests for Time Series Models

    In this paper, we extend SiZer (SIgnificant ZERo crossing of the derivatives) to dependent data for the purpose of goodness-of-fit tests for time series models. Dependent SiZer compares the observed data with a specific null model being tested by adjusting the statistical inference using an assumed autocovariance function. This new approach uses a SiZer-type visualization to flag statistically significant differences between the data and a given null model. The power of this approach is demonstrated through examples of time series of Internet traffic data. It is seen that such time series can have even more burstiness than is predicted by the popular, long-range dependent, Fractional Gaussian Noise model.
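
    A rough Monte Carlo sketch of the dependent-SiZer idea: smooth the series at one bandwidth, estimate the slope of the smooth, and flag locations where it leaves an envelope simulated from an assumed null model. Here an AR(1) null stands in for the fractional Gaussian noise / autocovariance input used in the paper, and the bandwidth, quantile, and AR coefficient are illustrative.

        import numpy as np

        def smooth_slope(y, h):
            """Gaussian-kernel smooth at bandwidth h, then numerical slope."""
            t = np.arange(len(y))
            k = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
            k /= k.sum(axis=1, keepdims=True)
            return np.gradient(k @ y)

        def sizer_flags(y, h=10, n_sim=200, phi=0.5, q=0.99, seed=1):
            yz = (y - y.mean()) / y.std()            # standardize the data
            obs = smooth_slope(yz, h)
            rng = np.random.default_rng(seed)
            sims = []
            for _ in range(n_sim):
                e = rng.normal(size=len(y))
                z = np.zeros(len(y))
                for i in range(1, len(y)):
                    z[i] = phi * z[i - 1] + e[i]     # AR(1) draw from the null
                z *= np.sqrt(1 - phi ** 2)           # unit stationary variance
                sims.append(smooth_slope(z, h))
            env = np.quantile(np.abs(sims), q, axis=0)
            return np.abs(obs) > env                 # significant slope features

    Repeating this over a grid of bandwidths h gives the scale-space map that SiZer visualizes.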

    Performance Comparison of Four New ARIMA-ANN Prediction Models on Internet Traffic Data, Journal of Telecommunications and Information Technology, 2015, no. 1

    Prediction of Internet traffic time series data (TSD) is a challenging research problem owing to the complicated nature of TSD. In the literature, many hybrids of auto-regressive integrated moving average (ARIMA) and artificial neural network (ANN) models have been devised for TSD prediction. These hybrid models treat TSD as a combination of linear and non-linear components and apply some combination of ARIMA and ANN to obtain the predictions. Out of the many available hybrid ARIMA-ANN models, this paper investigates which of them is better suited to Internet traffic data. This suitability is studied for both one-step-ahead and multi-step-ahead prediction. For the purpose of the study, Internet traffic data is sampled every 30 and 60 minutes. Model performance is evaluated using the mean absolute error and mean square error measures. For one-step-ahead prediction with a forecast horizon of 10 points, and for three-step-ahead prediction with a forecast horizon of 12 points, the moving average filter based hybrid ARIMA-ANN model gave better forecast accuracy than the other compared models.
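
    A sketch of the winning architecture as the abstract describes it, with assumed orders and window sizes: a moving-average filter splits the series into a smooth part modeled by ARIMA and a residual part modeled by an ANN, and the two one-step forecasts are summed.

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from statsmodels.tsa.arima.model import ARIMA

        def hybrid_forecast(y, window=5, lags=4):
            """One-step-ahead forecast from a moving-average-filter hybrid:
            ARIMA on the smooth component, MLP on the residual component."""
            smooth = np.convolve(y, np.ones(window) / window, mode="same")
            resid = y - smooth

            # Linear part: ARIMA forecast of the smoothed series.
            lin = ARIMA(smooth, order=(1, 1, 1)).fit().forecast(steps=1)[0]

            # Non-linear part: lagged residuals -> next residual via a small MLP.
            X = np.column_stack([resid[i:len(resid) - lags + i]
                                 for i in range(lags)])
            mlp = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                               random_state=0)
            mlp.fit(X, resid[lags:])
            nonlin = mlp.predict(resid[-lags:].reshape(1, -1))[0]

            return lin + nonlin

    Multi-step prediction follows by feeding each combined forecast back in as the newest observation and repeating.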

    Machine learning based anomaly detection in release testing of 5G mobile networks

    The need for high-quality phone and internet connections, high-speed streaming, and reliable, uninterrupted traffic has grown with the advances the wireless communication world has witnessed since the start of 5G (fifth generation) networks. The amount of data generated not just every day but every second has made most of the traditional approaches and statistical methods previously used for data manipulation and modeling inefficient and unscalable. Machine learning (ML), and especially deep learning (DL) based models, achieve state-of-the-art results because of their ability to recognize complex patterns that even human experts cannot. Machine learning based anomaly detection is one of the current hot topics in both research and industry because of its practical applications in almost all domains. Anomaly detection is mainly used for two purposes. The first is to understand why anomalous behavior happens and, as a result, try to prevent it by solving the root cause of the problem. The other is likewise to understand why the anomalous behavior happens and to be ready to deal with it when it is predictable, such as increased traffic over weekends or during specific hours of the day. In this work, we apply anomaly detection to a univariate time series target, the block error rate (BLER). We experiment with different statistical approaches, classic supervised machine learning models, unsupervised machine learning models, and deep learning models, and benchmark the final results. The main goal is to select the model that best balances performance against resource use and to apply it in a multivariate time series context, where we can test the relationships between the different time series features and their influence on each other. In the final phase, the selected model is integrated and deployed as part of an automatic system that detects and flags anomalies in real time. The simple proposed deep learning model outperforms the other models in terms of accuracy-related metrics. We also emphasize the acceptable performance of the statistical approach, which enters the competition for best model due to its low training time and required computational resources.
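
    A compact sketch of a forecasting-based detector of the kind the thesis compares, with synthetic BLER values and an assumed MAD threshold standing in for the thesis's models and data: a small neural forecaster predicts each point from a sliding window, and large forecast errors are flagged as anomalies.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        # Synthetic BLER-like series with an injected anomaly burst.
        rng = np.random.default_rng(2)
        bler = 0.02 + 0.005 * rng.standard_normal(1000)
        bler[700:705] += 0.05

        # One-step-ahead forecaster over sliding windows of width w.
        w = 16
        X = np.lib.stride_tricks.sliding_window_view(bler[:-1], w)
        y = bler[w:]
        model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                             random_state=0)
        model.fit(X, y)

        # Flag points whose forecast error exceeds a robust MAD threshold.
        err = np.abs(model.predict(X) - y)
        med = np.median(err)
        thr = med + 5 * np.median(np.abs(err - med))
        anomalies = np.where(err > thr)[0] + w
        print(anomalies)

    A rolling z-score over the raw series is the kind of cheap statistical baseline the abstract credits with acceptable performance at a fraction of the training cost.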

    A Comparative Study of Path Performance Metrics Predictors

    Using quality-of-service (QoS) metrics for Internet traffic is expected to greatly improve the performance of many network-enabled applications, such as Voice-over-IP (VoIP) and video conferencing. However, it is not possible to constantly measure path performance metrics (PPMs) such as delay and throughput without interfering with the network. In this work, we focus on PPM measurement scalability by considering machine learning techniques to estimate predictive models from past PPM observations. Using real data collected from PlanetLab, we provide a comparison between three different predictors: AR(MA) models, Kalman filters, and support vector machines (SVMs). Some predictors use delay and throughput jointly to take advantage of the possible relationship between PPMs, while other predictors consider PPMs individually. Our current results illustrate that the best performing model is an individual SVM specific to each time series. Overall, delay can be predicted with very good accuracy, while accurate forecasting of throughput remains an open problem.
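
    A minimal per-series predictor in the spirit of the best performer reported, under assumptions: a support vector regressor over lagged delay values, trained and evaluated on a synthetic trace in place of the PlanetLab data; kernel and hyperparameters are illustrative.

        import numpy as np
        from sklearn.svm import SVR

        # Synthetic delay trace standing in for a PlanetLab path measurement.
        rng = np.random.default_rng(3)
        delay = 50 + 5 * np.sin(np.arange(1000) / 20) + rng.normal(0, 1, 1000)

        # Lagged observations -> next value, one model per time series.
        lags = 8
        X = np.lib.stride_tricks.sliding_window_view(delay[:-1], lags)
        y = delay[lags:]

        svr = SVR(kernel="rbf", C=10.0, epsilon=0.1)
        svr.fit(X[:800], y[:800])
        pred = svr.predict(X[800:])
        print(np.mean(np.abs(pred - y[800:])))       # out-of-sample MAE

    The same construction applied to a throughput trace would reproduce the individual-predictor setup the comparison favors; joint predictors simply stack both PPMs into the feature window.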