194 research outputs found
Influence of gross regional and industrial product ranks on data call connections.
A thesis submitted to the Faculty of Engineering,
University of the Witwatersrand, Johannesburg,
in fulfilment of the requirements for the degree
of
Doctor of Philosophy.

THIS STUDY identifies and evaluates factors that affect call connections in the South
African public data networks, modelling these factors to aid data network planning. The research
shows the relationship between the economic rank of each region served and the data
communication resources required for that region. Moreover, it shows the resources required
between regions.
THE THRUST of this thesis is that the volume of calls from a region can be estimated
from its economic rank, and more than 75% of the variation in the volume of calls between regions
can be explained using the ranks of the originating and terminating regions. To prove this, records
of more than four million calls are accumulated for all regions of the South African packet
switched data network. An appropriate filtering and aggregation method is developed.
EXISTING growth models including the gravity model are separately examined. Based
on probability and dimensional arguments, the Bell System growth model is selected. It is
revealed that the success of this model depends on one premise being satisfied: this model tacitly
and implicitly assumes that the originating and terminating calls are statistically independent.
RETURNING to the data network, it is found that the call connections (after filtering
and aggregation) display dependence of destination on origin. Reasons for the dependence are
discovered. Multiple linear regression reveals the nature of this dependence. Surprisingly,
distance is not a factor. The importance of regional ranks and an inter-regional indicator variable
are also discovered.
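The independence premise and its failure can be illustrated with a minimal sketch (not the thesis's own analysis): a chi-square test of independence applied to a hypothetical origin-destination matrix of aggregated call connections.

    # Sketch: testing whether the terminating region is statistically independent
    # of the originating region, using a chi-square test on an origin-destination
    # (O-D) matrix of aggregated call connections. The counts are illustrative only.
    import numpy as np
    from scipy.stats import chi2_contingency

    # Rows = originating regions, columns = terminating regions (hypothetical counts).
    od_matrix = np.array([
        [9100, 2300, 1100,  400],
        [2500, 4800,  600,  250],
        [1200,  550, 3100,  180],
        [ 450,  300,  200, 1500],
    ])

    chi2, p_value, dof, expected = chi2_contingency(od_matrix)
    print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p_value:.3g}")
    # A very small p-value indicates that destination depends on origin,
    # violating the independence premise behind the Bell System growth model.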
FINALLY, call volume from a node is shown to be directly linked with the weighted
Gross Regional and Industrial Product of the region. This quantity, in turn, is inversely related
to the rank of the region. Call connections are then modelled to be equal to the call connections
within the first-ranked region divided by the product of the originating region's rank and the
terminating region's rank. This simple and economical model explains 76% of the variations that
occur in call connections. It has proved its use by being included in the data transfer services
product-line report.
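Read literally, the model above is C(i, j) ~ C(1, 1) / (r_i * r_j), where r_i and r_j are the economic ranks of the originating and terminating regions and C(1, 1) is the call connections within the first-ranked region. A minimal sketch under that reading, with a hypothetical value for C(1, 1):

    # Sketch of the rank-based call-connection model described above: estimated
    # connections between regions i and j equal the connections within the
    # first-ranked region divided by the product of the two regions' ranks.
    # The value of c11 and the rank range below are hypothetical placeholders.

    def estimated_connections(rank_orig: int, rank_term: int, c11: float) -> float:
        """C(i, j) ~= C(1, 1) / (r_i * r_j)."""
        return c11 / (rank_orig * rank_term)

    c11 = 120_000  # call connections within the first-ranked region (assumed)
    for r_i in range(1, 4):
        for r_j in range(1, 4):
            print(r_i, r_j, round(estimated_connections(r_i, r_j, c11)))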
Modeling Tiered Pricing in the Internet Transit Market
ISPs are increasingly selling "tiered" contracts, which offer Internet
connectivity to wholesale customers in bundles, at rates based on the cost of
the links that the traffic in the bundle is traversing. Although providers have
already begun to implement and deploy tiered pricing contracts, little is known
about how such pricing affects ISPs and their customers. While contracts that
sell connectivity on finer granularities improve market efficiency, they are
also more costly for ISPs to implement and more difficult for customers to
understand. In this work we present two contributions: (1) we develop a novel
way of mapping traffic and topology data to a demand and cost model; and (2) we
fit this model on three large real-world networks: a European transit ISP, a
content distribution network, and an academic research network, and run
counterfactuals to evaluate the effects of different pricing strategies on both
the ISP profit and the consumer surplus. We highlight three core findings.
First, ISPs gain most of the profits with only three or four pricing tiers and
likely have little incentive to increase granularity of pricing even further.
Second, we show that consumer surplus follows closely, if not precisely, the
increases in ISP profit with more pricing tiers. Finally, the common ISP
practice of structuring tiered contracts according to the cost of carrying the
traffic flows (e.g., offering a discount for traffic that is local) can be
suboptimal; dividing contracts into only three or four tiers based on both
traffic demand and the cost of carrying it yields near-optimal profit
for the ISP.
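A toy illustration of the tiering idea (not the paper's demand and cost model): flows with hypothetical demand and per-unit cost are grouped into k tiers by cost quantiles, each tier is given a single price, and profit is compared across tier counts. All numbers, including the pricing margin, are assumptions.

    # Toy illustration (not the paper's model) of tiered pricing: flows are grouped
    # into k tiers by per-unit delivery cost, each tier gets one price, and profit
    # is the sum of (price - cost) * demand. Flow data and the 20% margin are assumed.
    import numpy as np

    rng = np.random.default_rng(0)
    demand = rng.lognormal(mean=3.0, sigma=1.0, size=1000)   # traffic volume per flow
    cost = rng.uniform(0.5, 5.0, size=1000)                  # per-unit delivery cost

    def profit_with_tiers(demand, cost, k, margin=1.2):
        edges = np.quantile(cost, np.linspace(0, 1, k + 1))
        tier = np.clip(np.searchsorted(edges, cost, side="right") - 1, 0, k - 1)
        profit = 0.0
        for t in range(k):
            mask = tier == t
            if not mask.any():
                continue
            price = cost[mask].mean() * margin   # one price per tier (assumed rule)
            profit += ((price - cost[mask]) * demand[mask]).sum()
        return profit

    for k in (1, 2, 3, 4, 8):
        print(k, round(profit_with_tiers(demand, cost, k)))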
AN INVESTIGATION INTO AN EXPERT SYSTEM FOR TELECOMMUNICATION NETWORK DESIGN
Many telephone companies, especially in Eastern Europe and the 'third world', are
developing new telephone networks. In such situations the network design engineer needs
computer based tools that not only supplement his own knowledge but also help him to cope
with situations where not all the information necessary for the design is available. Often
traditional network design tools are somewhat removed from the practical world for which
they were developed. They often ignore the significantly uncertain and statistical nature of the
input data. They use data taken from a fixed point in time to solve a time-variable problem,
and the cost formulae tend to be averages per line or port rather than specific to the case.
Indeed, data is often not available or just plainly unreliable. The engineer has to rely on
rules of thumb honed over many years of experience in designing networks and be able to
cope with missing data.
The complexity of telecommunication networks and the rarity of specialists in this area often
make the network design process very difficult for a company. It is therefore an important
area for the application of expert systems. Designs resulting from the use of expert systems
will have a measure of uncertainty in their solution, and adequate account must be taken of
the risk involved in implementing their design recommendations.
The thesis reviews the status of expert systems as used for telecommunication network
design. It further shows that such an expert system needs to reduce a large network problem
into its component parts, use different modules to solve them and then combine these results
to create a total solution. It shows how the various sub-division problems are integrated to
solve the general network design problem. This thesis further presents details of such an
expert system and the databases necessary for network design: three new algorithms are
invented for traffic analysis, node locations and network design and these produce results
that have close correlation with designs taken from BT Consultancy archives.
It was initially supposed that an efficient combination of existing techniques for dealing with uncertainty
within expert systems would suffice as the basis of the new system. It soon
became apparent, however, that to allow for the differing attributes of facts, rules and data
and the varying degrees of importance or rank within each area, a new and radically different
method would be needed.
Having investigated the existing uncertainty problem, it is believed that a new, more rational
method has been found. The work has involved the invention of the 'Uncertainty Window'
technique and its testing on various aspects of network design, including demand forecast,
network dimensioning, node and link system sizing, etc. using a selection of networks that
have been designed by BT Consultancy staff. From the results of the analysis, modifications
to the technique have been incorporated with the aim of optimising the heuristics and
procedures, so that the structure gives an accurate solution as early as possible.
The essence of the process is one of associating the uncertainty windows with their relevant
rules, data and facts, which results in providing the network designer with an insight into the
uncertainties that have helped produce the overall system design: it indicates which sources
of uncertainty and which assumptions were critical and so merit further investigation to improve
the confidence of the overall design. The windowing technique works by virtue of its
ability to retain the composition of the uncertainty and its associated values, assumptions, etc.,
and allows for better solutions to be attained.
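The abstract does not specify the internals of the Uncertainty Window technique; the following is a hypothetical sketch of the general idea only, attaching a value range together with its sources and assumptions to a design fact and carrying them through a calculation, so the origin of the uncertainty in a result stays visible to the designer.

    # Hypothetical sketch of the general idea only (not the thesis's actual
    # Uncertainty Window technique): each design fact carries a value range plus
    # the sources and assumptions behind it, and derived quantities keep that
    # record, so the designer can see which assumptions drove the result.
    from dataclasses import dataclass, field

    @dataclass
    class UncertaintyWindow:
        low: float
        high: float
        sources: list = field(default_factory=list)   # where the uncertainty came from

        def scale(self, factor: float, note: str) -> "UncertaintyWindow":
            return UncertaintyWindow(self.low * factor, self.high * factor,
                                     self.sources + [note])

    # Demand forecast with missing data: a rule of thumb widens the window.
    lines_forecast = UncertaintyWindow(800, 1200, ["regional survey, partial data"])
    traffic_per_line = 0.07  # Erlangs per line (assumed figure)
    offered_traffic = lines_forecast.scale(traffic_per_line,
                                           "assumed 0.07 E per line (rule of thumb)")
    print(offered_traffic.low, offered_traffic.high, offered_traffic.sources)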
More "normal" than normal: scaling distributions and complex systems
One feature of many naturally occurring or engineered complex systems is tremendous variability in event sizes. To account for it, the behavior of these systems is often described using power law relationships or scaling distributions, which tend to be viewed as "exotic" because of their unusual properties (e.g., infinite moments). An alternate view is based on mathematical, statistical, and data-analytic arguments and suggests that scaling distributions should be viewed as "more normal than normal". In support of this latter view, which has been advocated by Mandelbrot for the last 40 years, we review in this paper some relevant results from probability theory and illustrate a powerful statistical approach for deciding whether the variability associated with observed event sizes is consistent with an underlying Gaussian-type (finite variance) or scaling-type (infinite variance) distribution. We contrast this approach with traditional model fitting techniques and discuss its implications for future modeling of complex systems.
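A minimal sketch of one common diagnostic in this spirit (not necessarily the paper's exact procedure): estimating the tail index of observed event sizes with a simple Hill estimator and comparing a scaling-type sample against a Gaussian-type one. The samples here are synthetic.

    # Sketch of a common tail diagnostic (not necessarily the paper's exact method):
    # estimate the tail index of event sizes with a simple Hill estimator and compare
    # a scaling-type (Pareto) sample with a Gaussian-type sample. Data are synthetic.
    import numpy as np

    rng = np.random.default_rng(1)
    pareto_sample = rng.pareto(a=1.5, size=5000) + 1.0     # scaling-type: infinite variance for a < 2
    gauss_sample = np.abs(rng.normal(0.0, 1.0, size=5000)) # Gaussian-type: finite variance

    def hill_estimator(x, k=200):
        """Hill tail-index estimate from the k largest observations."""
        x = np.sort(x)[-k:]
        return 1.0 / np.mean(np.log(x / x[0]))

    print("Pareto tail index  ~", round(hill_estimator(pareto_sample), 2))  # roughly 1.5
    print("Gaussian tail index ~", round(hill_estimator(gauss_sample), 2))  # much larger (light tail)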
Machine Learning and Big Data Methodologies for Network Traffic Monitoring
Over the past 20 years, the Internet saw an exponential growth of traffic, users, services and applications. Currently, it is estimated that the Internet is used every day by more than 3.6 billion users, who generate 20 TB of traffic per second. Such a huge amount of data challenges network managers and analysts to understand how the network is performing, how users are accessing resources, how to properly control and manage the infrastructure, and how to detect possible threats. Along with mathematical, statistical, and set theory methodologies, machine learning and big data approaches have emerged to build systems that aim at automatically extracting information from the raw data that the network monitoring infrastructures offer.
In this thesis I will address different network monitoring solutions, evaluating several methodologies and scenarios. I will show how, following a common workflow, it is possible to exploit mathematical, statistical, set theory, and machine learning methodologies to extract meaningful information from the raw data. Particular attention will be given to machine learning and big data methodologies such as DBSCAN and the Apache Spark big data framework.
The results show that, despite being able to take advantage of mathematical, statistical, and set theory tools to characterize a problem, machine learning methodologies are very useful to discover hidden information about the raw data. Using the DBSCAN clustering algorithm, I will show how to use YouLighter, an unsupervised methodology to group caches serving YouTube traffic into edge-nodes, and later, by using the notion of Pattern Dissimilarity, how to identify changes in their usage over time. By using YouLighter over 10-month long traces, I will pinpoint sudden changes in the YouTube edge-nodes usage, changes that also impair the end users' Quality of Experience. I will also apply DBSCAN in the deployment of SeLINA, a self-tuning tool implemented in the Apache Spark big data framework to autonomously extract knowledge from network traffic measurements. By using SeLINA, I will show how to automatically detect the changes of the YouTube CDN previously highlighted by YouLighter.
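A minimal sketch of the clustering step with scikit-learn's DBSCAN; the features, their values, and the eps/min_samples parameters are illustrative assumptions, not YouLighter's actual configuration.

    # Sketch of grouping cache servers into clusters with DBSCAN (scikit-learn).
    # Features and parameters are illustrative, not YouLighter's actual pipeline.
    import numpy as np
    from sklearn.cluster import DBSCAN
    from sklearn.preprocessing import StandardScaler

    # Hypothetical per-cache features, e.g. median RTT (ms) and /24 prefix density.
    features = np.array([
        [12.0, 0.90], [13.1, 0.80], [11.5, 0.95],   # caches of one edge-node
        [48.0, 0.20], [50.5, 0.25], [47.2, 0.30],   # caches of another edge-node
        [95.0, 0.05],                               # isolated cache -> noise
    ])

    X = StandardScaler().fit_transform(features)
    labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(X)
    print(labels)   # same label = same edge-node, -1 = noise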
Along with these machine learning studies, I will show how to use mathematical and set theory methodologies to investigate the browsing habits of Internauts. By using a two-week dataset, I will show how, over this period, Internauts keep discovering new websites. Moreover, I will show that, by using only DNS information, it is hard to build a reliable user profile. Instead, by exploiting mathematical and statistical tools, I will show how to characterize Anycast-enabled CDNs (A-CDNs). I will show that A-CDNs are widely used for both stateless and stateful services, that A-CDNs are quite popular, as more than 50% of web users contact an A-CDN every day, and that stateful services can benefit from A-CDNs, since their paths are very stable over time, as demonstrated by the presence of only a few anomalies in their Round Trip Time.
Finally, I will conclude by showing how I used BGPStream, an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data. By using BGPStream in real-time mode, I will show how I detected a Multiple Origin AS (MOAS) event, and how I studied the propagation of the black-holing community, showing the effect of this community on the network. Then, by using BGPStream in historical mode and the Apache Spark big data framework over 16 years of data, I will show different results, such as the continuous growth of IPv4 prefixes and the growth of MOAS events over time.
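A minimal sketch of the MOAS-detection logic on (prefix, origin AS) observations; in practice such observations would come from BGP update or RIB data as provided by BGPStream, and the records shown here are hypothetical.

    # Sketch of Multiple Origin AS (MOAS) detection: a prefix is flagged when it is
    # announced by more than one origin AS within the observation window. The
    # (prefix, origin AS) pairs below are hypothetical; in practice they would be
    # extracted from BGP update/RIB data such as BGPStream provides.
    from collections import defaultdict

    observations = [
        ("203.0.113.0/24", 64500),
        ("203.0.113.0/24", 64500),
        ("198.51.100.0/24", 64501),
        ("203.0.113.0/24", 64511),   # second origin for the same prefix -> MOAS
    ]

    origins = defaultdict(set)
    for prefix, origin_asn in observations:
        origins[prefix].add(origin_asn)

    moas = {p: asns for p, asns in origins.items() if len(asns) > 1}
    print(moas)   # {'203.0.113.0/24': {64500, 64511}}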
All these studies have the aim of showing how monitoring is a fundamental task in different scenarios, and in particular of highlighting the importance of machine learning and big data methodologies.
STOCHASTIC MODELING AND TIME-TO-EVENT ANALYSIS OF VOIP TRAFFIC
Voice over IP (VoIP) systems are gaining increased popularity due to their cost effectiveness, ease of management, and enhanced features and capabilities. Both enterprises and carriers are deploying VoIP systems to replace their TDM-based legacy voice networks. However, the lack of engineering models for VoIP systems has been recognized by many researchers, especially for large-scale networks. The purpose of traffic engineering is to minimize call blocking probability and maximize resource utilization. The current traffic engineering models are inherited from the legacy PSTN world, and these models fall short of capturing the characteristics of new traffic patterns. The objective of this research is to develop a traffic engineering model for modern VoIP networks. We studied the traffic on a large-scale VoIP network and collected information on several billion calls. Our analysis shows that the traditional traffic engineering approach based on the Poisson call arrival process and exponential holding time fails to capture modern telecommunication systems accurately. We developed a new framework for modeling call arrivals as a non-homogeneous Poisson process, and we further enhanced the model by providing a Gaussian approximation for the case of heavy traffic conditions on large-scale networks. In the second phase of the research, we followed a new time-to-event survival analysis approach to model call holding time as a generalized gamma distribution, and we introduced a Call Cease Rate function to model the call durations. The modeling and statistical work of the Call Arrival model and the Call Holding Time model is constructed, verified and validated using hundreds of millions of real call records collected from an operational VoIP carrier network. The traffic data is a mixture of residential, business, and wireless traffic. Therefore, our proposed models can be applied to any modern telecommunication system. We also conducted sensitivity analysis of model parameters and performed statistical tests on the robustness of the models' assumptions.
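A minimal sketch (not the thesis's VSIM implementation) of the two modeling ingredients named above: simulating non-homogeneous Poisson call arrivals by thinning, and drawing call holding times from a generalized gamma distribution via scipy.stats.gengamma. The diurnal rate function and all parameter values are assumptions.

    # Sketch (not the thesis's code) of the two modeling ingredients: call arrivals
    # from a non-homogeneous Poisson process simulated by thinning, and holding
    # times drawn from a generalized gamma distribution. Parameters are illustrative.
    import numpy as np
    from scipy.stats import gengamma

    rng = np.random.default_rng(42)

    def rate(t_hours):
        """Assumed diurnal call-arrival rate (calls/hour) with an afternoon peak."""
        return 200 + 800 * np.exp(-((t_hours % 24 - 14) ** 2) / 8.0)

    def nhpp_arrivals(horizon_hours, rate_fn, rate_max):
        """Thinning: simulate a homogeneous process at rate_max and keep each
        point with probability rate(t) / rate_max."""
        arrivals, t = [], 0.0
        while True:
            t += rng.exponential(1.0 / rate_max)
            if t > horizon_hours:
                return np.array(arrivals)
            if rng.random() < rate_fn(t) / rate_max:
                arrivals.append(t)

    arrivals = nhpp_arrivals(24.0, rate, rate_max=1000.0)
    holding_times = gengamma.rvs(a=1.2, c=0.8, scale=180.0, size=arrivals.size)  # seconds
    print(len(arrivals), "calls; mean holding time", round(holding_times.mean(), 1), "s")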
We implemented the models in a new simulation-based traffic engineering system called VoIP Traffic Engineering Simulator (VSIM). Advanced statistical and stochastic techniques were used in building the VSIM system. The core of VSIM is a simulation system that consists of two different simulation engines: the NHPP parametric simulation engine and the non-parametric simulation engine. In addition, VSIM provides several subsystems for traffic data collection, processing, statistical modeling, model parameter estimation, graph generation, and traffic prediction. VSIM is capable of extracting traffic data from a live VoIP network, processing and storing the extracted information, and then feeding it into one of the simulation engines, which in turn provide resource optimization and quality of service reports.
Network configuration improvement and design aid using artificial intelligence
This dissertation investigates the development of new Global System for Mobile Communications (GSM) improvement algorithms used to solve the nondeterministic polynomial-time hard (NP-hard) problem of assigning cells to switches. The departure of this project from previous projects lies in the aspect of the GSM network being optimised. Most previous projects tried minimising the signalling load on the network. The main aim in this project is to reduce the operational expenditure as much as possible while still adhering to network element constraints. This is achieved by generating new network configurations with a reduced transmission cost. Since assigning cells to switches in cellular mobile networks is an NP-hard problem, exact methods cannot be used to solve it for real-size networks. In this context, heuristic approaches, evolutionary search algorithms and clustering techniques can, however, be used. This dissertation presents a comprehensive and comparative study of the above-mentioned categories of search techniques adopted specifically for GSM network improvement. The evolutionary search technique evaluated is a genetic algorithm (GA), while the unsupervised learning technique is a Gaussian mixture model (GMM). A number of custom-developed heuristic search techniques with differing goals were also experimented with. The implementation of these algorithms was tested in order to measure the quality of the solutions. Results obtained confirmed the ability of the search techniques to produce network configurations with a reduced operational expenditure while still adhering to network element constraints. The best results were found using the Gaussian mixture model, where savings of up to 17% were achieved. The heuristic searches produced promising results in the form of the characteristics they portray, for example, load-balancing. Due to the massive problem space and a suboptimal chromosome representation, the genetic algorithm struggled to find high-quality viable solutions. The objective of reducing network cost was achieved by performing cell-to-switch optimisation taking traffic distributions, transmission costs and network element constraints into account. These criteria cannot be divorced from each other since they are all interdependent; omitting any one of them will lead to inefficient and infeasible configurations. Results obtained further indicated that the search space consists of two components, namely traffic and transmission cost. When optimising, it is very important to consider both components simultaneously; if not, infeasible or suboptimal solutions are generated. It was also found that pre-processing has a major impact on the cluster-forming ability of the GMM. Depending on how the pre-processing technique is set up, it is possible to bias the cluster-formation process in such a way that either transmission cost savings or a reduction in inter base station controller/switching centre traffic volume is given preference. Two of the difficult questions to answer when performing network capacity expansions are where to install the remote base station controllers (BSCs) and how to alter the existing BSC boundaries to accommodate the new BSCs being introduced. Using the techniques developed in this dissertation, these questions can now be answered with confidence.
Dissertation (MEng), University of Pretoria, 2008. Electrical, Electronic and Computer Engineering.
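A minimal sketch of the clustering idea with a Gaussian mixture model, using scikit-learn's GaussianMixture; the cell features, their weighting, and the absence of capacity constraints are illustrative assumptions, not the dissertation's actual pre-processing or model.

    # Sketch of the clustering idea with a Gaussian mixture model (scikit-learn):
    # cells described by location and traffic are grouped, and each cluster is
    # mapped to a switch/BSC. Features and the lack of capacity constraints are
    # illustrative only, not the dissertation's actual pre-processing or model.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(7)
    n_cells, n_switches = 300, 4

    # Hypothetical cell features: x (km), y (km), offered traffic (Erlangs).
    xy = rng.uniform(0, 100, size=(n_cells, 2))
    traffic = rng.gamma(shape=2.0, scale=10.0, size=(n_cells, 1))
    features = np.hstack([xy, traffic])

    gmm = GaussianMixture(n_components=n_switches, covariance_type="full",
                          random_state=0).fit(features)
    assignment = gmm.predict(features)          # cell -> switch cluster
    for s in range(n_switches):
        print("switch", s, "cells:", int((assignment == s).sum()),
              "traffic:", round(float(traffic[assignment == s].sum()), 1), "E")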
Crowdsourced network measurements: Benefits and best practices
Network measurements are of high importance both for the operation of networks and for the design and evaluation of new management mechanisms. Therefore, several approaches exist for running network measurements, ranging from analyzing live traffic traces from campus or Internet Service Provider (ISP) networks to performing active measurements on distributed testbeds, e.g., PlanetLab, or involving volunteers. However, each method falls short, offering only a partial view of the network. For instance, the scope of passive traffic traces is limited to an ISP's network and customers' habits, whereas active measurements might be biased by the population or node location involved. To complement these techniques, we propose to use (commercial) crowdsourcing platforms for network measurements. They permit a controllable, diverse and realistic view of the Internet and provide better control than do measurements with voluntary participants. In this study, we compare crowdsourcing with traditional measurement techniques, describe possible pitfalls and limitations, and present best practices to overcome these issues. The contribution of this paper is a guideline for researchers to understand when and how to exploit crowdsourcing for network measurements.
Rethinking Routing and Peering in the era of Vertical Integration of Network Functions
Content providers typically control the digital content consumption services and are getting the most revenue by implementing an all-you-can-eat model via subscription or hyper-targeted advertisements. Revamping the existing Internet architecture and design, a vertical integration where a content provider and access ISP act as a unibody, in a sugarcane form, seems to be the recent trend. As this vertical integration trend is emerging in the ISP market, it is questionable whether the existing routing architecture will suffice in terms of sustainable economics, peering, and scalability. It is expected that current routing will need careful modifications and smart innovations to ensure effective and reliable end-to-end packet delivery. This involves new feature developments for handling traffic with reduced latency, tackling routing scalability issues in a more secure way, and offering new services at cheaper costs. Considering the fact that prices of DRAM or TCAM in legacy routers are not necessarily decreasing at the desired pace, cloud computing can be a great solution to manage the increasing computation and memory complexity of routing functions in a centralized manner with optimized expenses. Focusing on the attributes associated with existing routing cost models and by exploring a hybrid approach to SDN, we also compare recent trends in cloud pricing (for both storage and service) to evaluate whether it would be economically beneficial to integrate cloud services with legacy routing for improved cost-efficiency. In terms of peering, using the US as a case study, we show the overlaps between access ISPs and content providers to explore the viability of future peering between the new, emerging content-dominated sugarcane ISPs and the health of Internet economics. To this end, we introduce meta-peering, a term that encompasses automation efforts related to peering – from identifying a list of ISPs likely to peer, to injecting control-plane rules, to continuous monitoring and notifying of any violation – one of the many outcroppings of the vertical integration procedure, which could be offered to ISPs as a standalone service.
- …