
    On the performance of QUIC over wireless mesh networks

    The exponential growth in the adoption of mobile phones and the widespread availability of wireless networks have caused a paradigm shift in the way we access the Internet. They have not only eased access to the Internet, but also increased users’ appetite for responsive services. New protocols to speed up Internet applications have naturally emerged. The QUIC transport protocol is one prominent case. Initially developed by Google as an experiment, the protocol has already made phenomenal strides, thanks to its support in Google’s servers and the Chrome browser. Since QUIC is still a relatively new protocol, there is a lack of sufficient understanding of its behavior in real network scenarios, particularly in the case of wireless networks. In this paper we present a comprehensive study of the performance of QUIC in Wireless Mesh Networks (WMN). We perform a measurement campaign on a production WMN to compare the performance of QUIC against TCP when retrieving files from the Internet. Our results show that while QUIC outperforms TCP in wired networks, it exhibits significantly lower performance than TCP in the WMN. We investigate the reasons for this behavior and identify the root causes of the performance issues. We find that some design choices of QUIC may penalize the protocol in WiFi; in particular, we uncover sub-optimal interactions of QUIC with MAC-layer features such as frame aggregation. Finally, we implement and evaluate our solution and demonstrate up to a 28% increase in QUIC throughput. This work was supported by the Erasmus Mundus Joint Doctorate in Distributed Computing (EMJD-DC) program, the Spanish grant TIN2016-77836-C2-2-R, and the Generalitat de Catalunya through 2017-SGR-990. This research was conducted as part of a PhD thesis, which is available online at upcommons.upc.edu.
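    To make the comparison concrete, the sketch below times a single file retrieval over QUIC (HTTP/3) and over TCP (HTTP/1.1) with curl. This is only an illustration of the kind of measurement described above, not the paper's tooling: the URL is a placeholder, and the --http3 flag assumes a curl build compiled with HTTP/3 support.

```python
# Minimal sketch (not the paper's measurement tooling): time one file download
# over HTTP/3 (QUIC) and one over HTTP/1.1 (TCP) using curl. Assumes a curl
# build with HTTP/3 support and a reachable test URL; both are placeholders.
import subprocess
import time

TEST_URL = "https://example.com/testfile.bin"  # hypothetical test file

def timed_download(proto_args):
    """Run curl with the given protocol flags and return elapsed seconds."""
    start = time.monotonic()
    subprocess.run(
        ["curl", "--silent", "--output", "/dev/null", *proto_args, TEST_URL],
        check=True,
    )
    return time.monotonic() - start

if __name__ == "__main__":
    t_quic = timed_download(["--http3"])   # QUIC (HTTP/3-enabled curl only)
    t_tcp = timed_download(["--http1.1"])  # classic TCP path
    print(f"QUIC: {t_quic:.2f} s   TCP: {t_tcp:.2f} s")
```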

    An Overview on Application of Machine Learning Techniques in Optical Networks

    Today's telecommunication networks have become sources of enormous amounts of widely heterogeneous data. This information can be retrieved from network traffic traces, network alarms, signal quality indicators, users' behavioral data, etc. Advanced mathematical tools are required to extract meaningful information from these data and to take decisions pertaining to the proper functioning of the networks. Among these mathematical tools, Machine Learning (ML) is regarded as one of the most promising methodological approaches to perform network-data analysis and enable automated network self-configuration and fault management. The adoption of ML techniques in the field of optical communication networks is motivated by the unprecedented growth in network complexity faced by optical networks in the last few years. This increase in complexity is due to the introduction of a huge number of adjustable and interdependent system parameters (e.g., routing configurations, modulation format, symbol rate, coding schemes, etc.) enabled by the use of coherent transmission/reception technologies, advanced digital signal processing, and compensation of nonlinear effects in optical fiber propagation. In this paper we provide an overview of the application of ML to optical communications and networking. We classify and survey the relevant literature on the topic, and we also provide an introductory tutorial on ML for researchers and practitioners interested in this field. Although a good number of research papers have recently appeared, the application of ML to optical networks is still in its infancy; to stimulate further work in this area, we conclude the paper by proposing possible new research directions.
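    As a hedged illustration of the supervised-learning use cases surveyed in the paper, the sketch below trains a regressor to predict a signal-quality indicator from a few lightpath parameters. The feature set, the data, and the simple degradation model are synthetic placeholders rather than anything taken from the survey.

```python
# Illustrative sketch of a supervised-learning use case of the kind surveyed in
# the paper: predicting a signal-quality indicator (here an SNR-like value)
# from lightpath features. All features and data below are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
link_length_km = rng.uniform(50, 2000, n)        # total lightpath length
num_spans = np.ceil(link_length_km / 80)         # assume ~80 km spans
launch_power_dbm = rng.uniform(-2, 4, n)
modulation_order = rng.choice([2, 4, 16, 64], n)

# Toy ground truth: quality degrades with distance and modulation order.
snr_db = (30 - 0.01 * link_length_km - 0.5 * np.log2(modulation_order)
          + 0.3 * launch_power_dbm + rng.normal(0, 0.5, n))

X = np.column_stack([link_length_km, num_spans, launch_power_dbm, modulation_order])
X_train, X_test, y_train, y_test = train_test_split(X, snr_db, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("MAE on held-out lightpaths (dB):",
      mean_absolute_error(y_test, model.predict(X_test)))
```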

    What broke where for distributed and parallel applications — a whodunit story

    Detection, diagnosis, and mitigation of performance problems in today's large-scale distributed and parallel systems is a difficult task. These systems are composed of various complex software and hardware components, and when a performance or correctness problem occurs, developers struggle to understand its root cause and fix it in a timely manner. In my thesis, I address these three components of performance problems in computer systems. First, we focus on diagnosing performance problems in large-scale parallel applications running on supercomputers, and we developed techniques to localize the problem for root-cause analysis. Parallel applications, most of which are complex scientific simulations, can create up to millions of parallel tasks that run on different machines and communicate using the message-passing paradigm. We developed a highly scalable and accurate automated debugging tool called PRODOMETER, which first creates a logical progress dependency graph of the tasks to highlight how the problem spread through the system and manifested as a system-wide performance issue, then uses this graph to identify the task where the problem originated, and finally pinpoints the code region corresponding to the origin of the bug. Second, we developed a tool-chain that detects performance anomalies using machine-learning techniques and achieves a very low false-positive rate. Our input-aware performance anomaly detection system consists of a scalable data-collection framework that gathers performance-related metrics from code regions at different granularities, an offline model-creation and prediction-error characterization technique, and a threshold-based anomaly-detection engine for production runs. The system requires few training runs and can handle unknown inputs and parameter combinations by dynamically calibrating the anomaly-detection threshold according to the characteristics of the input data and of the models' prediction error. Third, we developed a performance-problem mitigation scheme for erasure-coded distributed storage systems. Repairing failed blocks in such systems takes a long time in network-constrained data centers, because the repair operation gathers data from multiple nodes into a single node and then performs a mathematical operation to reconstruct the missing part, severely congesting the links toward the destination where the newly recreated data is to be hosted. We proposed a novel distributed repair technique, called Partial-Parallel-Repair (PPR), that performs this reconstruction in parallel on multiple nodes, eliminates the network bottleneck, and as a result greatly speeds up the repair process. Fourth, we study how, for a class of applications, performance can be improved (or performance problems mitigated) by selectively approximating some of the computations. For many applications, the main computation happens inside a loop that can be logically divided into a few temporal segments, which we call phases. We found that while approximating the initial phases might severely degrade the quality of the results, approximating the computation of the later phases has very little impact on the final quality. Based on this observation, we developed an optimization framework that, for a given quality-loss budget, finds the best approximation setting for each phase of the execution.
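    A generic sketch of the threshold-based anomaly detection idea described above is given below. This is not the thesis tool-chain: a simplified model learned from known-good runs predicts a performance metric from an input characteristic, and the alert threshold is calibrated from the distribution of the model's prediction errors.

```python
# Generic sketch of threshold-based anomaly detection calibrated from
# prediction error, in the spirit of the tool-chain described above; this is
# not the thesis code, and the metric and model are simplified placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Known-good training runs: input size vs. measured runtime.
input_size = rng.uniform(1e3, 1e6, 200).reshape(-1, 1)
runtime = 2e-4 * input_size.ravel() + rng.normal(0.0, 5.0, 200)

model = LinearRegression().fit(input_size, runtime)
errors = runtime - model.predict(input_size)

# Calibrate the alert threshold from the prediction-error distribution.
threshold = errors.mean() + 3.0 * errors.std()

def is_anomalous(size, observed_runtime):
    """Flag a production run whose runtime exceeds the calibrated band."""
    expected = model.predict(np.array([[size]]))[0]
    return (observed_runtime - expected) > threshold

print(is_anomalous(5e5, 2.0e-4 * 5e5))  # nominal run -> not flagged
print(is_anomalous(5e5, 5.0e-4 * 5e5))  # slow run    -> flagged
```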

    Anomaly detection and classification in traffic flow data from fluctuations in the flow-density relationship

    We describe and validate a novel data-driven approach to the real-time detection and classification of traffic anomalies based on the identification of atypical fluctuations in the relationship between density and flow. For aggregated data under stationary conditions, flow and density are related by the fundamental diagram. However, high-resolution data obtained from modern sensor networks are generally non-stationary and disaggregated. Such data consequently show significant statistical fluctuations. These fluctuations are best described using a bivariate probability distribution in the density-flow plane. By applying kernel density estimation to high-volume data from the UK National Traffic Information Service (NTIS), we empirically construct these distributions for London's M25 motorway. Curves in the density-flow plane are then constructed, analogous to quantiles of univariate distributions. These curves quantitatively separate atypical fluctuations from typical traffic states. Although the algorithm identifies anomalies in general rather than specific events, we find that fluctuations outside the 95% probability curve correlate strongly with the spikes in travel time associated with significant congestion events. Moreover, the size of an excursion from the typical region provides a simple, real-time measure of the severity of detected anomalies. We validate the algorithm by benchmarking its ability to identify labelled events in historical NTIS data against some commonly used methods from the literature. Detection rate, time-to-detect and false alarm rate are used as metrics and found to be generally comparable except in situations where the speed distribution is bi-modal. In such situations, the new algorithm achieves a much lower false alarm rate without suffering significant degradation on the other metrics. This method has the additional advantage of being self-calibrating.
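    The following sketch illustrates the core idea on synthetic data (it is not the NTIS pipeline): a kernel density estimate of the joint (density, flow) distribution is built with scipy, and observations whose estimated probability density falls below the level enclosing roughly 95% of the mass are flagged as atypical.

```python
# Minimal sketch of the kernel density estimation idea (not the NTIS pipeline):
# estimate the joint density of (density, flow) observations and flag points
# lying outside the region containing ~95% of the probability mass.
# All data below are synthetic.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)

# Synthetic "typical" traffic: flow rises with density to a peak, then declines.
density = rng.uniform(5, 60, 5000)                                      # veh/km
flow = 20 * density * np.exp(-density / 40) + rng.normal(0, 30, 5000)   # veh/h

sample = np.vstack([density, flow])
kde = gaussian_kde(sample)

# The 5th percentile of the estimated density at the observed points
# approximates the level set enclosing ~95% of the probability mass.
levels = kde(sample)
cutoff = np.percentile(levels, 5)

def is_atypical(d, q):
    """True if the (density, flow) pair falls outside the ~95% region."""
    return kde(np.array([[d], [q]]))[0] < cutoff

print(is_atypical(30.0, 20 * 30 * np.exp(-30 / 40)))  # typical point -> False
print(is_atypical(30.0, 5.0))                         # abnormally low flow -> True
```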