
    Automated Anomaly Detection in Virtualized Services Using Deep Packet Inspection

    Virtualization technologies have proven to be important drivers for the fast and cost-efficient development and deployment of services. While the benefits are tremendous, many challenges arise when developing services for, or porting them to, virtualized infrastructure. Critical applications in particular, such as Virtualized Network Functions, must meet high requirements in terms of reliability and resilience. An important tool for meeting such requirements is detecting anomalous system components and recovering from the anomaly before it turns into a fault and subsequently into a failure visible to the client. Anomaly detection for virtualized services relies on collecting system metrics that represent the normal operation state of every component and allow machine learning algorithms to automatically build models of that state. This paper presents an approach for collecting service-layer metrics while treating services as black boxes. This allows service providers to implement anomaly detection on the application layer without modifying third-party software. Deep Packet Inspection is used to analyse the traffic of virtual machines on the hypervisor layer, producing both generic and protocol-specific communication metrics. An evaluation shows that the resulting metrics represent the normal operation state of an example Virtualized Network Function and are therefore a valuable contribution to automatic anomaly detection in virtualized services.

    DPI over commodity hardware: implementation of a scalable framework using FastFlow

    In recent years we have witnessed a large increase in the number of applications running on top of IP networks. Consequently, there is a growing need for very efficient monitoring solutions that can manage these high data rates and classify the type of traffic traveling over the network. For example, as far as network security is concerned, recent years have seen a shift from so-called "network-level" attacks, which target the network they are transported on (e.g. Denial of Service), to content-based threats, which exploit application vulnerabilities and require sophisticated levels of intelligence to be detected. For some of these threats, a software solution on the client side is no longer sufficient; some controls must also run on the network itself. To manage these kinds of scenarios, payload inspection is often required in order to correctly identify the application protocol and to process the data carried over it. This is why Deep Packet Inspection (DPI) technology has emerged in recent years. This kind of processing is in many cases implemented, at least in part, through dedicated hardware. However, full software solutions may often be more appealing because they are typically more economical and, in general, can react faster to protocol evolution and changes. Yet software solutions that run on general-purpose hardware generally do not exploit the underlying multiprocessor architecture and can only process incoming packets sequentially. Furthermore, many DPI research works found in the literature that do exploit multicore architectures are characterized by poor scalability, due to synchronization overhead and load imbalance among the cores used. In this thesis, we describe the design and implementation of a DPI framework capable of managing current network rates using commodity multicore hardware.
Our framework makes it possible to identify the protocol, to specify which kind of data to extract once it has been identified, and to specify how these data have to be processed. Unlike existing works, the framework has been designed according to structured parallel programming theory, thus completely hiding from the user the complexity of efficiently exploiting the underlying architecture. These concepts have then been applied using FastFlow, a library for structured parallel programming targeting both shared-memory and distributed-memory architectures.

    Independent comparison of popular DPI tools for traffic classification

    Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, whether commercial or open-source, rely on some sort of DPI for traffic classification. However, the actual performance of DPI is still unclear to the research community, since the lack of public datasets prevents the comparison and reproducibility of results. This paper presents a comprehensive comparison of 6 well-known DPI tools that are commonly used in the traffic classification literature. Our study includes 2 commercial products (PACE and NBAR) and 4 open-source tools (OpenDPI, L7-filter, nDPI, and Libprotoident). We studied their performance in various scenarios (including packet and flow truncation) and at different classification levels (application protocol, application, and web service). We carefully built a labeled dataset with more than 750 K flows, which contains traffic from popular applications. We used the Volunteer-Based System (VBS), developed at Aalborg University, to guarantee the correct labeling of the dataset. We released this dataset, including full packet payloads, to the research community. We believe this dataset could become a common benchmark for the comparison and validation of network traffic classifiers. Our results present PACE, a commercial tool, as the most accurate solution. Surprisingly, we find that some open-source tools, such as nDPI and Libprotoident, also achieve very high accuracy. Peer Reviewed. Postprint (author’s final draft).

    Internet traffic classification for high-performance and off-the-shelf systems

    Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones, 2013

    Classification and Analysis of Computer Network Traffic

    Optimizations and Cost Models for multi-core architectures: an approach based on parallel paradigms

    The trend in modern microprocessor architectures is clear: multi-core chips are here to stay, and researchers expect multiprocessors with 128 to 1024 cores on a chip within some years. Yet the software community is only slowly taking the path towards parallel programming: while some works target multi-cores, these are usually inherited from earlier tools for SMP architectures and rarely exploit specific characteristics of multi-cores. Most importantly, current tools have no facilities to guarantee performance or portability among architectures. Our research group was one of the first to propose the structured parallel programming approach to solve the problem of performance portability and predictability. This was successfully demonstrated years ago for distributed and shared memory multiprocessors, and we strongly believe that the same should be applied to multi-core architectures. The main problem with performance portability is that optimizations are effective only under specific conditions, making them dependent on both the specific program and the target architecture. For this reason, in current parallel programming (in general, but especially with multi-cores), optimizations usually follow a try-and-decide approach: each one must be implemented and tested on the specific parallel program to understand its benefits. If we want to take a step forward and really achieve some form of performance portability, we require some kind of prediction of the expected performance of a program. The concept of performance modeling is quite old in the world of parallel programming; yet, in recent years, this kind of research has seen little progress: cost models to describe multi-cores are missing, mainly because of the increasing complexity of microarchitectures and the poor knowledge of specific implementation details of current processors.
In the first part of this thesis we show that performance modeling is still a feasible path, by studying the Tilera TilePro64. The high number of on-chip cores in this processor (64) required several innovative solutions, such as a complex interconnection network and multiple memory interfaces per chip. Because of these features, the TilePro64 can be considered a preview of what to expect in future multi-core processors. The availability of a cycle-accurate simulator and extensive documentation allowed us to model the architecture, and in particular its memory subsystem, at the accuracy level required to compare optimizations. In the second part, focused on optimizations, we cover one of the most important issues of multi-core architectures: the memory subsystem. In this area multi-cores differ strongly in structure from off-chip parallel architectures, both SMP and NUMA, thus opening new opportunities. In detail, we investigate the problem of data distribution over the memory controllers in several commercial multi-cores, and the efficient use of the cache coherency mechanisms offered by the TilePro64 processor. Finally, using the performance model, we study different implementations, derived from the previous optimizations, of a simple test-case application. We are able to predict the best version using only profiled data from a sequential execution. The accuracy of the model has been verified by experimentally comparing the implementations on the real architecture, giving results accurate to within 1–2%.