Agile Principles Applied to a Complex Long Term Research Activity - The PERIMETER Approach
Agile software development is a group of software development methodologies that are based on similar principles, as defined in the Agile Manifesto. Agile software projects are characterized by iterative and incremental development, accommodation of changes and active customer participation.
The popularity of agile principles is steadily increasing. Their adopters report that this development process leads to higher software quality and customer satisfaction than traditional methods, along with more productive and motivated developers. While smaller developer teams have reported higher success rates than larger teams, agile principles can be, and have been, applied successfully to large-scale projects and distributed teams.
Despite these advantages, very few research activities apply agile principles in their development. Perhaps this is due to the nature of research projects, which usually span years rather than months, frequently involve experimental work, and consist of team members with varying levels of experience, often coming from different organizations, research groups and countries. This paper examines how agile principles can be adapted to suit one such long-term research activity: PERIMETER.
GT: Picking up the Truth from the Ground for Internet Traffic
Much of Internet traffic modeling, firewall, and intrusion detection research requires traces where some ground truth regarding application and protocol is associated with each packet or flow. This paper presents the design, development and experimental evaluation of gt, an open source software toolset for associating ground truth information with Internet traffic traces. By probing the monitored host's kernel to obtain information on active Internet sessions, gt gathers ground truth at the application level. Preliminary experimental results show that gt's effectiveness comes at little cost in terms of overhead on the hosting machines. Furthermore, when coupled with other packet inspection mechanisms, gt can derive ground truth not only in terms of applications (e.g., e-mail), but also in terms of protocols (e.g., SMTP vs. POP3).
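The abstract does not describe gt's internals beyond the kernel-probing idea. As a rough illustration of that data-collection step only, the hypothetical Python sketch below uses the psutil library to snapshot active TCP sessions and map each 5-tuple to the application that owns it; it is not gt's implementation, and the polling loop and function names are assumptions.

```python
# Hypothetical sketch of the idea behind gt: periodically snapshot the
# kernel's table of active TCP sessions and record which application owns
# each 5-tuple, so packet traces can later be labeled with ground truth.
# NOT gt's actual implementation; illustration only.
import time
import psutil

def snapshot_ground_truth():
    """Return {(local_ip, local_port, remote_ip, remote_port): app_name}."""
    truth = {}
    for conn in psutil.net_connections(kind="tcp"):
        if not conn.raddr or conn.pid is None:
            continue  # skip listening sockets and sessions without a known owner
        try:
            app = psutil.Process(conn.pid).name()
        except psutil.NoSuchProcess:
            continue  # process exited between the two kernel queries
        key = (conn.laddr.ip, conn.laddr.port, conn.raddr.ip, conn.raddr.port)
        truth[key] = app
    return truth

if __name__ == "__main__":
    while True:
        for flow, app in snapshot_ground_truth().items():
            print(flow, "->", app)
        time.sleep(1)  # poll fast enough to catch short-lived sessions
```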
Support Vector Machines for TCP Traffic Classification
Support Vector Machines (SVM) represent one of the most promising Machine Learning (ML) tools that can be applied to the problem of traffic classification in IP networks. In the case of SVMs, there are still open questions that need to be addressed before they can be generally applied to traffic classifiers. Having been designed essentially as techniques for binary classification, their generalization to multi-class problems is still under research. Furthermore, their performance is highly susceptible to the correct optimization of their working parameters. In this paper we describe an approach to traffic classification based on SVM. We apply one of the approaches to solving multi-class problems with SVMs to the task of statistical traffic classification, and describe a simple optimization algorithm that allows the classifier to perform correctly with as little training as a few hundred samples. The accuracy of the proposed classifier is then evaluated over three sets of traffic traces, coming from different topological points in the Internet. Although the results are relatively preliminary, they confirm that SVM-based classifiers can be very effective at discriminating traffic generated by different applications, even with reduced training set sizes.
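The abstract does not spell out which multi-class decomposition or flow features the paper uses. The minimal Python sketch below illustrates the general setup with scikit-learn, whose SVC resolves multi-class problems through a one-vs-one scheme of binary SVMs, one of the standard approaches; the feature vectors, labels, and parameter values are invented stand-ins, not the paper's data or settings.

```python
# A minimal sketch, not the paper's implementation: multi-class TCP flow
# classification with an SVM, assuming each flow is described by a small
# vector of statistical features (e.g., sizes of its first packets).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in training data: a few hundred labeled flows, matching the
# abstract's "reduced training set" setting (rows = flows, cols = stats).
X_train = rng.normal(size=(300, 5))
y_train = rng.choice(["http", "smtp", "p2p"], size=300)

# Scaling matters for RBF SVMs; C and gamma are the "working parameters"
# whose optimization the abstract says is critical (values are arbitrary).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma=0.1))
clf.fit(X_train, y_train)

X_new = rng.normal(size=(4, 5))
print(clf.predict(X_new))  # one predicted application label per flow
```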
Comparing Traffic Classifiers
This article is an editorial note submitted to CCR. It has NOT been peer reviewed. Authors take full responsibility for this article's technical content. Comments can be posted through CCR Online. Many reputable research groups have published several interesting papers on traffic classification, proposing mechanisms of different nature. However, it is our opinion that this community should now find an objective and scientific way of comparing results coming out of different groups. We see at least two hurdles before this can happen. A major issue is that we need to find ways to share full-payload data sets, or, if that does not prove to be feasible, at least anonymized traces with complete application-layer meta-data. A relatively minor issue is reaching an agreement on which metrics should be used to evaluate the performance of the classifiers. In this note we argue that these are two important issues that the community should address, and sketch a few solutions to foster the discussion on these topics.
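To make the metric-agreement problem concrete, the short sketch below contrasts two figures of merit commonly reported in traffic classification work, per-flow accuracy and per-byte accuracy; the same classifier can score very differently under each. The flows and numbers are invented for illustration, and the note does not endorse either metric specifically.

```python
# Hedged illustration of why the choice of metric matters: flow accuracy
# weights every flow equally, while byte accuracy weights flows by size.
def flow_accuracy(results):
    """results: list of (true_label, predicted_label, n_bytes) tuples."""
    correct = sum(1 for t, p, _ in results if t == p)
    return correct / len(results)

def byte_accuracy(results):
    correct = sum(b for t, p, b in results if t == p)
    return correct / sum(b for _, _, b in results)

results = [
    ("http", "http", 1_500),    # small flow, classified correctly
    ("p2p",  "http", 900_000),  # large flow, misclassified
    ("smtp", "smtp", 4_000),
]
print(f"flow accuracy: {flow_accuracy(results):.2f}")  # 0.67
print(f"byte accuracy: {byte_accuracy(results):.2f}")  # ~0.01
```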
Optimizing Statistical Classifiers of Network Traffic
Supervised statistical approaches for the classification of network traffic are quickly moving from research laboratories to advanced prototypes, which in turn will become actual products in the next few years. While the research on the classification algorithms themselves has made quite significant progress in the recent past, few papers have examined the problem of determining the optimum working parameters for statistical classifiers in a straightforward and foolproof way. Without such optimization, it becomes very difficult to put into practice any classification algorithm for network traffic, no matter how advanced it may be. In this paper we present a simple but effective procedure for the optimization of the working parameters of a statistical network traffic classifier. We put the optimization procedure into practice, and examine its effects when the classifier is run in very different scenarios, ranging from medium and large local area networks to Internet backbone links. Experimental results show not only that an automatic optimization procedure like the one presented in this paper is necessary for the classifier to work at its best, but they also shed some light on some of the properties of the classification algorithm that deserve further study.
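The paper's own optimization procedure is not described in the abstract. As a generic baseline for what such an automatic procedure does, the sketch below runs a cross-validated grid search over an SVM's C and gamma working parameters with scikit-learn; the data, the parameter grid, and the choice of an SVM classifier are all assumptions made for illustration.

```python
# Generic parameter-optimization baseline (cross-validated grid search),
# not the paper's algorithm: search C and gamma for an RBF SVM and keep
# the combination with the best cross-validated accuracy.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))                      # stand-in flow features
y = rng.choice(["http", "smtp", "p2p"], size=400)  # stand-in app labels

pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="rbf"))])
grid = {
    "svm__C": [0.1, 1, 10, 100],          # regularization strength
    "svm__gamma": [0.001, 0.01, 0.1, 1],  # RBF kernel width
}
search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 3))
```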