12 research outputs found

Extreme bandits

Author: Carpentier Alexandra
Valko Michal
Publication venue: HAL CCSD
Publication date: 07/12/2014

In many areas of medicine, security, and life sciences, we want to allocate limited resources to different sources in order to detect extreme values. In this paper, we study an efficient way to allocate these resources sequentially under limited feedback. While sequential design of experiments is well studied in bandit theory, the most commonly optimized property is the regret with respect to the maximum mean reward. However, in other problems such as network intrusion detection, we are interested in detecting the most extreme value output by the sources. Therefore, in our work we study extreme regret, which measures the efficiency of an algorithm compared to the oracle policy selecting the source with the heaviest tail. We propose the ExtremeHunter algorithm, provide its analysis, and evaluate it empirically on synthetic and real-world experiments.
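
The extreme regret mentioned in this abstract can be written down roughly as follows. This is a sketch under assumed notation that does not appear in the listing: $K$ sources, a horizon of $n$ pulls, $X_{k,t}$ for the sample of source $k$ at time $t$, and $I_t$ for the source the algorithm picks at time $t$.

```latex
% Expected extreme regret after n pulls (notation assumed for illustration):
% the best single source's expected maximum minus the algorithm's expected maximum.
\mathbb{E}[R_n]
  \;=\;
  \max_{1 \le k \le K} \mathbb{E}\Big[\max_{1 \le t \le n} X_{k,t}\Big]
  \;-\;
  \mathbb{E}\Big[\max_{1 \le t \le n} X_{I_t,t}\Big]
```

The first term corresponds to the oracle policy that always samples the source with the heaviest tail; the second is what the sequential allocation actually collects.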

HAL - Lille 3

INRIA a CCSD electronic archive server

Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges

Author: Al-Fuqaha Ala
Arif Hunain
Elkhatib Yehia
Hussain Amir
Qadir Junaid
Raza Aunn
Usama Muhammad
Yau Kok-lim Alvin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/05/2019

While machine learning and artificial intelligence have long been applied in networking research, the bulk of such work has focused on supervised learning. Recently, there has been a rising trend of employing unsupervised machine learning using unstructured raw network data to improve network performance and provide services such as traffic engineering, anomaly detection, Internet traffic classification, and quality of service optimization. The interest in applying unsupervised learning techniques in networking emerges from their great success in other fields such as computer vision, natural language processing, speech recognition, and optimal control (e.g., for developing autonomous self-driving cars). Unsupervised learning is interesting since it can free us from the need for labeled data and manual handcrafted feature engineering, thereby facilitating flexible, general, and automated methods of machine learning. The focus of this survey paper is to provide an overview of the applications of unsupervised learning in the domain of networking. We provide a comprehensive survey highlighting the recent advancements in unsupervised learning techniques and describe their applications in various learning tasks in the context of networking. We also provide a discussion on future directions and open research issues, while also identifying potential pitfalls. While a few survey papers focusing on the applications of machine learning in networking have previously been published, a survey of similar scope and breadth is missing in the literature. Through this paper, we advance the state of knowledge by carefully synthesizing the insights from these survey papers while also providing contemporary coverage of recent advances.

Lancaster E-Prints

Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges

Author: Al-Fuqaha Ala
Arif Hunain
Elkhatib Yehia
Hussain Amir
Qadir Junaid
Raza Aunn
Usama Muhammad
Yau Kok-Lim Alvin
Publication date: 19/09/2017

While machine learning and artificial intelligence have long been applied in networking research, the bulk of such work has focused on supervised learning. Recently, there has been a rising trend of employing unsupervised machine learning using unstructured raw network data to improve network performance and provide services such as traffic engineering, anomaly detection, Internet traffic classification, and quality of service optimization. The interest in applying unsupervised learning techniques in networking emerges from their great success in other fields such as computer vision, natural language processing, speech recognition, and optimal control (e.g., for developing autonomous self-driving cars). Unsupervised learning is interesting since it can free us from the need for labeled data and manual handcrafted feature engineering, thereby facilitating flexible, general, and automated methods of machine learning. The focus of this survey paper is to provide an overview of the applications of unsupervised learning in the domain of networking. We provide a comprehensive survey highlighting the recent advancements in unsupervised learning techniques and describe their applications for various learning tasks in the context of networking. We also provide a discussion on future directions and open research issues, while also identifying potential pitfalls. While a few survey papers focusing on the applications of machine learning in networking have previously been published, a survey of similar scope and breadth is missing in the literature. Through this paper, we advance the state of knowledge by carefully synthesizing the insights from these survey papers while also providing contemporary coverage of recent advances.

arXiv.org e-Print Archive

Lancaster E-Prints

Improving Pan-African research and education networks through traffic engineering: A LISP/SDN approach

Author: Chavula Josiah
Publication venue: Faculty Science: ICTC4D
Publication date: 01/01/2017

The UbuntuNet Alliance, a consortium of National Research and Education Networks (NRENs), runs an exclusive data network for education and research in east and southern Africa. Despite a high degree of route redundancy in the Alliance's topology, a large portion of Internet traffic between the NRENs is circuitously routed through Europe. This thesis proposes a performance-based strategy for dynamic ranking of inter-NREN paths to reduce latencies. The thesis makes two contributions: firstly, mapping Africa's inter-NREN topology and quantifying the extent and impact of circuitous routing; and, secondly, a dynamic traffic engineering scheme based on Software Defined Networking (SDN), Locator/Identifier Separation Protocol (LISP) and Reinforcement Learning. To quantify the extent and impact of circuitous routing among Africa's NRENs, active topology discovery was conducted. Traceroute results showed that up to 75% of traffic from African sources to African NRENs went through inter-continental routes and experienced much higher latencies than traffic routed within Africa. An efficient mechanism for topology discovery was implemented by incorporating prior knowledge of overlapping paths to minimize redundancy during measurements. Evaluation of the network probing mechanism showed a 47% reduction in packets required to complete measurements. An interactive geospatial topology visualization tool was designed to evaluate how NREN stakeholders could identify routes between NRENs. Usability evaluation showed that users were able to identify routes with an accuracy level of 68%. NRENs face at least three problems in optimizing traffic engineering, namely: how to discover alternate end-to-end paths; how to measure and monitor performance of different paths; and how to reconfigure alternate end-to-end paths.
This work designed and evaluated a traffic engineering mechanism for dynamic discovery and configuration of alternate inter-NREN paths using SDN, LISP and Reinforcement Learning. A LISP/SDN based traffic engineering mechanism was designed to enable NRENs to dynamically rank alternate gateways. Emulation-based evaluation of the mechanism showed that dynamic path ranking was able to achieve 20% lower latencies compared to the default static path selection. SDN and Reinforcement Learning were used to enable dynamic packet forwarding in a multipath environment, through hop-by-hop ranking of alternate links based on latency and available bandwidth. The solution achieved minimum latencies with significant increases in aggregate throughput compared to static single path packet forwarding. Overall, this thesis provides evidence that integration of LISP, SDN and Reinforcement Learning, as well as ranking and dynamic configuration of paths, could help Africa's NRENs to minimise latencies and to achieve better throughputs.

Cape Town University OpenUCT

Large-Scale Networks: Algorithms, Complexity and Real Applications

Author: MAINARDI SIMONE
Publication venue: 'Pisa University Press'
Publication date: 21/05/2014

Networks have broad applicability to real-world systems, due to their ability to model and represent complex relationships. The discovery and forecasting of insightful patterns from networks are at the core of analytical intelligence in government, industry, and science. Discoveries and forecasts, especially from large-scale networks commonly available in the big-data era, strongly rely on fast and efficient network algorithms. Algorithms for dealing with large-scale networks are the first topic of research we focus on in this thesis. We design, theoretically analyze and implement efficient algorithms and parallel algorithms, rigorously proving their worst-case time and space complexities. Our main contributions in this area are novel, parallel algorithms to detect k-clique communities, special network groups which are widely used to understand complex phenomena. The proposed algorithms have a space complexity which is the square root of that of the current state-of-the-art. Time complexity achieved is optimal, since it is inversely proportional to the number of processing units available. Extensive experiments were conducted to confirm the efficiency of the proposed algorithms, even in comparison to the state-of-the-art. We experimentally measured a linear speedup, substantiating the optimal performances attained. The second focus of this thesis is the application of networks to discover insights from real-world systems. We introduce novel methodologies to capture cross correlations in evolving networks. We instantiate these methodologies to study the Internet, one of the most, if not the most, pervasive modern technological system. We investigate the dynamics of connectivity among Internet companies, those which interconnect to ensure global Internet access. We then combine connectivity dynamics with historical worldwide stock markets data, and produce graphical representations to visually identify high correlations. 
We find that geographically close Internet companies offering similar services are driven by common economic factors. We also provide evidence on the existence and nature of hidden factors governing the dynamics of Internet connectivity. Finally, we propose network models to effectively study the Internet Domain Name System (DNS) traffic, and leverage these models to obtain rankings of Internet domains as well as to identify malicious activities.

Electronic Thesis and Dissertation Archive - Università di Pisa

Improved Detection for Advanced Polymorphic Malware

Author: Fraley James B.
Publication venue: NSUWorks
Publication date: 01/01/2017

Malicious Software (malware) attacks across the internet are increasing at an alarming rate. Cyber-attacks have become increasingly sophisticated and targeted. These targeted attacks are aimed at compromising networks, stealing personal financial information and removing sensitive data or disrupting operations. Current malware detection approaches work well for previously known signatures. However, malware developers utilize techniques to mutate and change software properties (signatures) to avoid and evade detection. Polymorphic malware is practically undetectable with signature-based defensive technologies. Today’s effective detection rate for polymorphic malware ranges from 68.75% to 81.25%. New techniques are needed to improve malware detection rates. Improved detection of polymorphic malware can only be accomplished by extracting features beyond the signature realm. Targeted detection for polymorphic malware must rely upon extracting key features and characteristics for advanced analysis. Traditionally, malware researchers have relied on limited dimensional features such as behavior (dynamic) or source/execution code analysis (static). This study’s focus was to extract and evaluate a limited set of multidimensional topological data in order to improve detection for polymorphic malware. This study used multidimensional analysis (file properties, static and dynamic analysis) with machine learning algorithms to improve malware detection. This research demonstrated that improved polymorphic malware detection can be achieved with machine learning. This study conducted a number of experiments using a standard experimental testing protocol. This study utilized three advanced algorithms (Metabagging (MB), Instance Based k-Means (IBk) and Deep Learning Multi-Layer Perceptron) with a limited set of multidimensional data. The experiments delivered detection rates above 99.43%. In addition, the experiments delivered near zero false positives.
The study’s approach was based on single case experimental design, a well-accepted protocol for progressive testing. The study constructed a prototype to automate feature extraction, assemble files for analysis, and analyze results through multiple clustering algorithms. The study performed an evaluation of large malware sample datasets to understand effectiveness across a wide range of malware. The study developed an integrated framework which automated feature extraction for multidimensional analysis. The feature extraction framework consisted of four modules: 1) a pre-process module that extracts and generates topological features based on static analysis of machine code and file characteristics, 2) a behavioral analysis module that extracts behavioral characteristics based on file execution (dynamic analysis), 3) an input file construction and submission module, and 4) a machine learning module that employs various advanced algorithms. As with most studies, careful attention was paid to false positive and false negative rates, which reduce overall detection accuracy and effectiveness. This study provided a novel approach to expand the malware body of knowledge and improve the detection for polymorphic malware targeting Microsoft operating systems.

NSU Works

Online learning on the programmable dataplane

Author: Simpson Kyle Andrew
Publication date: 01/01/2022

This thesis makes the case for managing computer networks with data-driven methods (automated statistical inference and control based on measurement data and runtime observations) and argues for their tight integration with programmable dataplane hardware to make management decisions faster and from more precise data. Optimisation, defence, and measurement of networked infrastructure are each challenging tasks in their own right, which are currently dominated by the use of hand-crafted heuristic methods. These become harder to reason about and deploy as networks scale in rates and number of forwarding elements, and their design requires expert knowledge and care around unexpected protocol interactions. This makes tailored, per-deployment or per-workload solutions infeasible to develop. Recent advances in machine learning offer capable function approximation and closed-loop control which suit many of these tasks. New, programmable dataplane hardware enables more agility in the network: runtime reprogrammability, precise traffic measurement, and low-latency on-path processing. The synthesis of these two developments allows complex decisions to be made on previously unusable state, and made quicker by offloading inference to the network. To justify this argument, I advance the state of the art in data-driven defence of networks, novel dataplane-friendly online reinforcement learning algorithms, and in-network data reduction to allow classification of switch-scale data. Each requires co-design aware of the network, and of the failure modes of systems and carried traffic. To make online learning possible in the dataplane, I use fixed-point arithmetic and modify classical (non-neural) approaches to take advantage of the SmartNIC compute model and make use of rich device-local state.
I show that data-driven solutions still require great care to correctly design, but with the right domain expertise they can improve on pathological cases in DDoS defence, such as protecting legitimate UDP traffic. In-network aggregation to histograms is shown to enable accurate classification from fine temporal effects, and allows hosts to scale such classification to far larger flow counts and traffic volume. Moving reinforcement learning to the dataplane is shown to offer substantial benefits to state-action latency and online learning throughput versus host machines, allowing policies to react faster to fine-grained network events. The dataplane environment is key in making reactive online learning feasible. To port further algorithms and learnt functions, I collate and analyse the strengths of current and future hardware designs, as well as individual algorithms.
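
To give a feel for what "fixed-point arithmetic" means for a classical (non-neural) reinforcement learning update, here is a minimal sketch. It is not the thesis's algorithm: the Q16.16 number format, the helper names (`to_fixed`, `fx_mul`, `td_update`) and the plain temporal-difference update are all assumptions chosen for illustration.

```python
# Hypothetical sketch: a temporal-difference value update using only integer
# (fixed-point) arithmetic, as hardware without floating-point units might need.
# Q16.16 format is an assumption: 16 integer bits, 16 fractional bits.

FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16 (only used off-device, for setup)."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert Q16.16 back to a float, for inspection."""
    return q / ONE


def fx_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 numbers; the shift rescales the product."""
    return (a * b) >> FRAC_BITS


def td_update(q: int, reward: int, q_next: int, alpha: int, gamma: int) -> int:
    """One update q <- q + alpha * (reward + gamma * q_next - q), all in Q16.16."""
    target = reward + fx_mul(gamma, q_next)
    return q + fx_mul(alpha, target - q)


# Example: learning rate 0.5, discount 0.9, two updates with reward 1.0.
alpha, gamma = to_fixed(0.5), to_fixed(0.9)
q1 = td_update(0, to_fixed(1.0), 0, alpha, gamma)
q2 = td_update(q1, to_fixed(1.0), 0, alpha, gamma)
```

The only operations involved are integer add, multiply and shift, which is the property that makes this style of update plausible on constrained packet-processing hardware.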

Glasgow Theses Service

Secure Connectivity With Persistent Identities

Author: Varjonen Samu
Publication venue: 'University of Helsinki Libraries'
Publication date: 14/11/2012

In the current Internet, the Internet Protocol address is burdened with two roles. It serves as both the identifier and the locator for the host. As the host moves, its identity changes with its locator. The research community thinks that the Future Internet will include an identifier-locator split in some form. The identifier-locator split is seen as the solution to multiple problems. However, it also introduces multiple new problems to the Internet. In this dissertation we concentrate on: the feasibility of using the identifier-locator split with legacy applications, securing the resolution steps, using the persistent identity for access control, and improving mobility in environments using multiple address families, thereby improving the disruption tolerance of connectivity. The proposed methods achieve theoretical and practical improvements over the earlier state of the art. To raise overall awareness, our results have been published in interdisciplinary forums.

Helsingin yliopiston digitaalinen arkisto

Assessing the security benefits of defence in depth

Author: Algaith A.

Most modern computer systems are connected to the Internet. This brings many opportunities for revenue generation via e-commerce and information sharing, but also threats due to the exposure of these systems to malicious adversaries. Therefore, almost all organisations deploy security tools to improve overall detection capabilities. However, all security tools have limitations: they may fail to detect attacks, fail to uncover all vulnerabilities or generate alarms for non-malicious traffic or non-vulnerable code. Using terminology from signalling theory, we can state that security tools suffer from two types of failures: failure to correctly label a malicious event as malicious (False Negatives); and failure to correctly label a non-malicious event as non-malicious (False Positives). These failures may vary from one tool to another, since security tools are diverse in their weaknesses as well as their strengths. Therefore, an obvious design paradigm when deploying these defences is Diversity or Defence in Depth: the expectation is that employing multiple tools increases the chance of detecting malicious behaviour. This thesis presents research to assess the benefits (or harm) from using diversity. This thesis begins with a literature review on defence in depth, diversity and fault tolerance while identifying areas for further research. This review is followed by the presentation of the overall methodology that we have used to perform the diversity assessment for three types of defence tools, namely AntiVirus (AV) products, Intrusion Detection Systems (IDS) and Static Analysis Tools (SAT). The context of this project is inspired by the EPSRC D3S project in the Centre for Software Reliability (CSR) at City, University of London, as well as the previous work on diversity conducted at the same centre and elsewhere in the world.
This thesis presents the results using the well-known metrics for binary classifiers, Sensitivity and Specificity, and assesses the various forms of adjudication that may be used: 1-out-of-N (1ooN: raise an alarm as long as ANY of the defences do so), N-out-of-N (NooN: raise an alarm only if ALL the defences do so), majority voting (raise an alarm where a MAJORITY of the defences do so) or optimal adjudication (raise an alarm in such a way that it minimises the overall loss to the system from a failure). The first study compares the detection capabilities of nine different AV products. Additionally, for each vendor, the detection capabilities of the version of the product that is available for free in the VirusTotal platform are compared with the full capability version of that product that is available from the same vendor’s website. Counterintuitively, the free versions of AVs from VirusTotal performed better (in most cases) than the commercial versions from the same vendor. The second study compares the detection capabilities of IDS when deployed in a combined configuration. The functionally diverse combinations are shown to increase the true positive rate significantly while experiencing smaller increases in false positive rate. The third study analyses the improvements and deteriorations of using diverse SATs to detect web vulnerabilities. The largest improvements in sensitivity, with the least deterioration in specificity, were observed with the 1ooN configurations. In NooN configurations there is an improvement in specificity compared with individual systems, but a deterioration in sensitivity.
Finally, the benefits of “optimal adjudication” were also investigated: the result shows that the total loss that can result from the two types of failures considered (False Positives and False Negatives) can be significantly reduced with optimal adjudication configurations compared with more conventional methods of adjudication such as 1ooN, NooN or majority voting. In conclusion, using diverse security protection tools is shown to be beneficial to improving the detection capability of three different families of products, and optimal adjudication techniques can help balance the benefits of improved detection while lowering the false positive rates.
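
The 1ooN, NooN and majority-voting adjudication rules described in this abstract, together with sensitivity and specificity, can be sketched as follows. The three-tool verdicts and labels are made-up illustration data, not results from the thesis, and the function names are hypothetical; optimal adjudication is left out because it depends on a loss model the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple


def one_out_of_n(alarms: Sequence[bool]) -> bool:
    """1ooN: raise an alarm if ANY tool raises one."""
    return any(alarms)


def n_out_of_n(alarms: Sequence[bool]) -> bool:
    """NooN: raise an alarm only if ALL tools raise one."""
    return all(alarms)


def majority(alarms: Sequence[bool]) -> bool:
    """Majority voting: raise an alarm if more than half of the tools do."""
    return sum(alarms) * 2 > len(alarms)


def evaluate(rule: Callable[[Sequence[bool]], bool],
             votes: List[Sequence[bool]],
             labels: List[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of an adjudication rule."""
    decisions = [rule(v) for v in votes]
    tp = sum(d and y for d, y in zip(decisions, labels))
    fn = sum((not d) and y for d, y in zip(decisions, labels))
    tn = sum((not d) and (not y) for d, y in zip(decisions, labels))
    fp = sum(d and (not y) for d, y in zip(decisions, labels))
    return tp / (tp + fn), tn / (tn + fp)


# Made-up verdicts of three tools on six events (True = alarm raised).
votes = [
    (True, True, True),     # malicious
    (True, False, False),   # malicious
    (True, True, False),    # malicious
    (False, False, False),  # benign
    (False, True, False),   # benign
    (False, False, False),  # benign
]
labels = [True, True, True, False, False, False]  # True = actually malicious

results = {
    "1ooN": evaluate(one_out_of_n, votes, labels),
    "NooN": evaluate(n_out_of_n, votes, labels),
    "majority": evaluate(majority, votes, labels),
}
```

On this toy data 1ooN catches every malicious event at the cost of one false alarm, while NooN raises no false alarms but misses two of the three malicious events, which mirrors the sensitivity/specificity trade-off the abstract reports.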

City Research Online