Big Data for Traffic Monitoring and Management
The last two decades witnessed tremendous advances in Information and Communications Technologies. Besides improvements in computational power and storage capacity, communication networks nowadays carry an amount of data that was not envisaged only a few years ago. Together with their pervasiveness, network complexity increased at the same pace, leaving operators and researchers with few instruments to understand what happens in networks and, on the global scale, on the Internet. Fortunately, recent advances in data science and machine learning come to the rescue of network analysts and allow analyses with a level of complexity and spatial/temporal scope not possible only 10 years ago. In my thesis, I take the perspective of an Internet Service Provider (ISP) and illustrate the challenges and possibilities of analyzing the traffic coming from modern operational networks. I make use of big data and machine learning algorithms and apply them to datasets coming from passive measurements of ISP and university campus networks. The marriage between data science and network measurements is complicated by the complexity of machine learning algorithms and by the intrinsic multi-dimensionality and variability of this kind of data. As such, my work proposes and evaluates novel techniques, inspired by popular machine learning approaches but carefully tailored to operate with network traffic.
In this thesis, I first provide a thorough characterization of Internet traffic from 2013 to 2018. I show the most important trends in the composition of traffic and users' habits across the last five years, and describe how the network infrastructure of big Internet players changed in order to support faster and larger traffic. Then, I show the challenges in classifying network traffic, with particular attention to encryption and to the convergence of the Internet around a few big players. To overcome the limitations of classical approaches, I propose novel algorithms for traffic classification and management leveraging machine learning techniques and, in particular, big data approaches. Exploiting temporal correlation among network events, and benefiting from large datasets of operational traffic, my algorithms learn common traffic patterns of web services and use them for (i) traffic classification and (ii) fine-grained traffic management. My proposals are always validated in experimental environments and then deployed in real operational networks, from which I report the most interesting findings I obtain. I also focus on the Quality of Experience (QoE) of web users, as their satisfaction represents the final objective of computer networks. Again, I show that, using big data approaches, the network can achieve visibility into the quality of users' web browsing. In general, the algorithms I propose help ISPs gain a detailed view of the traffic that flows in their network, allowing fine-grained traffic classification and management, and real-time monitoring of users' QoE.
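To make the idea of learning traffic patterns for classification concrete, here is a deliberately minimal sketch of flow-based classification with a nearest-centroid model. The feature choices, service labels, and values are invented for illustration; they are not the thesis's actual algorithms or datasets.

```python
# Minimal sketch: classify a network flow by the service whose average
# ("centroid") feature vector it is closest to. Purely illustrative.
from statistics import mean

def centroid(flows):
    """Average each feature position across a list of flow-feature vectors."""
    return [mean(col) for col in zip(*flows)]

def train(labelled_flows):
    """labelled_flows: {service_name: [feature_vector, ...]}."""
    return {label: centroid(fs) for label, fs in labelled_flows.items()}

def classify(model, flow):
    """Assign the flow to the service with the closest centroid (squared L2)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(flow, c))
    return min(model, key=lambda label: dist(model[label]))

# Toy features: [mean packet size (bytes), mean inter-arrival time (ms)].
training = {
    "video": [[1400, 2.0], [1350, 1.5]],
    "web":   [[600, 40.0], [550, 55.0]],
}
model = train(training)
print(classify(model, [1380, 1.8]))  # → video
```

A production pipeline would of course use far richer features and models; the point is only the shape of the learn-then-classify loop the abstract describes.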
SUTMS - Unified Threat Management Framework for Home Networks
Home networks were initially designed for web browsing and non-business-critical applications. As infrastructure improved and internet broadband costs decreased, home internet usage shifted to e-commerce and business-critical applications. Today's home computers host personally identifiable information and financial data, and act as a bridge to corporate networks via remote access technologies like VPN. The expansion of remote work and the transition to cloud computing have broadened the attack surface for potential threats. Home networks have become an extension of critical networks and services, and hackers can gain access to corporate data by compromising devices attached to broadband routers. All these challenges underline the importance of home-based Unified Threat Management (UTM) systems. There is a need for a unified threat management framework developed specifically for home and small networks to address emerging security challenges. In this research, the proposed Smart Unified Threat Management (SUTMS) framework serves as a comprehensive solution for implementing home network security, incorporating firewall, anti-bot, intrusion detection, and anomaly detection engines into a unified system. SUTMS is able to provide 99.99% accuracy with 56.83% memory improvements. IPS stands out as the most resource-intensive UTM service; SUTMS successfully reduces the performance overhead of IDS by integrating it with the flow detection module. The artifact employs flow analysis to identify network anomalies and categorizes encrypted traffic according to its abnormalities. SUTMS can be scaled by introducing optional functions, i.e., routing and smart logging (utilizing Apriori algorithms). The research also tackles one of the limitations identified in SUTMS through the introduction of a second artifact called the Secure Centralized Management System (SCMS).
SCMS is a lightweight asset management platform with built-in security intelligence that can seamlessly integrate with a cloud for real-time updates.
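The smart-logging function is described as utilizing Apriori algorithms. As a hedged illustration of that building block, here is a minimal pure-Python Apriori that mines frequent itemsets from toy log events; the event names and support threshold are invented, not taken from SUTMS.

```python
# Minimal Apriori sketch: find sets of log events that co-occur in at least
# min_support log records. Illustrative only, not the SUTMS implementation.
def apriori(transactions, min_support):
    """Return {frozenset_of_items: support_count} for all frequent itemsets."""
    def support(c):
        return sum(c <= t for t in transactions)  # c <= t: subset test

    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    level = {c for c in items if support(c) >= min_support}
    k = 1
    while level:
        for c in level:
            frequent[c] = support(c)
        k += 1
        # Candidate generation: unions of frequent (k-1)-itemsets of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {c for c in candidates if support(c) >= min_support}
    return frequent

logs = [frozenset(t) for t in (
    {"dns", "tls", "http"},
    {"dns", "tls"},
    {"dns", "http"},
    {"tls", "http"},
)]
freq = apriori(logs, min_support=3)
print(sorted(",".join(sorted(s)) for s in freq))  # → ['dns', 'http', 'tls']
```

Here only the three single events clear the support threshold; with a larger threshold window, recurring event combinations would surface as candidate patterns for the smart log.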
Mining software repositories to determine the impact of team factors on the structural attributes of software
This thesis was submitted for the award of PhD and was awarded by Brunel University London. Software development is intrinsically a human activity, and the role of the development team has been established as among the most decisive of all project success factors. Prior research has proven empirically that team size and stability are linked to stakeholder satisfaction, team productivity, and fault-proneness. Team size is usually considered a measure of the number of developers that modify the source code of a project, while team stability is typically a function of the cumulative time that each team member has worked with their fellow team members. There is, however, limited research investigating the impact of these factors on software maintainability, a crucial aspect given that up to 80% of development budgets are consumed in the maintenance phase of the lifecycle. This research sheds light on how these aspects of team composition influence the structural attributes of the developed software that, in turn, drive the maintenance costs of software. This thesis asserts that new and broader insights can be gained by measuring these internal attributes of the software rather than the more traditional approach of measuring its external attributes. This can also enable practitioners to measure and monitor key indicators throughout the development lifecycle, taking remedial action where appropriate. Within this research, the GoogleCode open-source forge is mined and a sample of 1,480 Java projects is selected for further study. Using the Chidamber and Kemerer design metrics suite, the impact of development team size and stability on the internal structural attributes of software is isolated and quantified. Drawing on prior research correlating these internal attributes with external attributes, the impact on maintainability is deduced.
This research finds that those structural attributes that have been established to correlate with fault-proneness, namely coupling, cohesion, and modularity, show degradation as team sizes increase or team stability decreases. That degradation in the internal attributes of the software is associated with a deterioration in the sub-attributes of maintainability: changeability, understandability, testability, and stability.
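The Chidamber and Kemerer suite includes Coupling Between Objects (CBO), one of the coupling measures this kind of study tracks. As an illustrative sketch only, assuming a toy class-dependency map (the class names are invented, and this is not the thesis's repository-mining tooling):

```python
# Minimal CBO sketch: a class's CBO is the number of other classes it is
# coupled to, counting both directions (uses, or is used by). Illustrative.
def cbo(dependencies):
    """dependencies: {class_name: set of class names it uses}."""
    classes = set(dependencies) | {d for ds in dependencies.values() for d in ds}
    coupled = {c: set() for c in classes}
    for src, targets in dependencies.items():
        for dst in targets:
            if src != dst:
                coupled[src].add(dst)
                coupled[dst].add(src)
    return {c: len(peers) for c, peers in coupled.items()}

# Toy "uses" relationships, as might be extracted from Java source.
deps = {
    "OrderService": {"Repository", "Mailer"},
    "Repository":   {"Connection"},
    "Mailer":       set(),
}
print(sorted(cbo(deps).items()))
# → [('Connection', 1), ('Mailer', 1), ('OrderService', 2), ('Repository', 2)]
```

In a study like this one, such per-class values would be aggregated per project and then correlated with team-size and team-stability measures mined from the version history.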
Strengthening Privacy and Cybersecurity through Anonymization and Big Data
The abstract is in the attachment.
The Technological Emergence of AutoML: A Survey of Performant Software and Applications in the Context of Industry
In most technical fields, there exists a delay between fundamental academic
research and practical industrial uptake. Whilst some sciences have robust and
well-established processes for commercialisation, such as the pharmaceutical
practice of regimented drug trials, other fields face transitory periods in
which fundamental academic advancements diffuse gradually into the space of
commerce and industry. For the still relatively young field of
Automated/Autonomous Machine Learning (AutoML/AutonoML), that transitory period
is under way, spurred on by a burgeoning interest from broader society. Yet, to
date, little research has been undertaken to assess the current state of this
dissemination and its uptake. Thus, this review makes two primary contributions
to knowledge around this topic. Firstly, it provides the most up-to-date and
comprehensive survey of existing AutoML tools, both open-source and commercial.
Secondly, it motivates and outlines a framework for assessing whether an AutoML
solution designed for real-world application is 'performant'; this framework
extends beyond the limitations of typical academic criteria, considering a
variety of stakeholder needs and the human-computer interactions required to
service them. Thus, additionally supported by an extensive assessment and
comparison of academic and commercial case-studies, this review evaluates
mainstream engagement with AutoML in the early 2020s, identifying obstacles and
opportunities for accelerating future uptake.
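At their core, the AutoML tools surveyed automate a search over models and hyperparameters against a validation score. A deliberately minimal sketch of that loop follows; the scorer, search space, and data are invented for illustration and stand in for what real tools do at much larger scale.

```python
# Minimal AutoML-style loop: random search over a hyperparameter space,
# keeping the configuration with the best validation score. Illustrative.
import random

def evaluate(config, data):
    """Stand-in scorer: a one-threshold classifier; quality depends entirely
    on the 'threshold' hyperparameter being searched."""
    threshold = config["threshold"]
    correct = sum((x > threshold) == label for x, label in data)
    return correct / len(data)

def random_search(space, data, trials=50, seed=0):
    rng = random.Random(seed)
    best_cfg, best_score = None, -1.0
    for _ in range(trials):
        cfg = {"threshold": rng.uniform(*space["threshold"])}
        score = evaluate(cfg, data)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

data = [(0.1, False), (0.2, False), (0.7, True), (0.9, True)]
cfg, score = random_search({"threshold": (0.0, 1.0)}, data)
print(round(score, 2))  # → 1.0
```

Performant AutoML solutions, in the sense the review develops, wrap this loop with the concerns the framework assesses: data handling, compute budgets, deployment, and the stakeholder-facing interactions around them.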
Jornadas Nacionales de Investigación en Ciberseguridad: proceedings of the VIII Jornadas Nacionales de Investigación en Ciberseguridad, Vigo, 21-23 June 2023
Jornadas Nacionales de Investigación en Ciberseguridad (8th, 2023, Vigo). atlanTTic. AMTEGA: Agency for the Technological Modernisation of Galicia. INCIBE: National Cybersecurity Institute.