Anomaly Detection in Cloud-Native systems
In recent years, microservices have gained popularity due to benefits such as increased maintainability and scalability. The microservice architectural pattern was adopted for the development of a large-scale system that is commonly deployed on public and private clouds, and the aim is to ensure that it always maintains an optimal level of performance. Consequently, the system is monitored by collecting different metrics, including performance-related metrics.
The first part of this thesis focuses on the creation of a dataset of realistic time series with anomalies at deterministic locations. This dataset addresses the lack of labeled data for training supervised models and the absence of publicly available data; such data are not usually shared due to privacy concerns.
The second part consists of an empirical study on the detection of anomalies occurring in the different services that compose the system. Specifically, the aim is to understand if it is possible to predict the anomalies in order to perform actions before system failures or performance degradation. Consequently, eight different classification-based Machine Learning algorithms were compared by collecting accuracy, training time and testing time, to figure out which technique might be most suitable for reducing system overload.
The results showed that there are strong correlations between metrics and that it is possible to predict the anomalies in the system with approximately 90% accuracy. The most important outcome is that performance-related anomalies can be detected by monitoring a limited number of metrics collected at runtime, with a short training time. Future work includes the adoption of prediction-based approaches and the development of tools for the prediction of anomalies in cloud-native environments.
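The comparison described above (accuracy, training time and testing time per classifier) can be sketched as a small timing harness. This is only an illustration under stated assumptions: the two toy classifiers, the two metrics (CPU load and latency) and all data values are hypothetical and are not the eight algorithms or the dataset used in the thesis.

```python
import time

def _sq_dist(a, b):
    """Squared Euclidean distance between two metric vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def centroid_fit(X, y):
    """Toy classifier 1: store the mean metric vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def centroid_predict(model, X):
    return [min(model, key=lambda lab: _sq_dist(model[lab], row)) for row in X]

def one_nn_fit(X, y):
    """Toy classifier 2: 1-nearest-neighbour simply memorises the samples."""
    return list(zip(X, y))

def one_nn_predict(model, X):
    return [min(model, key=lambda item: _sq_dist(item[0], row))[1] for row in X]

def evaluate(name, fit, predict, X_train, y_train, X_test, y_test):
    """Collect the three quantities compared in the study for one model."""
    start = time.perf_counter()
    model = fit(X_train, y_train)
    train_s = time.perf_counter() - start
    start = time.perf_counter()
    predictions = predict(model, X_test)
    test_s = time.perf_counter() - start
    accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
    return {"name": name, "accuracy": accuracy, "train_s": train_s, "test_s": test_s}

# Hypothetical samples: [cpu_load, latency_ms]; label 1 marks an anomaly.
X_train = [[0.2, 100], [0.3, 120], [0.25, 110], [0.9, 900], [0.95, 950], [0.85, 870]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[0.22, 105], [0.92, 910]]
y_test = [0, 1]

results = [
    evaluate("nearest-centroid", centroid_fit, centroid_predict,
             X_train, y_train, X_test, y_test),
    evaluate("1-NN", one_nn_fit, one_nn_predict,
             X_train, y_train, X_test, y_test),
]
for r in results:
    print(r)
```

Timing training and prediction separately matters here because a model that is cheap to train can still be expensive to query, which is relevant when the goal is to limit overhead on the monitored system.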
Monitoring and analysis system for performance troubleshooting in data centers
It was not long ago. On Christmas Eve 2012, a war of troubleshooting began in Amazon data centers. It started at 12:24 PM with a mistaken deletion of the state data of the Amazon Elastic Load Balancing Service (ELB for short), which was not noticed at the time. The mistake first led to a local issue in which a small number of ELB service APIs were affected. In about six minutes, it evolved into a critical one in which EC2 customers were significantly affected. One example was Netflix, which was using hundreds of Amazon ELB services and experienced an extensive streaming service outage during which many customers could not watch TV shows or movies on Christmas Eve. It took Amazon engineers 5 hours 42 minutes to find the root cause, the mistaken deletion, and another 15 hours and 32 minutes to fully recover the ELB service. The war ended at 8:15 AM the next day and brought performance troubleshooting in data centers to the world's attention. As this Amazon ELB case shows, troubleshooting runtime performance issues is crucial in time-sensitive multi-tier cloud services because of their stringent end-to-end timing requirements, but it is also notoriously difficult and time-consuming.
To address the troubleshooting challenge, this dissertation proposes VScope, a flexible monitoring and analysis system for online troubleshooting in data centers.
VScope provides primitive operations which data center operators can use to troubleshoot various performance issues. Each operation is essentially a series of monitoring and analysis functions executed on an overlay network. We design a novel software architecture for VScope so that the overlay networks can be generated, executed and terminated automatically, on demand. From the troubleshooting side, we design novel anomaly detection algorithms and implement them in VScope. By running anomaly detection algorithms in VScope, data center operators are notified when performance anomalies happen. We also design a graph-based guidance approach, called VFocus, which tracks the interactions among hardware and software components in data centers. VFocus provides primitive operations by which operators can analyze the interactions to find out which components are relevant to the performance issue.
VScope’s capabilities and performance are evaluated on a testbed with over 1000 virtual machines (VMs). Experimental results show that the VScope runtime negligibly perturbs system and application performance, and requires mere seconds to deploy monitoring and analytics functions on over 1000 nodes. This demonstrates VScope’s ability to support fast operation and online queries against a comprehensive set of application to system/platform level metrics, and a variety of representative analytics functions. When supporting algorithms with high computation complexity, VScope serves as a ‘thin layer’ that occupies no more than 5% of their total latency. Further, by using VFocus, VScope can locate problematic VMs that cannot be found via solely application-level monitoring, and in one of the use cases explored in the dissertation, it operates with levels of perturbation of over 400% less than what is seen for brute-force and most sampling-based approaches. We also validate VFocus with real-world data center traces. The experimental results show that VFocus has troubleshooting accuracy of 83% on average.
Ph.D.
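The idea of following interactions among components to narrow down where to look can be illustrated with a breadth-first traversal over an interaction graph. This is a minimal sketch, not the VFocus algorithm: the function name `related_components`, the component names, and the choice to treat interactions as undirected are all assumptions made for the example.

```python
from collections import deque

def related_components(interactions, start, max_hops):
    """Return {component: hop distance} for everything within max_hops of start.

    interactions is a list of (a, b) pairs meaning "a and b exchanged traffic".
    Edges are treated as undirected, which is an assumption of this sketch.
    """
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the requested radius
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops

# Hypothetical interaction records for a multi-tier service.
interactions = [
    ("web-vm", "app-vm"),
    ("app-vm", "db-vm"),
    ("db-vm", "storage-host"),
    ("batch-vm", "storage-host"),
]

# Components within two hops of the VM where the slowdown was observed.
print(related_components(interactions, "web-vm", 2))
```

Restricting monitoring to the returned components, rather than to every node, is one way such guidance could reduce perturbation compared with brute-force collection.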
Mecanismos para controlo e gestão de redes 5G: redes de operador
In 5G networks, time-series data will be omnipresent for the monitoring of network
metrics. With the increase in the number of Internet of Things (IoT) devices
in the next years, it is expected that the number of real-time time-series
data streams increases at a fast pace. To be able to monitor those streams,
test and correlate different algorithms and metrics simultaneously and in a
seamless way, time-series forecasting is becoming essential for the pro-active
successful management of the network.
The objective of this dissertation is to design, implement and test a prediction
system in a communication network, that allows integrating various networks,
such as a vehicular network and a 4G operator network, to improve the network
reliability and Quality-of-Service (QoS). To do that, the dissertation has
three main goals: (1) the analysis of different network datasets and implementation
of different approaches to forecast network metrics, to test different
techniques; (2) the design and implementation of a real-time distributed
time-series forecasting architecture, to enable the network operator to make
predictions about the network metrics; and lastly, (3) to use the forecasting
models made previously and apply them to improve the network performance
using resource management policies.
The tests done with two different datasets, addressing the use cases of congestion
management and resource splitting in a network with a limited number
of resources, show that the network performance can be improved with proactive
management made by a real-time system able to predict the network
metrics and act on the network accordingly.
A study is also conducted on which network metrics can cause reduced accessibility
in 4G networks, so that the network operator can act more efficiently and
proactively to avoid such events.
Master's in Computer and Telematics Engineering
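The link between forecasting a network metric and acting on it before a problem occurs can be sketched with simple exponential smoothing and a capacity threshold. This is an illustrative sketch only: the dissertation does not state which forecasting models or policies it uses, and the function names, the `alpha` and `threshold` parameters, the action labels and the sample values are all hypothetical.

```python
def exp_smooth_forecast(series, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing."""
    if not series:
        raise ValueError("series must contain at least one observation")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level

def plan_action(series, capacity, threshold=0.8, alpha=0.5):
    """Return (forecast, action); act when the forecast nears capacity."""
    forecast = exp_smooth_forecast(series, alpha)
    if forecast >= threshold * capacity:
        return forecast, "rebalance"
    return forecast, "no_action"

# Hypothetical link utilisation samples in Mbit/s on a 100 Mbit/s link.
rising_load = [40, 60, 80, 100]
steady_load = [10, 12, 11, 13]

print(plan_action(rising_load, capacity=100))  # forecast 82.5, triggers "rebalance"
print(plan_action(steady_load, capacity=100))  # forecast 12.0, "no_action"
```

Simple exponential smoothing lags behind a steady trend, so a real deployment would likely need a trend-aware or learned model; the point here is only the forecast-then-act loop.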
A Business Intelligence Solution, based on a Big Data Architecture, for processing and analyzing the World Bank data
The rapid growth in data volume and complexity has necessitated the adoption of advanced technologies to extract valuable insights for decision-making. This project addresses this need by developing a comprehensive framework that combines Big Data processing, analytics, and visualization techniques to enable effective analysis of World Bank data. The problem addressed in this study is the need for a scalable and efficient Business Intelligence solution that can handle the vast amounts of data generated by the World Bank. Therefore, a Big Data architecture is implemented on a real use case for the International Bank for Reconstruction and Development. The findings of this project demonstrate the effectiveness of the proposed solution. Through the integration of Apache Spark and Apache Hive, data is processed using Extract, Transform and Load techniques, allowing for efficient data preparation. The use of Apache Kylin enables the construction of a multidimensional model, facilitating fast and interactive queries on the data. Moreover, data visualization techniques are employed to create intuitive and informative visual representations of the analysed data. The key conclusions drawn from this project highlight the advantages of a Big Data-driven Business Intelligence solution in processing and analysing World Bank data. The implemented framework showcases improved scalability, performance, and flexibility compared to traditional approaches. In conclusion, this bachelor thesis presents a Business Intelligence solution based on a Big Data architecture for processing and analysing the World Bank data. The project findings emphasize the importance of scalable and efficient data processing techniques, multidimensional modelling, and data visualization for deriving valuable insights. The application of these techniques contributes to the field by demonstrating the potential of Big Data Business Intelligence solutions in addressing the challenges associated with large-scale data analysis.
Classification algorithms for Big Data with applications in the urban security domain
A classification algorithm is a versatile tool that can serve as a predictor for the
future or as an analytical tool to understand the past. Several obstacles prevent
classification from scaling to a large Volume, Velocity, Variety or Value. The aim
of this thesis is to scale distributed classification algorithms beyond current limits,
assess the state-of-practice of Big Data machine learning frameworks and validate
the effectiveness of a data science process in improving urban safety.
We found that massive datasets with a number of large-domain categorical features
pose a difficult challenge for existing classification algorithms. We propose associative
classification as a possible answer, and develop several novel techniques to distribute
the training of an associative classifier among parallel workers and improve the final
quality of the model. The experiments, run on a real large-scale dataset with more
than 4 billion records, confirmed the quality of the approach.
To assess the state-of-practice of Big Data machine learning frameworks and
streamline the process of integration and fine-tuning of the building blocks, we
developed a generic, self-tuning tool to extract knowledge from network traffic
measurements. The result is a system that offers human-readable models of the data
with minimal user intervention, validated by experiments on large collections of
real-world passive network measurements.
A good portion of this dissertation is dedicated to the study of a data science
process to improve urban safety. First, we shed some light on the feasibility of a
system to monitor social messages from a city for emergency relief. We then propose
a methodology to mine temporal patterns in social issues, like crimes. Finally,
we propose a system to integrate the findings of Data Science on the citizenry’s
perception of safety and communicate its results to decision makers in a timely
manner. We applied and tested the system in a real Smart City scenario, set in Turin,
Italy.
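Distributing the training of an associative classifier among parallel workers can be illustrated by counting rule statistics per data partition and merging the counts before filtering rules by support and confidence. This is a deliberately simplified sketch restricted to single-item antecedents; it is not the thesis's technique, and the function names, feature names, labels and thresholds are hypothetical.

```python
from collections import Counter

def count_partition(rows):
    """Worker step: count items and (item, class) pairs in one partition.

    rows is a list of (features, label), where features maps a categorical
    feature name to its value.
    """
    item_counts, pair_counts = Counter(), Counter()
    for features, label in rows:
        for item in features.items():
            item_counts[item] += 1
            pair_counts[(item, label)] += 1
    return item_counts, pair_counts

def merge_and_mine(partials, min_support, min_confidence):
    """Driver step: sum the partial counts, then keep rules item -> class."""
    item_total, pair_total = Counter(), Counter()
    for item_counts, pair_counts in partials:
        item_total.update(item_counts)
        pair_total.update(pair_counts)

    rules = []
    for (item, label), support in pair_total.items():
        confidence = support / item_total[item]
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, label, support, confidence))
    # Highest confidence first, then highest support, then a stable tie-break.
    return sorted(rules, key=lambda r: (-r[3], -r[2], str(r[0])))

# Two hypothetical partitions, as if held by two workers.
partition_1 = [
    ({"district": "A", "hour": "night"}, "high"),
    ({"district": "A", "hour": "day"}, "high"),
]
partition_2 = [
    ({"district": "B", "hour": "day"}, "low"),
    ({"district": "A", "hour": "night"}, "high"),
    ({"district": "B", "hour": "night"}, "low"),
]

partials = [count_partition(partition_1), count_partition(partition_2)]
rules = merge_and_mine(partials, min_support=2, min_confidence=0.9)
for rule in rules:
    print(rule)
```

Because the counts are additive, the worker step can run independently on each partition and only small counters need to be sent to the driver, which is what makes this style of rule mining straightforward to parallelise.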
Design and Implementation of Information Warehouse for Manufacturing Facility Supporting Holistic Energy Management
Energy management is one of the most critical tasks to be performed in a manufacturing facility, since manufacturing consumes one third of the world’s energy. At the same time, manufacturing facilities are equipped with large numbers of field devices, which generate vast amounts of information every second. This real-time data has the potential to provide insight for energy management, but capturing, storing and processing such a volume is a challenge. In this thesis, an information warehouse system supporting holistic energy management is designed and implemented. The main goal is to provide a system that can capture, store and provide information relevant for energy management purposes in a manufacturing facility.
The thesis consists of three main parts. In the first part, the current concepts and technologies most relevant to energy management, including Big Data, NoSQL, Service Oriented Architecture and Complex Event Processing, are explored, analyzed and compared. In the second part, an architectural design of the information warehouse is presented, and a set of tools and technologies is selected for implementation. In the third part, an information warehouse system is implemented and tested in a manufacturing line test-bed.
The implemented information warehouse is based on a multi-layered architectural pattern, in which layers communicate with each other via services. The most important advantage of this modular architecture is the ability to use the implemented solution in any manufacturing facility, as modules can be easily reconfigured to adjust to a different context. The designed information warehouse system was tested on a manufacturing line located on the premises of Tampere University of Technology. The results of this thesis demonstrate that the developed information warehouse system is capable of collecting, processing and providing access to information crucial for energy management.
AI-native Interconnect Framework for Integration of Large Language Model Technologies in 6G Systems
The evolution towards 6G architecture promises a transformative shift in
communication networks, with artificial intelligence (AI) playing a pivotal
role. This paper delves deep into the seamless integration of Large Language
Models (LLMs) and Generalized Pretrained Transformers (GPT) within 6G systems.
Their ability to grasp intent, strategize, and execute intricate commands will
be pivotal in redefining network functionalities and interactions. Central to
this is the AI Interconnect framework, intricately woven to facilitate
AI-centric operations within the network. Building on the continuously evolving
current state-of-the-art, we present a new architectural perspective for the
upcoming generation of mobile networks. Here, LLMs and GPTs will
collaboratively take center stage alongside traditional pre-generative AI and
machine learning (ML) algorithms. This union promises a novel confluence of the
old and new, melding tried-and-tested methods with transformative AI
technologies. Along with providing a conceptual overview of this evolution, we
delve into the nuances of practical applications arising from such an
integration. Through this paper, we envisage a symbiotic integration where AI
becomes the cornerstone of the next-generation communication paradigm, offering
insights into the structural and functional facets of an AI-native 6G network.
Anomaly Detection in Time Series: Current Focus and Future Challenges
Anomaly detection in time series has become an increasingly vital task, with applications such as fraud detection and intrusion monitoring. Tackling this problem requires an array of approaches, including statistical analysis, machine learning, and deep learning. Various techniques have been proposed to cater to the complexity of this problem. However, there are still numerous challenges in the field concerning how best to process high-dimensional and complex data streams in real time. This chapter offers insight into the cutting-edge models for anomaly detection in time series. Several of the models are discussed and their advantages and disadvantages are explored. We also look at the research areas that are the current focus of the field and at how new models and techniques are being applied in them to address the problems posed by complex data, high-volume data streams, and the need for real-time processing. These research areas provide concrete examples of the applications of the discussed models. Lastly, we identify some of the current issues and suggest future directions for research concerning anomaly detection systems. We aim to provide readers with a comprehensive picture of existing work so that they can better understand the space and are prepared for further development within this growing field.
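As a concrete instance of the statistical family of approaches mentioned above, a rolling z-score detector flags points that deviate strongly from the recent past. This is a generic textbook-style sketch and not a model from the chapter; the function name, the `window` and `threshold` defaults, and the sample series are assumptions.

```python
from statistics import mean, stdev

def rolling_zscore_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` sample standard
    deviations from the mean of the preceding `window` observations."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: z-score undefined, skipped in this sketch
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical metric stream with one obvious spike at index 6.
stream = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
print(rolling_zscore_anomalies(stream))  # [6]
```

Because each point is scored using only earlier observations, the detector can run on a stream in real time. One known weakness, visible in the sample, is that a spike inflates the standard deviation of the following windows and can mask later anomalies.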