Automatic Detection of Mass Outages in Radio Access Networks
Fault management in mobile networks is required for detecting, analysing, and fixing problems appearing in the mobile network. When a large problem appears in the mobile network, multiple alarms are generated from the network elements. Traditionally, the Network Operations Center (NOC) processes the reported failures, creates trouble tickets for problems, and performs a root cause analysis. However, alarms do not reveal the root cause of the failure, and the correlation of alarms is often complicated to determine. If the network operator can correlate alarms and manage clustered groups of alarms instead of separate ones, it saves costs, preserves the availability of the mobile network, and improves the quality of service. Operators may have several electricity providers and the network topology is not correlated with the electricity topology. Additionally, network sites and other network elements are not evenly distributed across the network. Hence, we investigate the suitability of density-based clustering methods to detect mass outages and perform alarm correlation to reduce the number of created trouble tickets. This thesis focuses on assisting the root cause analysis and detecting correlated power and transmission failures in the mobile network. We implement a Mass Outage Detection Service and form a custom density-based algorithm. Our service performs alarm correlation and creates clusters of possible power and transmission mass outage alarms. We have filed a patent application based on the work done in this thesis. Our results show that we are able to detect mass outages in real time from the data streams. The results also show that detected clusters reduce the number of created trouble tickets and help reduce the costs of running the network. The number of trouble tickets decreases by 4.7-9.3% for the alarms we process in the service in the tested networks. When we consider only alarms included in the mass outage groups, the reduction is over 75%. Therefore, continuing to use, test, and develop the implemented Mass Outage Detection Service is beneficial for operators and automated NOCs.
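To illustrate the general idea of density-based alarm correlation described in this abstract, the following is a minimal DBSCAN-style sketch. It is not the thesis's custom algorithm, which the abstract does not specify. The `Alarm` fields, the function names, and the thresholds `eps_km`, `eps_s`, and `min_pts` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Alarm:
    # Hypothetical alarm record: site name, planar position, raise time.
    site: str
    x_km: float
    y_km: float
    t_s: float


def neighbours(alarms, i, eps_km, eps_s):
    """Indices of alarms close to alarm i in both space and time (includes i)."""
    a = alarms[i]
    return [
        j for j, b in enumerate(alarms)
        if hypot(a.x_km - b.x_km, a.y_km - b.y_km) <= eps_km
        and abs(a.t_s - b.t_s) <= eps_s
    ]


def cluster_alarms(alarms, eps_km=5.0, eps_s=300.0, min_pts=3):
    """Return one label per alarm: a cluster id >= 0, or -1 for noise."""
    UNSEEN, NOISE = None, -1
    labels = [UNSEEN] * len(alarms)
    cluster = -1
    for i in range(len(alarms)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(alarms, i, eps_km, eps_s)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nearby = neighbours(alarms, j, eps_km, eps_s)
            if len(nearby) >= min_pts:  # core point: keep expanding
                queue.extend(nearby)
    return labels


def ticket_count(labels):
    """One ticket per cluster plus one per unclustered alarm."""
    clusters = {label for label in labels if label >= 0}
    return len(clusters) + sum(1 for label in labels if label == -1)
```

Under these assumptions, alarms that share a cluster label would be handled as one mass-outage group with a single trouble ticket, while noise alarms keep individual tickets.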
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Artificial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, that covers inter alia
rule based systems, model-based systems, case based reasoning, pattern matching,
clustering and feature extraction, artificial neural networks, genetic algorithms,
artificial immune systems, agent based systems, data mining and a variety of hybrid
approaches. The report then considers the central issue of event correlation, that
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple data stream environment is explored, and a range of techniques
is considered, covering model based approaches, 'programmed' AI and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule based approaches,
but that these suffer from a number of drawbacks, such as the difficulty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to 'learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated.
ALACA: A platform for dynamic alarm collection and alert notification in network management systems
Mobile network operators run Operations Support Systems that produce vast amounts of alarm events. These events can have different significance levels and domains and can also trigger other events. Network operators face the challenge of identifying the significance and root causes of these system problems in real time and of keeping the number of remedial actions at an optimal level, so that customer satisfaction rates can be guaranteed at a reasonable cost. In this paper, we propose a scalable streaming alarm management system, referred to as Alarm Collector and Analyzer, that includes complex event processing and root cause analysis. We describe a rule mining and root cause analysis solution for alarm event correlation and analysis. The solution includes a dynamic index for matching active alarms, an algorithm for generating candidate alarm rules, a sliding window–based approach to save system resources, and a graph-based solution to identify root causes. Alarm Collector and Analyzer is used in the network operation center of a major mobile telecom provider. It helps operators to enhance the design of their alarm management systems by allowing continuous analysis of data and event streams, and to predict network behavior with respect to potential failures by using the results of root cause analysis. We present experimental results that provide insights into the performance of real-time alarm data analytics systems. Copyright © 2017 John Wiley & Sons, Ltd
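The sliding-window idea behind candidate alarm rule generation can be sketched as follows. This is an illustrative sketch only, not the Alarm Collector and Analyzer algorithm, whose details the abstract does not give. The function name, the event tuple layout, the alarm type strings, and the thresholds `window_s`, `min_support`, and `min_conf` are assumptions.

```python
from collections import Counter, deque


def candidate_rules(events, window_s=60.0, min_support=2, min_conf=0.6):
    """Mine candidate 'cause -> effect' alarm rules from a time-ordered stream.

    events: iterable of (timestamp_s, alarm_type), sorted by timestamp.
    Returns {(cause, effect): confidence}, where confidence is the fraction
    of cause occurrences followed by the effect within window_s seconds.
    """
    window = deque()        # entries: (timestamp, type, effects already credited)
    type_count = Counter()  # occurrences of each alarm type
    pair_count = Counter()  # cause occurrences followed by a given effect
    for t, kind in events:
        # Drop alarms that fell out of the window to bound memory use.
        while window and t - window[0][0] > window_s:
            window.popleft()
        type_count[kind] += 1
        for _, earlier_kind, credited in window:
            # Credit each earlier alarm at most once per effect type.
            if earlier_kind != kind and kind not in credited:
                credited.add(kind)
                pair_count[(earlier_kind, kind)] += 1
        window.append((t, kind, set()))
    rules = {}
    for (cause, effect), n in pair_count.items():
        confidence = n / type_count[cause]
        if n >= min_support and confidence >= min_conf:
            rules[(cause, effect)] = confidence
    return rules
```

Because expired alarms are discarded as the stream advances, memory stays proportional to the number of alarms inside one window rather than to the whole history, which is the resource-saving property the abstract attributes to a sliding window.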