20,367 research outputs found
NEMICO: Mining network data through cloud-based data mining techniques
Thanks to the rapid advances in Internet-based applications, data acquisition and storage technologies, petabyte-sized network data collections are becoming more and more common, thus prompting the need for scalable data analysis solutions. By leveraging today’s ubiquitous many-core computer architectures and the increasingly popular cloud computing paradigm, the applicability of data mining algorithms to these large volumes of network data can be scaled up to gain interesting insights. This paper proposes NEMICO, a comprehensive Big Data mining system targeted to network traffic flow analyses (e.g., traffic flow characterization, anomaly detection, multiplelevel pattern mining). NEMICO comprises new approaches that contribute to a paradigm-shift in distributed data mining by addressing most challenging issues related to Big Data, such as data sparsity, horizontal scaling, and parallel computation
Distributed Network Anomaly Detection on an Event Processing Framework
Network Intrusion Detection Systems (NIDS) are an integral part of modern data centres to ensure high availability and compliance with Service Level Agreements (SLAs). Currently, NIDS are deployed on high-performance, high-cost middleboxes that are responsible for monitoring a limited section of the network. The fast increasing size and aggregate throughput of modern data centre networks have come to challenge the current approach to anomaly detection to satisfy the fast growing compute demand. In this paper, we propose a novel approach to distributed intrusion detection systems based on the architecture of recently proposed event processing frameworks. We have designed and implemented a prototype system using Apache Storm to show the benefits of the proposed approach as well as the architectural differences with traditional systems. Our system distributes modules across the available devices within the network fabric and uses a centralised controller for orchestration, management and correlation. Following the Software Defined Networking (SDN) paradigm, the controller maintains a complete view of the network but distributes the processing logic for quick event processing while performing complex event correlation centrally. We have evaluated the proposed system using publicly available data centre traces and demonstrated that the system can scale with the network topology while providing high performance and minimal impact on packet latency
CONDOR: A Hybrid IDS to Offer Improved Intrusion Detection
Intrusion Detection Systems are an accepted and very
useful option to monitor, and detect malicious activities.
However, Intrusion Detection Systems have inherent limitations which lead to false positives and false negatives; we propose that combining signature and anomaly based IDSs should be examined. This paper contrasts signature and anomaly-based IDSs, and critiques some proposals about hybrid IDSs with signature and heuristic capabilities, before considering some of their contributions in order to include them as main features of a new hybrid IDS named CONDOR (COmbined Network intrusion Detection ORientate), which is designed to offer superior pattern analysis and anomaly detection by reducing false positive rates and administrator intervention
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Articial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, that covers inter alia
rule based systems, model-based systems, case based reasoning, pattern matching,
clustering and feature extraction, articial neural networks, genetic algorithms, arti
cial immune systems, agent based systems, data mining and a variety of hybrid
approaches. The report then considers the central issue of event correlation, that
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple data stream environment is explored, and a range of techniques,
covering model based approaches, `programmed' AI and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule based approaches,
but that these suffer from a number of drawbacks, such as the difculty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to `learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated
Fake View Analytics in Online Video Services
Online video-on-demand(VoD) services invariably maintain a view count for
each video they serve, and it has become an important currency for various
stakeholders, from viewers, to content owners, advertizers, and the online
service providers themselves. There is often significant financial incentive to
use a robot (or a botnet) to artificially create fake views. How can we detect
the fake views? Can we detect them (and stop them) using online algorithms as
they occur? What is the extent of fake views with current VoD service
providers? These are the questions we study in the paper. We develop some
algorithms and show that they are quite effective for this problem.Comment: 25 pages, 15 figure
- …