9 research outputs found
Cardinalities estimation under sliding time window by sharing HyperLogLog Counter
Cardinality estimation is an important research topic in network management
and security. How to solve this problem under a sliding time window is a hot
topic. HyperLogLog is a memory-efficient algorithm that works under a fixed
time window. A sliding version of HyperLogLog can work under a sliding time
window by replacing every counter of HyperLogLog with a list of future possible
maxima (LFPM). But LFPM is a dynamic structure whose size varies at run time.
This paper proposes a novel counter for HyperLogLog that consumes less memory
than LFPM. Our counter is called the bit distance recorder (BDR) because it
maintains the distance of every leftmost "1" bit position. The size of BDR is
fixed. Based on BDR, we design a multi-host cardinality estimation algorithm
under a sliding time window, the virtual bit distance recorder (VBDR). VBDR
allocates a virtual vector of BDRs for every host, and every physical BDR is
shared by several hosts to improve memory usage. After a small modification,
we propose two parallel versions of VBDR that can run on a GPU to handle
high-speed traffic. One of these parallel versions is fast at IP pair scanning
and the other is memory efficient. BDR is also suitable for other
cardinality estimation algorithms such as PCSA and LogLog.
Comment: 2 figures. arXiv admin note: text overlap with arXiv:1807.0152
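For orientation, the fixed-window HyperLogLog counter that the abstract above
starts from can be sketched as follows. This is a minimal Python sketch of
plain HyperLogLog only, not of BDR, VBDR, or LFPM; the class name, the 64-bit
BLAKE2b hash, and the register-count parameter b are assumptions chosen for
illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Plain fixed-window HyperLogLog (illustrative, not the paper's BDR)."""

    def __init__(self, b=10):
        self.b = b                      # number of index bits (assumed value)
        self.m = 1 << b                 # number of registers
        self.registers = [0] * self.m

    @staticmethod
    def _hash64(item):
        # 64-bit hash of the item; the hash choice is an assumption.
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def add(self, item):
        h = self._hash64(item)
        width = 64 - self.b
        idx = h >> width                        # top b bits pick a register
        rest = h & ((1 << width) - 1)           # remaining bits
        # 1-based position of the leftmost "1" bit in the remaining bits;
        # width + 1 when all remaining bits are zero.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)        # standard constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw
```

Each register only ever grows, which is why this plain form cannot forget old
items and needs a different counter under a sliding time window.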
Economical and efficient network super points detection based on GPU
A network super point is a special kind of host that plays an important role
in network management and security. For a core network, detecting super points
in real time is a burdensome task because it requires plenty of computing
resources to keep up with the high packet rate. Previous works try to solve
this problem by using expensive memory, such as static random access memory,
and multiple CPU cores. But the number of cores in a CPU is small and each core
is expensive. In this work, we use a popular parallel computing platform, the
graphics processing unit (GPU), to mine a core network's super points. We
propose a group of double direction hash functions that can map hosts randomly
and restore them from a dense structure. Because of the high randomness and
simple processing of the double direction hash functions, our algorithm reduces
memory use to less than one-fourth of that of other algorithms. Because of this
small memory requirement, a low-cost GPU, worth only 200 dollars, is fast
enough to deal with a high-speed network such as 750 Gb/s. No other algorithm
can cope with such high-bandwidth traffic as accurately as our algorithm on
such a cheap platform. Experiments on traffic collected from a core
network demonstrate the advantage of our efficient algorithm.
Comment: 9 pages, 11 figure
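The idea of a hash that maps a host pseudo-randomly yet can be inverted to
restore the host can be illustrated with a small bijection on 32-bit values.
This is only a sketch of the general notion of a reversible hash, not the
double direction hash functions group of the paper; the function names
map_host and restore_host and the multiplier constant are hypothetical.

```python
import ipaddress

MASK32 = 0xFFFFFFFF
MULT = 0x9E3779B1                      # any odd 32-bit constant is invertible
MULT_INV = pow(MULT, -1, 1 << 32)      # modular inverse (Python 3.8+)


def map_host(ip):
    """Scramble a 32-bit IPv4 address with an invertible mix."""
    x = (ip * MULT) & MASK32           # multiply by an odd number mod 2^32
    x ^= x >> 16                       # fold the high half into the low half
    return x


def restore_host(h):
    """Undo map_host and recover the original 32-bit address."""
    x = h ^ (h >> 16)                  # a 16-bit xor-shift is its own inverse
    return (x * MULT_INV) & MASK32


def ip_to_int(text):
    return int(ipaddress.IPv4Address(text))


def int_to_ip(value):
    return str(ipaddress.IPv4Address(value))
```

Because both steps are bijections on 32-bit integers,
int_to_ip(restore_host(map_host(ip_to_int("192.0.2.1")))) returns the original
address without any stored IP list.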
SRLA: A real time sliding time window super point cardinality estimation algorithm for high speed network based on GPU
A super point is a special host in a network that communicates with many
other hosts in a certain time period. The number of hosts contacting a super
point is called its cardinality. Cardinality estimation plays an important
role in network management and security. All existing works focus on how to
estimate a super point's cardinality under a discrete time window. But a
discrete time window causes great delay, and the accuracy of the estimate
depends on where the window starts. A sliding time window, which moves forward
a small slice at a time, offers a more accurate and timely scale for
monitoring a super point's cardinality. On the other hand, estimating a super
point's cardinality under a sliding time window is more difficult because it
requires an algorithm to record the cardinality incrementally and report it
immediately at the end of the sliding duration. This paper is the first to
solve this problem, by devising SRLA, an algorithm that works under a sliding
time window. SRLA records host cardinality in a novel structure that can be
updated incrementally. In order to reduce the cardinality estimation time at
the end of every sliding time window, SRLA generates a super point candidate
list while scanning packets and calculates the cardinality only of hosts in
the candidate list. It can also run in parallel to handle a high-speed network
at line speed. This paper shows how to deploy SRLA on a common GPU.
Experiments on real-world traffic with 40 GB/s bandwidth show that SRLA
successfully estimates super point cardinality within 100 milliseconds under a
sliding time window when running on a low-cost Nvidia GPU, a GTX650 with 1 GB
of memory. The estimation time of SRLA is much smaller than that of other
algorithms, which consume more
than 2000 milliseconds under a discrete time window.
Comment: 11 pages, 11 figure
Most memory efficient distributed super points detection on core networks
The super point, a host that communicates with many others, is a special kind
of host that has received great attention. Mining super points at the edge of
a network is the foundation of many network research fields. In this paper, we
propose the most memory-efficient super point detection scheme. This scheme
contains a super point reconstruction algorithm called the short estimator and
a super point filter algorithm called the long estimator. The short estimator
produces a super point candidate list using thousands of bytes of memory, and
the long estimator improves the accuracy of the detection result using
millions of bytes of memory. Combining the short estimator and the long
estimator, our scheme achieves the highest accuracy using less memory than
other algorithms. There are no data conflicts and no floating-point operations
in our scheme. This makes our scheme suitable for parallel execution, and we
deploy it on a common GPU to accelerate processing. We also describe how to
extend our algorithm to a sliding time window. Experiments on several
real-world core network traffic traces show that our algorithm achieves the
highest accuracy while consuming less than one-fifth of the memory of other
algorithms.
Regain Sliding super point from distributed edge routers by GPU
A sliding super point is a special host, defined under a sliding time window,
that a huge number of other hosts contact. It plays an important role in
network security and management. But detecting such hosts in real time in
today's high-speed networks, which contain several distributed routers, is a
hard task. Distributed sliding super point detection requires an algorithm
that can estimate the number of contacting hosts incrementally, scan packets
faster than they arrive, and reconstruct sliding super points at the end of a
time period. But no existing algorithm satisfies these three requirements
simultaneously. To solve this problem, this paper is the first to propose a
distributed sliding super point detection algorithm running on a GPU. The
advantage of this algorithm comes from a novel sliding estimator, which can
estimate the number of contacting hosts incrementally under a sliding window,
and a set of reversible hash functions, by which sliding super points can be
recovered without storing additional data such as an IP list. There are two
main procedures in this algorithm: packet scanning and sliding super point
reconstruction. Both can run in parallel without any data reading conflict.
When deployed on a low-cost GPU, this algorithm can handle traffic with
bandwidth as high as 680 Gb/s. Real-world core network traffic is used to
evaluate the performance of this sliding super point detection algorithm on a
cheap GPU, an Nvidia GTX950 with 4 GB of graphics memory. Experiments
comparing it with other algorithms under a discrete time window show that this
algorithm has the highest accuracy. Under a sliding time window, where no
other algorithm can work, this algorithm has the same performance as under a
discrete time window.
Comment: 11 pages, 10 figure
GPU based Real-time Super Hosts Detection at Distributed Edge Routers
A super host is a special host on the network that contacts many other hosts
during a certain time window. Super hosts play an important role in network
research areas such as scanner detection, resource allocation, and spam
filtering. Finding super hosts in real time is the foundation of these
applications. In this paper, a novel algorithm, denoted CBAA, is proposed to
solve this problem at edge routers. CBAA divides network traffic into
different parts. A cube of bits array is devised to store hosts' linking
information for the different traffic parts while scanning packets. At the end
of each time window, CBAA restores super hosts very quickly because there is
only a fraction of the super hosts in each traffic part. CBAA is also a
parallel algorithm. It is easy to deploy CBAA on a GPU to deal with high-speed
network traffic in real time. Experiments on a real-world core network
demonstrate the advantage of our algorithm.
Memory efficient distributed sliding super point cardinality estimation by GPU
A super point is a special kind of host in the network that contacts a huge
number of other hosts. Estimating its cardinality, the number of other hosts
contacting it, plays an important role in network management. But all existing
works focus on super point cardinality estimation under a discrete time
window, which has great latency and ignores many measuring periods. A sliding
time window measures super point cardinality at a finer granularity than a
discrete time window but is also more complex. This paper is the first to
introduce an algorithm that estimates super point cardinality under a sliding
time window from distributed edge routers. The algorithm's ability to estimate
sliding super point cardinality comes from a novel method proposed in this
paper that records the time at which a host appears. Based on this method, two
sliding cardinality estimators, the sliding rough estimator and the sliding
linear estimator, are devised for super point detection and cardinality
estimation respectively. When these two estimators are used together, the
algorithm consumes the least memory with the highest accuracy. This sliding
super point cardinality algorithm can be deployed in a distributed environment
and acquires the global super points' cardinalities by merging the estimators
of distributed nodes. Both estimators can process packets in parallel, which
makes it possible to deal with high-speed networks in real time on a GPU.
Experiments on real-world traffic show that this algorithm has the highest
accuracy and the smallest memory footprint compared with others when running
under a discrete time window. Under a sliding time window, this algorithm has
the same performance as under a
discrete time window.
Comment: arXiv admin note: substantial text overlap with arXiv:1803.1103
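The notion of recording the time at which a host appears, combined with a
linear (bit-array) estimator, can be sketched as below. This is a simplified
illustration under stated assumptions, not the sliding rough or sliding linear
estimator of the paper: each cell stores a full slice number rather than a
compact counter, and the class name, cell count m, and window length k are
hypothetical.

```python
import hashlib
import math


class SlidingLinearCounter:
    """Linear counting over cells that remember the last slice they were hit."""

    def __init__(self, m=4096, k=10):
        self.m = m                      # number of cells (assumed value)
        self.k = k                      # window length in time slices
        self.last_seen = [None] * m     # slice number of the latest hit

    def _cell(self, item):
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.m

    def add(self, item, slice_no):
        # Slice numbers are assumed to be non-decreasing over time.
        self.last_seen[self._cell(item)] = slice_no

    def estimate(self, now):
        oldest = now - self.k + 1       # first slice still inside the window
        active = sum(
            1 for t in self.last_seen if t is not None and t >= oldest
        )
        empty = self.m - active
        if empty == 0:
            return math.inf             # array saturated, estimate undefined
        # Linear counting: n is approximately -m * ln(fraction of empty cells)
        return -self.m * math.log(empty / self.m)
```

A cell expires simply because its stored slice number falls out of the window,
so no per-slice cleanup pass is needed in this form; the price is the memory
spent on a full slice number per cell.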
Distributed super point cardinality estimation under sliding time window for high speed network
A super point is a special kind of host whose cardinality, the number of
contacting hosts in a certain period, is bigger than a threshold. Super point
cardinality estimation plays an important role in the network field. This
paper proposes a super point cardinality estimation algorithm under a sliding
time window. To maintain the state of previous hosts with few updating
operations, a novel counter, the asynchronous time stamp (AT), is proposed.
For a sliding time window containing k time slices, AT only needs to be
updated every k time slices at the cost of 1 more bit than a previous
state-of-the-art counter which requires bits but updates every time slice.
Fewer updating operations mean that more ATs can be contained to achieve
higher accuracy in real time. This paper also devises a novel reversible hash
function scheme to restore super points from a pool of ATs. Experiments on
several real-world network traffic traces illustrate that the algorithm
proposed in this paper can detect super points and estimate their
cardinalities under a sliding time window
in real time.
Comment: 13 page
VATE: a trade-off between memory and preserving time for high accuracy cardinalities estimation under sliding time window
Host cardinality is one of the important attributes in the field of network
research. Cardinality estimation under a sliding time window has become a
research hotspot in recent years because of its high accuracy and small delay.
Algorithms of this kind preserve the time information of the sliding time
window by introducing more powerful counters. The more counters these
algorithms use, the higher their estimation accuracy. However, the number of
sliding counters available is limited by their large memory footprint or long
state-maintenance time. To solve this problem, this paper designs a new
sliding counter, the asynchronous timestamp (AT), which has the advantages of
lower memory consumption and low state-maintenance time. AT can replace the
counters in existing algorithms. On the same device, more ATs can be used to
achieve higher accuracy. Based on AT, this paper designs a new multi-host
cardinality estimation algorithm, VATE. VATE is also a parallel algorithm that
can be deployed on a GPU. With the parallel processing capability of the GPU,
VATE can estimate the cardinalities of hosts in a 40 Gb/s high-speed network
in real time at a time granularity of 1 second.
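To give a sense of what a 40 Gb/s link means per packet, the arithmetic below
computes a worst-case packet rate and the resulting time budget per packet. The
frame size is an assumption, not a figure from the abstract: minimum-size
Ethernet frames of 64 bytes plus 20 bytes of preamble and inter-frame gap on
the wire. The function names are hypothetical.

```python
def packets_per_second(bandwidth_bps, wire_bytes_per_packet):
    """Packet rate for a fully loaded link at a given on-wire packet size."""
    return bandwidth_bps / (8 * wire_bytes_per_packet)


def budget_ns_per_packet(bandwidth_bps, wire_bytes_per_packet):
    """Average time available to process one packet, in nanoseconds."""
    return 1e9 / packets_per_second(bandwidth_bps, wire_bytes_per_packet)


# Assumed worst case: 64-byte frame + 20 bytes preamble and inter-frame gap.
WIRE_BYTES = 64 + 20
LINK_BPS = 40e9

pps = packets_per_second(LINK_BPS, WIRE_BYTES)       # about 59.52 million
budget = budget_ns_per_packet(LINK_BPS, WIRE_BYTES)  # about 16.8 ns
```

Under this assumption a single sequential thread would have roughly 16.8 ns per
packet, which suggests why the abstracts above lean on parallel packet scanning;
real traffic with larger average packets leaves a proportionally larger budget.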