2 research outputs found

    Flow Fair Sampling Based on Multistage Bloom Filters

    Get PDF
    Network traffic distribution is heavy-tailed. Most of network flows are short and carry very few packets, and the number of large flows is small. Traditional random sampling tends to sample more large flows than short ones. However, many applications depend on per-flow traffic other than just large flows. A flow fair sampling based on multistage Bloom filters is proposed. The total measurement interval is divided into n child time intervals. In each child time interval, employ multistage Bloom filters to query the incoming packet’s flow whether exists in flow information table or not, if exists, sample the packet with static sampling rate which is inversely proportional to the estimation flow traffic up to the previous time interval. If it is a new flow’s first packet, create its flow information and insert it into the multistage Bloom filters. The results show that the proposed algorithm is accurate especially for short flows and easy to extend

    Efficient Identification of TOP-K Heavy Hitters over Sliding Windows

    Get PDF
    This is the author accepted manuscript. The final version is available from Springer Verlag via the DOI in this recordDue to the increasing volume of network traffic and growing complexity of network environment, rapid identification of heavy hitters is quite challenging. To deal with the massive data streams in real-time, accurate and scalable solution is required. The traditional method to keep an individual counter for each host in the whole data streams is very resource-consuming. This paper presents a new data structure called FCM and its associated algorithms. FCM combines the count-min sketch with the stream-summary structure simultaneously for efficient TOP-K heavy hitter identification in one pass. The key point of this algorithm is that it introduces a novel filter-and-jump mechanism. Given that the Internet traffic has the property of being heavy-tailed and hosts of low frequencies account for the majority of the IP addresses, FCM periodically filters the mice from input streams to efficiently improve the accuracy of TOP-K heavy hitter identification. On the other hand, considering that abnormal events are always time sensitive, our algorithm works by adjusting its measurement window to the newly arrived elements in the data streams automatically. Our experimental results demonstrate that the performance of FCM is superior to the previous related algorithm. Additionally this solution has a good prospect of application in advanced network environment.Chinese Academy of SciencesNational Natural Science Foundation of Chin
    corecore