On the Efficacy of Live DDoS Detection with Hadoop
Distributed Denial of Service (DDoS) flooding attacks are one of the biggest
challenges to the availability of online services today. These attacks
overwhelm the victim with a huge volume of traffic and either render it
incapable of performing normal communication or crash it completely. If
detection of a flooding attack is delayed, little can be done except to manually
disconnect the victim and fix the problem. With the rapid increase in DDoS
volume and frequency, current DDoS detection technologies are challenged to
handle huge attack volumes within a reasonable and affordable response time.
In this paper, we propose HADEC, a Hadoop-based live DDoS detection framework
that tackles efficient analysis of flooding attacks by harnessing MapReduce and
HDFS. We implemented a counter-based DDoS detection algorithm for four major
flooding attacks (TCP-SYN, HTTP GET, UDP and ICMP) in MapReduce, consisting of
map and reduce functions. We deployed a testbed to evaluate the performance of
the HADEC framework for live DDoS detection. Our experiments show that HADEC
is capable of processing and detecting DDoS attacks in affordable time.
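The counter-based detection described above can be sketched as a map step that emits one count per packet and a reduce step that sums counts per source and compares the total with a threshold. This is only an illustrative, in-memory sketch, not the HADEC implementation: the record layout `(source_ip, attack_type)`, the names `map_packet`, `reduce_counts` and `detect`, and the threshold values are all assumptions, and a real deployment would run the two functions as Hadoop MapReduce jobs over traffic logs stored in HDFS.

```python
from collections import defaultdict

# Hypothetical per-attack-type packet-count thresholds (illustrative values only).
THRESHOLDS = {"TCP-SYN": 3, "HTTP-GET": 3, "UDP": 3, "ICMP": 3}


def map_packet(record):
    """Map step: emit ((source_ip, attack_type), 1) for one parsed packet record."""
    src_ip, attack_type = record
    if attack_type in THRESHOLDS:
        yield (src_ip, attack_type), 1


def reduce_counts(key, counts):
    """Reduce step: sum the counts for a key and flag it if above the threshold."""
    total = sum(counts)
    _, attack_type = key
    if total > THRESHOLDS[attack_type]:
        return key, total
    return None


def detect(records):
    """Simulate the shuffle phase in memory, then apply the reduce step per key."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_packet(record):
            grouped[key].append(value)
    flagged = {}
    for key, counts in grouped.items():
        result = reduce_counts(key, counts)
        if result is not None:
            flagged[result[0]] = result[1]
    return flagged
```

Calling `detect` on a list of `(source_ip, attack_type)` tuples returns a dictionary of the sources whose packet count for a given attack type exceeds the assumed threshold.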
Topic Similarity Networks: Visual Analytics for Large Document Sets
We investigate ways in which to improve the interpretability of LDA topic
models by better analyzing and visualizing their outputs. We focus on examining
what we refer to as topic similarity networks: graphs in which nodes represent
latent topics in text collections and links represent similarity among topics.
We describe efficient and effective approaches to both building and labeling
such networks. Visualizations of topic models based on these networks are shown
to be a powerful means of exploring, characterizing, and summarizing large
collections of unstructured text documents. They help to "tease out"
non-obvious connections among different sets of documents and provide insights
into how topics form larger themes. We demonstrate the efficacy and
practicality of these approaches through two case studies: 1) NSF grants for
basic research spanning a 14 year period and 2) the entire English portion of
Wikipedia.
Comment: 9 pages; 2014 IEEE International Conference on Big Data (IEEE BigData 2014)
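A topic similarity network of the kind described above can be sketched by treating each topic as a word-probability vector and linking two topics when their similarity passes a cutoff. The abstract does not say which similarity measure or cutoff is used, so the cosine similarity, the `threshold` parameter, and the function names below are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_topic_network(topic_word, threshold=0.5):
    """Return edges (i, j, similarity) between topics whose similarity >= threshold.

    topic_word: list of per-topic word-probability vectors, e.g. rows of an
    LDA topic-word matrix. Nodes are the topic indices.
    """
    edges = []
    n = len(topic_word)
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(topic_word[i], topic_word[j])
            if sim >= threshold:
                edges.append((i, j, sim))
    return edges
```

For example, with three topics over a three-word vocabulary where the first two topics share most of their probability mass, only those two topics are linked; the resulting edge list can then be passed to any graph layout or visualization tool.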
Garbage collection auto-tuning for Java MapReduce on Multi-Cores
MapReduce has been widely accepted as a simple programming pattern that can form the basis for efficient, large-scale, distributed data processing. The success of the MapReduce pattern has led to a variety of implementations for different computational scenarios. In this paper we present MRJ, a MapReduce Java framework for multi-core architectures. We evaluate its scalability on a four-core, hyperthreaded Intel Core i7 processor, using a set of standard MapReduce benchmarks. We investigate the significant impact that Java runtime garbage collection has on the performance and scalability of MRJ. We propose the use of memory management auto-tuning techniques based on machine learning. With our auto-tuning approach, we are able to achieve MRJ performance within 10% of optimal on 75% of our benchmark tests.