2,042 research outputs found

    A Framework for Genetic Algorithms Based on Hadoop

    Full text link
    Genetic Algorithms (GAs) are powerful metaheuristic techniques mostly used in many real-world applications. The sequential execution of GAs requires considerable computational power both in time and resources. Nevertheless, GAs are naturally parallel and accessing a parallel platform such as Cloud is easy and cheap. Apache Hadoop is one of the common services that can be used for parallel applications. However, using Hadoop to develop a parallel version of GAs is not simple without facing its inner workings. Even though some sequential frameworks for GAs already exist, there is no framework supporting the development of GA applications that can be executed in parallel. In this paper is described a framework for parallel GAs on the Hadoop platform, following the paradigm of MapReduce. The main purpose of this framework is to allow the user to focus on the aspects of GA that are specific to the problem to be addressed, being sure that this task is going to be correctly executed on the Cloud with a good performance. The framework has been also exploited to develop an application for Feature Subset Selection problem. A preliminary analysis of the performance of the developed GA application has been performed using three datasets and shown very promising performance

    Design Architecture-Based on Web Server and Application Cluster in Cloud Environment

    Full text link
    Cloud has been a computational and storage solution for many data centric organizations. The problem today those organizations are facing from the cloud is in data searching in an efficient manner. A framework is required to distribute the work of searching and fetching from thousands of computers. The data in HDFS is scattered and needs lots of time to retrieve. The major idea is to design a web server in the map phase using the jetty web server which will give a fast and efficient way of searching data in MapReduce paradigm. For real time processing on Hadoop, a searchable mechanism is implemented in HDFS by creating a multilevel index in web server with multi-level index keys. The web server uses to handle traffic throughput. By web clustering technology we can improve the application performance. To keep the work down, the load balancer should automatically be able to distribute load to the newly added nodes in the server

    On the Efficacy of Live DDoS Detection with Hadoop

    Full text link
    Distributed Denial of Service flooding attacks are one of the biggest challenges to the availability of online services today. These DDoS attacks overwhelm the victim with huge volume of traffic and render it incapable of performing normal communication or crashes it completely. If there are delays in detecting the flooding attacks, nothing much can be done except to manually disconnect the victim and fix the problem. With the rapid increase of DDoS volume and frequency, the current DDoS detection technologies are challenged to deal with huge attack volume in reasonable and affordable response time. In this paper, we propose HADEC, a Hadoop based Live DDoS Detection framework to tackle efficient analysis of flooding attacks by harnessing MapReduce and HDFS. We implemented a counter-based DDoS detection algorithm for four major flooding attacks (TCP-SYN, HTTP GET, UDP and ICMP) in MapReduce, consisting of map and reduce functions. We deployed a testbed to evaluate the performance of HADEC framework for live DDoS detection. Based on the experiments we showed that HADEC is capable of processing and detecting DDoS attacks in affordable time

    Parallel Processing of Large Graphs

    Full text link
    More and more large data collections are gathered worldwide in various IT systems. Many of them possess the networked nature and need to be processed and analysed as graph structures. Due to their size they require very often usage of parallel paradigm for efficient computation. Three parallel techniques have been compared in the paper: MapReduce, its map-side join extension and Bulk Synchronous Parallel (BSP). They are implemented for two different graph problems: calculation of single source shortest paths (SSSP) and collective classification of graph nodes by means of relational influence propagation (RIP). The methods and algorithms are applied to several network datasets differing in size and structural profile, originating from three domains: telecommunication, multimedia and microblog. The results revealed that iterative graph processing with the BSP implementation always and significantly, even up to 10 times outperforms MapReduce, especially for algorithms with many iterations and sparse communication. Also MapReduce extension based on map-side join usually noticeably presents better efficiency, although not as much as BSP. Nevertheless, MapReduce still remains the good alternative for enormous networks, whose data structures do not fit in local memories.Comment: Preprint submitted to Future Generation Computer System
    • …
    corecore