Search CORE

6,228 research outputs found

Load Balancing a Cluster of Web Servers using Distributed Packet Rewriting

Author: Aversa Luis
Bestavros Azer
Publication venue: Boston University Computer Science Department
Publication date: 06/01/1999
Field of study

In this paper, we propose and evaluate an implementation of a prototype scalable web server. The prototype consists of a load-balanced cluster of hosts that collectively accept and service TCP connections. The host IP addresses are advertised using the Round Robin DNS technique, allowing any host to receive requests from any client. Once a client attempts to establish a TCP connection with one of the hosts, a decision is made as to whether or not the connection should be redirected to a different host---namely, the host with the lowest number of established connections. We use the low-overhead Distributed Packet Rewriting (DPR) technique to redirect TCP connections. In our prototype, each host keeps information about connections in hash tables and linked lists. Every time a packet arrives, it is examined to see if it has to be redirected or not. Load information is maintained using periodic broadcasts amongst the cluster hosts.National Science Foundation (CCR-9706685); Microsof

CiteSeerX

Boston University Institutional Repository (OpenBU)

Transparent and scalable client-side server selection using netlets

Author: Collier Martin
Dharmalingam Kalaiarul
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2003
Field of study

Replication of web content in the Internet has been found to improve service response time, performance and reliability offered by web services. When working with such distributed server systems, the location of servers with respect to client nodes is found to affect service response time perceived by clients in addition to server load conditions. This is due to the characteristics of the network path segments through which client requests get routed. Hence, a number of researchers have advocated making server selection decisions at the client-side of the network. In this paper, we present a transparent approach for client-side server selection in the Internet using Netlet services. Netlets are autonomous, nomadic mobile software components which persist and roam in the network independently, providing predefined network services. In this application, Netlet based services embedded with intelligence to support server selection are deployed by servers close to potential client communities to setup dynamic service decision points within the network. An anycast address is used to identify available distributed decision points in the network. Each service decision point transparently directs client requests to the best performing server based on its in-built intelligence supported by real-time measurements from probes sent by the Netlet to each server. It is shown that the resulting system provides a client-side server selection solution which is server-customisable, scalable and fault transparent

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Observations on Factors Affecting Performance of MapReduce based Apriori on Hadoop Cluster

Author: Garg Rakhi
Mishra P. K.
Singh Sudhakar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/01/2017
Field of study

Designing fast and scalable algorithm for mining frequent itemsets is always being a most eminent and promising problem of data mining. Apriori is one of the most broadly used and popular algorithm of frequent itemset mining. Designing efficient algorithms on MapReduce framework to process and analyze big datasets is contemporary research nowadays. In this paper, we have focused on the performance of MapReduce based Apriori on homogeneous as well as on heterogeneous Hadoop cluster. We have investigated a number of factors that significantly affects the execution time of MapReduce based Apriori running on homogeneous and heterogeneous Hadoop Cluster. Factors are specific to both algorithmic and non-algorithmic improvements. Considered factors specific to algorithmic improvements are filtered transactions and data structures. Experimental results show that how an appropriate data structure and filtered transactions technique drastically reduce the execution time. The non-algorithmic factors include speculative execution, nodes with poor performance, data locality & distribution of data blocks, and parallelism control with input split size. We have applied strategies against these factors and fine tuned the relevant parameters in our particular application. Experimental results show that if cluster specific parameters are taken care of then there is a significant reduction in execution time. Also we have discussed the issues regarding MapReduce implementation of Apriori which may significantly influence the performance.Comment: 8 pages, 8 figures, International Conference on Computing, Communication and Automation (ICCCA2016

arXiv.org e-Print Archive

Crossref