Modeling performance of Hadoop applications: A journey from queueing networks to stochastic well formed nets
Nowadays, many enterprises commit to the extraction of actionable knowledge from huge datasets as part of their core business activities. Applications belong to very different domains, such as fraud detection or one-to-one marketing, and encompass business analytics and decision support in both the private and public sectors. In these scenarios, a central place is held by the MapReduce framework and in particular its open source implementation, Apache Hadoop. In such environments, new challenges arise in the area of job performance prediction, driven by the need to provide Service Level Agreement guarantees to the end user and to avoid wasting computational resources. In this paper we provide performance analysis models to estimate MapReduce job execution times in Hadoop clusters governed by the YARN Capacity Scheduler. We propose models of increasing complexity and accuracy, ranging from queueing networks to stochastic well formed nets, able to estimate job performance under a number of scenarios of interest, including unreliable resources. The accuracy of our models is evaluated using the TPC-DS industry benchmark in experiments run on Amazon EC2 and at the CINECA Italian supercomputing center. The results show that the average accuracy we can achieve is in the range 9–14%.
Theory and practice of firewall outsourcing
A firewall system is a packet filter that is placed at the entry point of an enterprise network in the Internet. Packets that attempt to enter the enterprise network through this entry point are examined, one by one, against the rules of some underlying firewall F of the firewall system. Each rule in F has a decision which is either “accept” or “reject”. For any incoming packet p, the firewall system identifies the first rule (in the sequence of rules in F) that matches p. If the decision of this rule is “accept”, then the firewall system forwards p to the enterprise network. Otherwise the decision of this rule is “reject” and packet p is discarded and prevented from entering the network. Each firewall system consists of two units: a rule matching unit and a decision unit. Both units are usually executed in the firewall system. To simplify the task of managing the firewall system, we identify a special class of firewall systems, called the outsourced system, where the rule matching unit is executed in a public cloud. Unfortunately, public clouds are usually unreliable and execution of the rule matching unit in a public cloud can be vulnerable to two types of attacks: verifiability attacks and privacy attacks. The main objective of this dissertation is to discuss how to execute the rule matching unit of an outsourced system in a public cloud such that verifiability and privacy attacks are prevented from occurring. The main contribution of this dissertation is three-fold. First, we discuss how to design an outsourced firewall system such that execution of the designed system in public clouds prevents the occurrence of verifiability and privacy attacks. The resulting system, called the private system, makes use of two public clouds. We show that this private system prevents verifiability and privacy attacks under the assumption that the two public clouds used in this system are both “sensible” and “non-colluding”.
Second, we identify a special class of firewalls, called the partially specified firewall, where a firewall is called partially specified when the decisions of some of the rules in the firewall are not specified as “accept” or “reject”. We show that for every partially specified firewall PF, there is a (fully specified) firewall F such that PF and F are equivalent. We discuss how to design an outsourced system whose underlying firewall is a partially specified firewall PF such that the designed system prevents both verifiability and privacy attacks. We achieve this outsourced system by obtaining an equivalent firewall F from PF and designing a private system for F. Third, we present a generalization of firewalls called firewall expressions. A firewall expression is specified using one or more component firewalls and three firewall operators: “not”, “and”, and “or”. For example, the firewall expression (G and H) consists of two component firewalls G and H and one firewall operator “and”. This firewall expression accepts a packet p iff both firewalls G and H accept p. For any underlying firewall expression FE, we define an Expression System as a generalization of firewall systems that takes as input any packet p and determines whether the underlying firewall expression FE accepts or rejects packet p. We design an outsourced expression system for any underlying firewall expression FE. We achieve this outsourced expression system by using a private system for each component firewall of FE and combining these private systems through an overall decision unit to determine whether any packet is accepted or rejected according to the firewall expression FE.
Computer Science
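The first-match semantics and the firewall operators described in this abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation and does not model outsourcing, verifiability, or privacy. The names `Rule`, `accepts`, and `evaluate`, the dictionary packet format, the tuple encoding of expressions, and the default of rejecting a packet that matches no rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    # Hypothetical rule: a predicate over a packet plus a decision.
    predicate: Callable[[dict], bool]
    decision: str  # "accept" or "reject"

def accepts(firewall, packet):
    """Return True iff the first rule matching the packet says "accept"."""
    for rule in firewall:
        if rule.predicate(packet):
            return rule.decision == "accept"
    # Assumption: a packet that matches no rule is rejected.
    return False

def evaluate(expr, packet):
    """Evaluate a firewall expression on a packet.

    Assumed encoding: a component firewall is a list of Rule objects;
    an operator node is ("not", e), ("and", e1, e2), or ("or", e1, e2).
    """
    if isinstance(expr, list):
        return accepts(expr, packet)
    op = expr[0]
    if op == "not":
        return not evaluate(expr[1], packet)
    if op == "and":
        return evaluate(expr[1], packet) and evaluate(expr[2], packet)
    if op == "or":
        return evaluate(expr[1], packet) or evaluate(expr[2], packet)
    raise ValueError(f"unknown operator: {op!r}")

# Two illustrative component firewalls (made-up rules).
G = [Rule(lambda p: p["port"] in (80, 443), "accept"),
     Rule(lambda p: True, "reject")]
H = [Rule(lambda p: p["src"] == "10.0.0.9", "reject"),
     Rule(lambda p: True, "accept")]

web_packet = {"src": "192.0.2.1", "port": 443}
blocked_packet = {"src": "10.0.0.9", "port": 443}
ssh_packet = {"src": "192.0.2.1", "port": 22}
```

With these definitions, `evaluate(("and", G, H), web_packet)` accepts the packet only because both G and H accept it, matching the (G and H) example in the abstract.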