Controlling Network Latency in Mixed Hadoop Clusters: Do We Need Active Queue Management?
With the advent of big data, data center applications are processing vast amounts of unstructured and semi-structured data, in parallel on large clusters, across hundreds to thousands of nodes. The highest performance for these batch big data workloads is achieved using expensive network equipment with large buffers, which accommodate bursts in network traffic and allocate bandwidth fairly even when the network is congested. Throughput-sensitive big data applications are, however, often executed in the same data center as latency-sensitive workloads. For both workloads to be supported well, the network must provide both maximum throughput and low latency. Progress has been made in this direction, as modern network switches support Active Queue Management (AQM) and Explicit Congestion Notification (ECN), both mechanisms to control the level of queue occupancy, reducing the total network latency. This paper is the first study of the effect of Active Queue Management on both throughput and latency, in the context of Hadoop and the MapReduce programming model. We give a quantitative comparison of four different approaches for controlling buffer occupancy and latency: RED and CoDel, both standalone and also combined with ECN and the DCTCP network protocol, and identify the AQM configurations that maintain Hadoop execution time gains from larger buffers within 5%, while reducing network packet latency caused by bufferbloat by up to 85%. Finally, we provide recommendations to administrators of Hadoop clusters as to how to improve latency without degrading the throughput of batch big data workloads.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government. Peer Reviewed. Postprint (author's final draft)
Interconnect Energy Savings and Lower Latency Networks in Hadoop Clusters: The Missing Link
An important challenge of modern data centres running Hadoop workloads is to minimise energy consumption, a significant proportion of which is due to the network. Significant network savings are already possible using Energy Efficient Ethernet, supported by a large number of NICs and switches, but recent work has demonstrated that the packet coalescing settings must be carefully configured to avoid a substantial loss in performance. Meanwhile, Hadoop is evolving from its original batch concept to become a more iterative type of framework. Other recent work attempts to reduce Hadoop's network latency using Explicit Congestion Notification. Linking these studies reveals that, surprisingly, even when packet coalescing does not hurt performance, it can degrade network latency much more than previously thought. This paper is the first to analyze the impact of packet coalescing in the context of network latency. We investigate how to design and configure interconnects to provide the maximum energy savings without degrading cluster throughput performance or network latency.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government. Peer Reviewed. Postprint (author's final draft)
Diagnostic Experimental Philosophy
Experimental philosophy’s much-discussed ‘restrictionist’ program seeks to delineate the extent to which philosophers may legitimately rely on intuitions about possible cases. The present paper shows that this program can be (i) put to the service of diagnostic problem-resolution (in the wake of J.L. Austin) and (ii) pursued by constructing and experimentally testing psycholinguistic explanations of intuitions which expose their lack of evidentiary value: The paper develops a psycholinguistic explanation of paradoxical intuitions that are prompted by verbal case-descriptions, and presents two experiments that support the explanation. This debunking explanation helps resolve philosophical paradoxes about perception (known as ‘arguments from hallucination’)
Optimal Contracting With Endogenous Social Norms
Research in sociology and ethics suggests that individuals adhere to social norms of behavior established by their peers. Within an agency framework, we model endogenous social norms by assuming that each agent’s cost of implementing an action depends on the social norm for that action, defined to be the average level of that action chosen by the agent’s peer group. We show how endogenous social norms alter the effectiveness of monetary incentives, determine whether it is optimal to group agents in a single or two separate organizations, and may give rise to a costly adverse selection problem when agents' sensitivities to social norms are unobservable
Financial Reporting and Conflicting Managerial Incentives: The Case of Management Buyouts
We analyze the effect of external financing concerns on managers' financial reporting behavior prior to management buyouts (MBOs). Prior studies hypothesize that managers intending to undertake an MBO have an incentive to manage earnings downward to reduce the purchase price. We hypothesize that managers also face a conflicting reporting incentive associated with their efforts to obtain external financing for the MBO and to lower their financing cost. Consistent with our hypothesis, we find that managers who rely the most on external funds to finance their MBOs tend to report less negative abnormal accruals prior to the MBOs. In addition, the relation between external financing and abnormal accruals is tempered when there are more fixed assets that can serve as collateral for debt financing
High Throughput and Low Latency on Hadoop Clusters Using Explicit Congestion Notification: The Untold Truth
Various extensions of TCP/IP have been proposed to reduce network latency; examples include Explicit Congestion Notification (ECN), Data Center TCP (DCTCP) and several proposals for Active Queue Management (AQM). Combining these techniques requires adjusting various parameters, and recent studies have found that it is difficult to do so while obtaining both high performance and low latency. This is especially true for mixed-use data centres that host both latency-sensitive applications and high-throughput workloads such as Hadoop. This paper studies the difficulty in configuration, and characterises the problem as related to ACK packets. Such packets cannot be set as ECN Capable Transport (ECT), with the consequence that a disproportionate number of them are dropped. We explain how this behavior decreases throughput, and propose a small change to the way that non-ECT-capable packets are handled in the network switches. We demonstrate robust performance for modified AQMs on a Hadoop cluster, maintaining full throughput while reducing latency by 85%. We also demonstrate that commodity switches with shallow buffers are able to reach the same throughput as deeper buffer switches. Finally, we explain how both TCP-ECN and DCTCP can achieve the best performance using a simple marking scheme, in contrast to the current preference for relying on AQMs to mark packets.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government. Peer Reviewed. Postprint (author's final draft)
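The ACK-drop problem described in the abstract above can be illustrated with a minimal Python sketch. It models a simplified RED-style decision in which ECT packets are marked under congestion, while non-ECT packets (such as pure ACKs) are either early-dropped or, under a hypothetical "protect" policy, still enqueued until the buffer is actually full. The function name, the thresholds, and the protect policy are assumptions made for illustration only; the abstract does not spell out the exact switch modification, and real RED uses an averaged rather than instantaneous queue length.

```python
import random


def aqm_decide(queue_len, capacity, min_th, max_th, max_p, is_ect,
               protect_non_ect=False, rng=random.random):
    """Return 'enqueue', 'mark', or 'drop' for an arriving packet.

    Simplified RED-style logic on the instantaneous queue length.
    protect_non_ect is a hypothetical policy flag, not a real switch option.
    """
    if queue_len >= capacity:
        return "drop"            # buffer full: tail drop, ECT or not
    if queue_len < min_th:
        return "enqueue"         # no congestion signal needed
    if queue_len >= max_th:
        congested = True         # always signal above the upper threshold
    else:
        # Signalling probability grows linearly between the thresholds.
        p = max_p * (queue_len - min_th) / (max_th - min_th)
        congested = rng() < p
    if not congested:
        return "enqueue"
    if is_ect:
        return "mark"            # ECN-capable packet: mark instead of drop
    # Non-ECT packet (e.g. a pure ACK): early drop unless protected.
    return "enqueue" if protect_non_ect else "drop"


if __name__ == "__main__":
    # Queue above max_th: data packets are marked, ACKs are dropped.
    print(aqm_decide(80, 100, 20, 60, 0.1, is_ect=True))
    print(aqm_decide(80, 100, 20, 60, 0.1, is_ect=False))
    # Same queue state with the assumed protect policy enabled.
    print(aqm_decide(80, 100, 20, 60, 0.1, is_ect=False, protect_non_ect=True))
```

In this sketch, a persistently long queue never drops ECT data packets early but drops every non-ECT packet unless the protect flag is set, which is the kind of asymmetry the abstract attributes to ACK handling.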
Exploring interconnect energy savings under East-West traffic pattern of MapReduce clusters
An important challenge of modern data centers is to reduce energy consumption, of which a substantial proportion is due to the network. Energy Efficient Ethernet (EEE) is a recent standard that aims to reduce network power consumption, but current practice is to disable it in production use, since it has a poorly understood impact on real world application performance. An important application framework commonly used in modern data centers is Apache Hadoop, which implements the MapReduce programming model. This paper is the first to analyse the impact of EEE on MapReduce workloads, in terms of performance overheads and energy savings. We find that optimum energy savings are possible if the links use packet coalescing. Packet coalescing must, however, be carefully configured in order to avoid excessive performance degradation.
The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contract TIN2012-34557, HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government. Postprint (author's final draft)
Energy Efficient Ethernet on MapReduce Clusters: Packet Coalescing To Improve 10GbE Links
An important challenge of modern data centers is to reduce energy consumption, of which a substantial proportion is due to the network. Switches and NICs supporting the recent Energy Efficient Ethernet (EEE) standard are now available, but current practice is to disable EEE in production use, since its effect on real world application performance is poorly understood. This paper contributes to this discussion by analyzing the impact of EEE on MapReduce workloads, in terms of performance overheads and energy savings. MapReduce is the central programming model of Apache Hadoop, one of the most widely used application frameworks in modern data centers. We find that, while 1 GbE links (edge links) achieve good energy savings using the standard EEE implementation, optimum energy savings in the 10 GbE links (aggregation and core links) are only possible if these links employ packet coalescing. Packet coalescing must, however, be carefully configured in order to avoid excessive performance degradation. With our new analysis of how the static parameters of packet coalescing perform under different cluster loads, we were able to cover both the idle and the heavy load periods that can exist in this type of environment. Finally, we evaluate our recommendation for packet coalescing for 10 GbE links using the energy-delay metric. This paper is an extension of our previous work [1], which was published in the Proceedings of the 40th Annual IEEE Conference on Local Computer Networks (LCN 2015).
This work was supported in part by the European Union’s Seventh Framework Programme (FP7/2007-2013) under Grant 610456 (EUROSERVER), in part by the Spanish Government through the Severo Ochoa programme (SEV-2011-00067 and SEV-2015-0493), in part by the Spanish Ministry of Economy and Competitiveness under Contract TIN2012-34557 and Contract TIN2015-65316-P, and in part by the Generalitat de Catalunya under Contract 2014-SGR-1051 and Contract 2014-SGR-1272. Peer Reviewed. Postprint (author's final draft)
Beliefs-Driven Price Association
In addition to being a function of traditional fundamentals such as cash-flow persistence and the discount rate, the equilibrium association between a security price and a value-relevant statistic can simply be a function of what rational investors believe the association will be. We refer to this phenomenon as beliefs-driven price association (BPA). By explicitly considering the phenomenon of BPA, we show that the price response to information releases can vary over time even if the risk-free interest rate and investor preferences are static and the earnings/cash flow generating process is stable. This observation suggests, for example, that price-to-earnings associations and price volatility can vary over time even if a stable pattern of economic fundamentals suggests otherwise. The possibility of BPA suggests that measures of the cost of capital, information content, and growth prospects inferred from observed market prices will be confounded. While we do not predict when periods of BPA will arise, we provide empirically testable predictions about how prices should behave during periods of BPA. In particular, we predict that, during sufficiently long periods of high (positive or negative) BPA, price volatility, price levels, and expected returns will be higher than would be implied by a fundamental valuation framework. Finally, while BPA in the pricing of one security does not cause BPA in the pricing of other securities, the price levels of those other securities will be affected if the securities with BPA are sufficiently large relative to the market as a whole