3,267 research outputs found

    Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server

    In the last decade, data analytics has rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single-node NUMA machine and identify the bottlenecks hampering the scalability of workloads. We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads. Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread-level load imbalance. Further, at the micro-architecture level, we observe memory-bound latency to be the major cause of work time inflation.
    Comment: Accepted to The 5th IEEE International Conference on Big Data and Cloud Computing (BDCloud 2015)

    TaskPoint: sampled simulation of task-based programs

    Sampled simulation is a mature technique for reducing simulation time of single-threaded programs, but it is not directly applicable to simulation of multi-threaded architectures. Recent multi-threaded sampling techniques assume that the workload assigned to each thread does not change across multiple executions of a program. This assumption does not hold for dynamically scheduled task-based programming models. Task-based programming models allow the programmer to specify program segments as tasks which are instantiated many times and scheduled dynamically to available threads. Due to system noise and variation in scheduling decisions, two consecutive executions on the same machine typically result in different instruction streams processed by each thread. In this paper, we propose TaskPoint, a sampled simulation technique for dynamically scheduled task-based programs. We leverage task instances as sampling units and simulate only a fraction of all task instances in detail. Between detailed simulation intervals we employ a novel fast-forward mechanism for dynamically scheduled programs. We evaluate the proposed technique on a set of 19 task-based parallel benchmarks and two different architectures. Compared to detailed simulation, TaskPoint accelerates architectural simulation with 64 simulated threads by an average factor of 19.1 at an average error of 1.8% and a maximum error of 15.0%.
    This work has been supported by the Spanish Government (Severo Ochoa grants SEV2015-0493, SEV-2011-00067), the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), the RoMoL ERC Advanced Grant (GA 321253), the European HiPEAC Network of Excellence and the Mont-Blanc project (EU-FP7-610402 and EU-H2020-671697). M. Moreto has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship JCI-2012-15047. M. Casas is supported by the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the EUFP7 (contract 2013BP B 00243). T. Grass has been partially supported by the AGAUR of the Generalitat de Catalunya (grant 2013FI B 0058).
    Peer Reviewed. Postprint (author's final draft)

    The Whole is Greater than the Sum of the Parts: Optimizing the Joint Science Return from LSST, Euclid and WFIRST

    The focus of this report is on the opportunities enabled by the combination of LSST, Euclid and WFIRST, the optical surveys that will be an essential part of the next decade's astronomy. The sum of these surveys has the potential to be significantly greater than the contributions of the individual parts. As is detailed in this report, the combination of these surveys should give us multi-wavelength high-resolution images of galaxies and broadband data covering much of the stellar energy spectrum. These stellar and galactic data have the potential of yielding new insights into topics ranging from the formation history of the Milky Way to the mass of the neutrino. However, enabling the astronomy community to fully exploit this multi-instrument data set is a challenging technical task: for much of the science, we will need to combine the photometry across multiple wavelengths with varying spectral and spatial resolution. We identify some of the key science enabled by the combined surveys and the key technical challenges in achieving the synergies.
    Comment: Whitepaper developed at June 2014 U. Penn Workshop; 28 pages, 3 figures

    An Efficient OpenMP Loop Scheduler for Irregular Applications on Large-Scale NUMA Machines

    Nowadays shared memory HPC platforms expose a large number of cores organized in a hierarchical way. Parallel application programmers struggle to express more and more fine-grain parallelism and to ensure locality on such NUMA platforms. Independent loops stand as a natural source of parallelism. Parallel environments like OpenMP provide ways of parallelizing them efficiently, but the achieved performance is closely related to the choice of parameters like the granularity of work or the loop scheduler. Considering that both can depend on the target computer, the input data and the loop workload, the application programmer most of the time fails at designing both portable and efficient implementations. We propose in this paper a new OpenMP loop scheduler, called adaptive, that dynamically adapts the granularity of work considering the underlying system state. Our scheduler is able to perform dynamic load balancing while taking memory affinity into account on NUMA architectures. Results show that adaptive outperforms state-of-the-art OpenMP loop schedulers on memory-bound irregular applications, while obtaining performance comparable to static on parallel loops with a regular workload.

    Finance, growth, and public policy

    Development economists have long argued that modern financial markets are important to growth and that financial repression is a serious obstacle to progress in many developing countries. The authors consider the relationship between finance and growth and the appropriate role of government policy. Many economists have stressed how problems of asymmetric information and contract enforcement impede the functioning of financial markets in developing countries. In addition, they try to elaborate on these theories to make them relevant to policymakers. Information gaps and enforcement frictions introduce a premium in the cost of external funds. Factors such as the borrower's financial health, the efficiency of financial intermediation, and the ease of enforcing private financial contracts govern the size of this premium. How financial factors contribute to development may be understood along these lines. Financial contracts and institutions should be designed to minimize this premium.
    Keywords: Banks & Banking Reform, Financial Intermediation, Environmental Economics & Policies, Economic Theory & Research, Health Economics & Finance

    Special Issue: Selected Papers from Super Computing 2012


    Overall requirements for an advanced underground coal extraction system

    Underground mining systems suitable for coal seams exploitable in the year 2000 are examined with particular relevance to the resources of Central Appalachia. Requirements for such systems may be summarized as follows: (1) production cost; (2) miner safety; (3) miner health; (4) environmental impact; and (5) coal conservation. No significant trade-offs between production cost and other performance indices were found.