11,550 research outputs found

    ALOJA: A benchmarking and predictive platform for big data performance analysis

    Get PDF
    The main goals of the ALOJA research project from BSC-MSR, are to explore and automate the characterization of cost-effectivenessof Big Data deployments. The development of the project over its first year, has resulted in a open source benchmarking platform, an online public repository of results with over 42,000 Hadoop job runs, and web-based analytic tools to gather insights about system's cost-performance1. This article describes the evolution of the project's focus and research lines from over a year of continuously benchmarking Hadoop under dif- ferent configuration and deployments options, presents results, and dis cusses the motivation both technical and market-based of such changes. During this time, ALOJA's target has evolved from a previous low-level profiling of Hadoop runtime, passing through extensive benchmarking and evaluation of a large body of results via aggregation, to currently leveraging Predictive Analytics (PA) techniques. Modeling benchmark executions allow us to estimate the results of new or untested configu- rations or hardware set-ups automatically, by learning techniques from past observations saving in benchmarking time and costs.This work is partially supported the BSC-Microsoft Research Centre, the Span- ish Ministry of Education (TIN2012-34557), the MINECO Severo Ochoa Research program (SEV-2011-0067) and the Generalitat de Catalunya (2014-SGR-1051).Peer ReviewedPostprint (author's final draft

    Scratchpad Sharing in GPUs

    Full text link
    GPGPU applications exploit on-chip scratchpad memory available in the Graphics Processing Units (GPUs) to improve performance. The amount of thread level parallelism present in the GPU is limited by the number of resident threads, which in turn depends on the availability of scratchpad memory in its streaming multiprocessor (SM). Since the scratchpad memory is allocated at thread block granularity, part of the memory may remain unutilized. In this paper, we propose architectural and compiler optimizations to improve the scratchpad utilization. Our approach, Scratchpad Sharing, addresses scratchpad under-utilization by launching additional thread blocks in each SM. These thread blocks use unutilized scratchpad and also share scratchpad with other resident blocks. To improve the performance of scratchpad sharing, we propose Owner Warp First (OWF) scheduling that schedules warps from the additional thread blocks effectively. The performance of this approach, however, is limited by the availability of the shared part of scratchpad. We propose compiler optimizations to improve the availability of shared scratchpad. We describe a scratchpad allocation scheme that helps in allocating scratchpad variables such that shared scratchpad is accessed for short duration. We introduce a new instruction, relssp, that when executed, releases the shared scratchpad. Finally, we describe an analysis for optimal placement of relssp instructions such that shared scratchpad is released as early as possible. We implemented the hardware changes using the GPGPU-Sim simulator and implemented the compiler optimizations in Ocelot framework. We evaluated the effectiveness of our approach on 19 kernels from 3 benchmarks suites: CUDA-SDK, GPGPU-Sim, and Rodinia. The kernels that underutilize scratchpad memory show an average improvement of 19% and maximum improvement of 92.17% compared to the baseline approach

    Multidimensional Scaling with Regional Restrictions for Facet Theory: An Application to Levi's Political Protest Data

    Get PDF
    Multidimensional scaling (MDS) is often used for the analysis of correlation matrices of items generated by a facet theory design. The emphasis of the analysis is on regional hypotheses on the location of the items in the MDS solution. An important regional hypothesis is the axial constraint where the items from different levels of a facet are assumed to be located in different parallel slices. The simplest approach is to do an MDS and draw the parallel lines separating the slices as good as possible by hand. Alternatively, Borg and Shye (1995) propose to automate the second step. Borg and Groenen (1997, 2005) proposed a simultaneous approach for ordered facets when the number of MDS dimensions equals the number of facets. In this paper, we propose a new algorithm that estimates an MDS solution subject to axial constraints without the restriction that the number of facets equals the number of dimensions. The algorithm is based on constrained iterative majorization of De Leeuw and Heiser (1980) with special constraints. This algorithm is applied to Levi’s (1983) data on political protests.Axial Partitioning;Constrained Estimation;Facet Theory;Iterative Majorization;Multidimensional Scaling;Regional Restrictions

    Automated Generation of Geometric Theorems from Images of Diagrams

    Full text link
    We propose an approach to generate geometric theorems from electronic images of diagrams automatically. The approach makes use of techniques of Hough transform to recognize geometric objects and their labels and of numeric verification to mine basic geometric relations. Candidate propositions are generated from the retrieved information by using six strategies and geometric theorems are obtained from the candidates via algebraic computation. Experiments with a preliminary implementation illustrate the effectiveness and efficiency of the proposed approach for generating nontrivial theorems from images of diagrams. This work demonstrates the feasibility of automated discovery of profound geometric knowledge from simple image data and has potential applications in geometric knowledge management and education.Comment: 31 pages. Submitted to Annals of Mathematics and Artificial Intelligence (special issue on Geometric Reasoning

    REPP-H: runtime estimation of power and performance on heterogeneous data centers

    Get PDF
    Modern data centers increasingly demand improved performance with minimal power consumption. Managing the power and performance requirements of the applications is challenging because these data centers, incidentally or intentionally, have to deal with server architecture heterogeneity [19], [22]. One critical challenge that data centers have to face is how to manage system power and performance given the different application behavior across multiple different architectures.This work has been supported by the EU FP7 program (Mont-Blanc 2, ICT-610402), by the Ministerio de Economia (CAP-VII, TIN2015-65316-P), and the Generalitat de Catalunya (MPEXPAR, 2014-SGR-1051). The material herein is based in part upon work supported by the US NSF, grant numbers ACI-1535232 and CNS-1305220.Peer ReviewedPostprint (author's final draft

    A Computational Study of Genetic Crossover Operators for Multi-Objective Vehicle Routing Problem with Soft Time Windows

    Full text link
    The article describes an investigation of the effectiveness of genetic algorithms for multi-objective combinatorial optimization (MOCO) by presenting an application for the vehicle routing problem with soft time windows. The work is motivated by the question, if and how the problem structure influences the effectiveness of different configurations of the genetic algorithm. Computational results are presented for different classes of vehicle routing problems, varying in their coverage with time windows, time window size, distribution and number of customers. The results are compared with a simple, but effective local search approach for multi-objective combinatorial optimization problems
    corecore