20,592 research outputs found

Early Accurate Results for Advanced Analytics on MapReduce

Authors: Laptev Nikolay, Zaniolo Carlo, Zeng Kai
Publication date: 01/01/2012

Approximate results based on samples often provide the only way in which advanced analytical applications on very massive data sets can satisfy their time and resource constraints. Unfortunately, methods and tools for the computation of accurate early results are currently not supported in MapReduce-oriented systems, although these are intended for "big data". Therefore, we proposed and implemented a non-parametric extension of Hadoop which allows the incremental computation of early results for arbitrary workflows, along with reliable online estimates of the degree of accuracy achieved so far in the computation. These estimates are based on a technique called bootstrapping that has been widely employed in statistics and can be applied to arbitrary functions and data distributions. In this paper, we describe our Early Accurate Result Library (EARL) for Hadoop, which was designed to minimize the changes required to the MapReduce framework. Various tests of EARL for Hadoop are presented to characterize the frequent situations where EARL can provide major speed-ups over the current version of Hadoop.

Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX
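
The bootstrapping idea named in the abstract above can be illustrated with a small sketch. This is not EARL's code or API: the function names `bootstrap_std_error` and `early_result`, the doubling schedule, and the 5% relative-error threshold are all assumptions chosen for illustration. The sketch grows a random sample until a bootstrap estimate of the statistic's standard error is small relative to the statistic itself.

```python
import random
import statistics


def bootstrap_std_error(sample, statistic, n_resamples=200, rng=None):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = rng or random.Random(0)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Each resample has the same size as the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


def early_result(data, statistic, rel_error=0.05, start=100, rng=None):
    """Grow a random sample until the bootstrap error estimate is small enough.

    Hypothetical helper: the stopping rule and doubling schedule are
    illustrative assumptions, not taken from the paper.
    """
    rng = rng or random.Random(0)
    shuffled = data[:]
    rng.shuffle(shuffled)
    size = min(start, len(shuffled))
    while True:
        sample = shuffled[:size]
        estimate = statistic(sample)
        error = bootstrap_std_error(sample, statistic, rng=rng)
        # Stop when accurate enough, or when all of the data has been used.
        if error <= rel_error * abs(estimate) or size == len(shuffled):
            return estimate, error, size
        size = min(2 * size, len(shuffled))
```

Calling `early_result(data, statistics.mean)` returns the estimate, its bootstrap standard error, and the number of records actually read. Because the bootstrap makes no assumption about the statistic or the data distribution, the same loop works unchanged for a median or any other function of the sample.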

Performance Analysis and Optimization of Sparse Matrix-Vector Multiplication on Modern Multi- and Many-Core Processors

Authors: Elafrou Athena, Goumas Georgios, Koziris Nektarios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/11/2017

This paper presents a low-overhead optimizer for the ubiquitous sparse matrix-vector multiplication (SpMV) kernel. Architectural diversity among different processors, together with structural diversity among different sparse matrices, leads to bottleneck diversity. This justifies an SpMV optimizer that is both matrix- and architecture-adaptive through runtime specialization. To this end, we present an approach that first identifies the performance bottlenecks of SpMV for a given sparse matrix on the target platform, either through profiling or by matrix property inspection, and then selects suitable optimizations to tackle those bottlenecks. Our optimization pool is based on the widely used Compressed Sparse Row (CSR) sparse matrix storage format and has low preprocessing overheads, making our overall approach practical even in cases where fast decision making and optimization setup are required. We evaluate our optimizer on three x86-based computing platforms and demonstrate that it is able to distinguish and appropriately optimize SpMV for the majority of matrices in a representative test suite, leading to significant speedups over the CSR and Inspector-Executor CSR SpMV kernels available in the latest release of the Intel MKL library.

Comment: 10 pages, 7 figures, ICPP 201

arXiv.org e-Print Archive
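
For readers unfamiliar with the CSR format mentioned in the abstract above, the following is a minimal, unoptimized sketch of a CSR SpMV kernel. It is a plain baseline, not the paper's optimizer and not the Intel MKL kernel; the helper names `csr_from_dense` and `spmv_csr` are hypothetical.

```python
def csr_from_dense(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays.

    Returns (values, col_idx, row_ptr), where row i's nonzeros are stored in
    values[row_ptr[i]:row_ptr[i + 1]] with matching column indices in col_idx.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The indirect access x[col_idx[k]] is the irregular memory pattern
        # that makes SpMV performance depend on the matrix structure.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y
```

The inner loop length varies per row and the reads of `x` follow the sparsity pattern, which gives some intuition for why the dominant bottleneck can differ from one matrix and platform to another.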

NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings

Authors: Akoju Sushma A., Dangi Ankit, Romero Oscar J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/06/2019

Current approaches for service composition (assemblies of atomic services) require developers to use: (a) domain-specific semantics to formalize services that restrict the vocabulary for their descriptions, and (b) translation mechanisms for service retrieval to convert unstructured user requests to strongly-typed semantic representations. In our work, we argue that the effort of developing service descriptions, request translations, and matching mechanisms could be reduced by using unrestricted natural language, allowing both: (1) end-users to intuitively express their needs using natural language, and (2) service developers to develop services without relying on syntactic/semantic description languages. Although there are some natural language-based service composition approaches, they restrict service retrieval to syntactic/semantic matching. With recent developments in machine learning and natural language processing, we motivate the use of sentence embeddings by leveraging richer semantic representations of sentences for service description, matching and retrieval. Experimental results show that service composition development effort may be reduced by more than 44% while keeping a high precision/recall when matching high-level user requests with low-level service method invocations.

Comment: This paper will appear on SCC'19 (IEEE International Conference on Services Computing) on July 1

arXiv.org e-Print Archive

Monte Carlo evaluation of sensitivities in computational finance

Author: Giles M. B.
Publication venue: Unspecified
Publication date: 01/01/2007

In computational finance, Monte Carlo simulation is used to compute the correct prices for financial options. More important, however, is the ability to compute the so-called "Greeks", the first and second order derivatives of the prices with respect to input parameters such as the current asset price, interest rate and level of volatility.

This paper discusses the three main approaches to computing Greeks: finite difference, likelihood ratio method (LRM) and pathwise sensitivity calculation. The last of these has an adjoint implementation with a computational cost which is independent of the number of first derivatives to be calculated. We explain how the practical development of adjoint codes is greatly assisted by using Algorithmic Differentiation, and in particular discuss the performance achieved by the FADBAD++ software package which is based on templates and operator overloading within C++.

The pathwise approach is not applicable when the financial payoff function is not differentiable, and even when the payoff is differentiable, the use of scripting in real-world implementations means it can be very difficult in practice to evaluate the derivative of very complex financial products. A new idea is presented to address these limitations by combining the adjoint pathwise approach for the stochastic path evolution with LRM for the payoff evaluation.

CiteSeerX

Oxford University Research Archive
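
The three approaches to computing Greeks named in the abstract above can be compared on a toy problem. The sketch below estimates the delta of a European call, assuming geometric Brownian motion (the Black-Scholes model); that model, the function names, and the parameter values are illustrative assumptions, not taken from the paper. It does not show the adjoint implementation or the combined pathwise/LRM idea.

```python
import math
import random
from statistics import NormalDist


def delta_estimates(s0, k, r, sigma, t, n_paths=50_000, bump=1.0, seed=0):
    """Monte Carlo delta of a European call by three estimators.

    Assumes S_T = s0 * exp((r - sigma^2 / 2) * t + sigma * sqrt(t) * Z),
    with Z standard normal.
    """
    rng = random.Random(seed)
    disc = math.exp(-r * t)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    fd = pw = lrm = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        growth = math.exp(drift + vol * z)  # equals S_T / s0
        s_t = s0 * growth
        payoff = disc * max(s_t - k, 0.0)
        # Finite difference: central bump of s0, reusing the same random number.
        up = disc * max((s0 + bump) * growth - k, 0.0)
        down = disc * max((s0 - bump) * growth - k, 0.0)
        fd += (up - down) / (2.0 * bump)
        # Pathwise: differentiate the payoff along the path (dS_T/ds0 = growth).
        if s_t > k:
            pw += disc * growth
        # Likelihood ratio: payoff times the score of the density of S_T in s0.
        lrm += payoff * z / (s0 * vol)
    return {
        "finite_difference": fd / n_paths,
        "pathwise": pw / n_paths,
        "likelihood_ratio": lrm / n_paths,
    }


def black_scholes_delta(s0, k, r, sigma, t):
    """Closed-form call delta N(d1), used only as a reference value."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    return NormalDist().cdf(d1)
```

For example, `delta_estimates(100.0, 100.0, 0.05, 0.2, 1.0)` can be compared against `black_scholes_delta(100.0, 100.0, 0.05, 0.2, 1.0)`. The likelihood ratio estimator never differentiates the payoff, so it still applies to discontinuous payoffs, whereas the pathwise estimator needs the payoff derivative.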