371,170 research outputs found
Early Accurate Results for Advanced Analytics on MapReduce
Approximate results based on samples often provide the only way in which
advanced analytical applications on very massive data sets can satisfy their
time and resource constraints. Unfortunately, methods and tools for the
computation of accurate early results are currently not supported in
MapReduce-oriented systems, although these are intended for 'big data'.
Therefore, we proposed and implemented a non-parametric extension of Hadoop
which allows the incremental computation of early results for arbitrary
work-flows, along with reliable on-line estimates of the degree of accuracy
achieved so far in the computation. These estimates are based on a technique
called bootstrapping that has been widely employed in statistics and can be
applied to arbitrary functions and data distributions. In this paper, we
describe our Early Accurate Result Library (EARL) for Hadoop that was designed
to minimize the changes required to the MapReduce framework. Various tests of
EARL of Hadoop are presented to characterize the frequent situations where EARL
can provide major speed-ups over the current version of Hadoop.
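A minimal sketch of the bootstrap idea behind such accuracy estimates, applied to a plain in-memory sample rather than a Hadoop workflow; the mean aggregate, sample fractions, and resample count are illustrative assumptions, not EARL's actual implementation.

```python
import numpy as np

def bootstrap_error(sample, statistic=np.mean, n_resamples=200, seed=0):
    """Estimate the standard error of `statistic` by bootstrap resampling."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        resample = rng.choice(sample, size=n, replace=True)  # draw with replacement
        estimates[i] = statistic(resample)
    return statistic(sample), estimates.std(ddof=1)

# Illustration: the estimated error shrinks as a larger fraction of the data is used.
data = np.random.default_rng(1).exponential(scale=3.0, size=10_000)
for frac in (0.01, 0.05, 0.25):
    sample = data[: int(frac * len(data))]
    est, err = bootstrap_error(sample)
    print(f"{frac:>5.0%} of data: mean ~ {est:.3f} +/- {err:.3f}")
```

The same resampling loop works for any aggregate that can be recomputed on a resample, which is the sense in which the technique is non-parametric.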
Approximation of empowerment in the continuous domain
The empowerment formalism offers a goal-independent utility function fully derived from an agent's embodiment. It produces intrinsic motivations which can be used to generate self-organizing behaviours in agents. One obstacle to the application of empowerment in more demanding (esp. continuous) domains is that previous ways of calculating empowerment have been very time consuming and only provided a proof-of-concept. In this paper we present a new approach to efficiently approximate empowerment as a parallel, linear, Gaussian channel capacity problem. We use pendulum balancing to demonstrate this new method, and compare it to earlier approximation methods.
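For reference, the capacity of a set of parallel Gaussian channels under a total power constraint is obtained by water-filling; the sketch below solves that sub-problem only. The noise variances and power budget are illustrative, and the paper's mapping from an agent's empowerment to such a channel is not reproduced here.

```python
import numpy as np

def water_filling_capacity(noise_vars, total_power, tol=1e-9):
    """Capacity (in bits) and power allocation for parallel Gaussian channels."""
    noise = np.asarray(noise_vars, dtype=float)
    lo, hi = noise.min(), noise.max() + total_power
    while hi - lo > tol:                              # bisect on the water level
        level = 0.5 * (lo + hi)
        if np.maximum(level - noise, 0.0).sum() > total_power:
            hi = level
        else:
            lo = level
    power = np.maximum(lo - noise, 0.0)               # channels below the level get power
    capacity = 0.5 * np.log2(1.0 + power / noise).sum()
    return capacity, power

cap, alloc = water_filling_capacity([0.5, 1.0, 2.0], total_power=3.0)
print(f"capacity ~ {cap:.3f} bits, allocation = {np.round(alloc, 3)}")
```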
Audiovisual preservation strategies, data models and value-chains
This is a report on preservation strategies, models and value-chains for digital file-based audiovisual content. The report includes: (a) current and emerging value-chains and business-models for audiovisual preservation; (b) a comparison of preservation strategies for audiovisual content including their strengths and weaknesses, and (c) a review of current preservation metadata models, and requirements for extension to support audiovisual files.
Importance Sampling: Intrinsic Dimension and Computational Cost
The basic idea of importance sampling is to use independent samples from a
proposal measure in order to approximate expectations with respect to a target
measure. It is key to understand how many samples are required in order to
guarantee accurate approximations. Intuitively, some notion of distance between
the target and the proposal should determine the computational cost of the
method. A major challenge is to quantify this distance in terms of parameters
or statistics that are pertinent for the practitioner. The subject has
attracted substantial interest from within a variety of communities. The
objective of this paper is to overview and unify the resulting literature by
creating an overarching framework. A general theory is presented, with a focus
on the use of importance sampling in Bayesian inverse problems and filtering.
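A minimal sketch of self-normalized importance sampling, using a Gaussian proposal to approximate an expectation under a shifted Gaussian target; the densities, test function, and the effective sample size as a rough proxy for proposal-target mismatch are illustrative choices, not results from the paper.

```python
import numpy as np
from scipy.stats import norm

def snis_estimate(f, target, proposal, n_samples, seed=0):
    """Self-normalized importance sampling estimate of E_target[f(X)]."""
    rng = np.random.default_rng(seed)
    x = proposal.rvs(size=n_samples, random_state=rng)   # draw from the proposal
    log_w = target.logpdf(x) - proposal.logpdf(x)        # log importance weights
    w = np.exp(log_w - log_w.max())                      # stabilize, then normalize
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)                           # effective sample size
    return np.sum(w * f(x)), ess

target = norm(loc=2.0, scale=1.0)     # target measure
proposal = norm(loc=0.0, scale=2.0)   # broader proposal measure
est, ess = snis_estimate(np.square, target, proposal, n_samples=5_000)
print(f"E[X^2] ~ {est:.3f} (exact 5.0), effective sample size ~ {ess:.0f}")
```

As the proposal drifts away from the target, the weights concentrate on few samples and the effective sample size collapses, which is one intuitive way the "distance" between the two measures shows up as computational cost.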
…