    Early Accurate Results for Advanced Analytics on MapReduce

    Approximate results based on samples often provide the only way in which advanced analytical applications on massive data sets can satisfy their time and resource constraints. Unfortunately, methods and tools for computing accurate early results are currently not supported in MapReduce-oriented systems, although these are intended for 'big data'. Therefore, we proposed and implemented a non-parametric extension of Hadoop which allows the incremental computation of early results for arbitrary workflows, along with reliable online estimates of the degree of accuracy achieved so far in the computation. These estimates are based on a technique called bootstrapping that has been widely employed in statistics and can be applied to arbitrary functions and data distributions. In this paper, we describe our Early Accurate Result Library (EARL) for Hadoop, which was designed to minimize the changes required to the MapReduce framework. Various tests of EARL on Hadoop are presented to characterize the frequent situations where EARL can provide major speed-ups over the current version of Hadoop.
    Comment: VLDB201
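
    The abstract above names bootstrapping as the basis for its accuracy estimates. The following is a minimal, single-machine sketch of that general idea, not EARL's actual implementation or API: it computes a statistic on a growing sample and uses bootstrap resampling to estimate the standard error, stopping once the relative error falls below a target. The function names (`bootstrap_error`, `early_result`), the doubling schedule, and the default parameters are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_error(sample, statistic, n_resamples=200, seed=0):
    """Return (point estimate, bootstrap standard error) of `statistic` on `sample`.

    Hypothetical helper for illustration; not part of EARL or Hadoop.
    """
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw n items with replacement and recompute the statistic.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistic(sample), statistics.stdev(estimates)


def early_result(data, statistic, target_rel_error=0.05, start=100, seed=0):
    """Grow a random sample until the bootstrap relative error meets the target.

    Returns (estimate, standard error, sample size used). The doubling
    schedule is an assumption chosen for simplicity.
    """
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    size = min(start, len(shuffled))
    while True:
        point, err = bootstrap_error(shuffled[:size], statistic, seed=seed)
        accurate = point != 0 and err / abs(point) <= target_rel_error
        if accurate or size == len(shuffled):
            return point, err, size
        size = min(size * 2, len(shuffled))
```

    Because the resampling step only calls `statistic` on lists, the same loop works for a mean, a median, or any other function of the sample, which mirrors the abstract's point that bootstrapping applies to arbitrary functions and data distributions.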

    Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies

    Business intelligence plays an important role in decision making. Based on data warehouses and Online Analytical Processing, a business intelligence tool can be used to analyze complex data. Still, summarizability issues in data warehouses cause ineffective analyses that may become critical problems for businesses. To settle this issue, many researchers have studied and proposed various solutions, both in relational and XML data warehouses. However, they have difficulty evaluating the performance of their proposals since the available benchmarks lack complex hierarchies. In order to contribute to summarizability analysis, this paper proposes an extension to the XML warehouse benchmark (XWeB) with complex hierarchies. The benchmark enables us to generate XML data warehouses with scalable complex hierarchies as well as summarizability processing. We experimentally demonstrated that complex hierarchies can be included in a benchmark dataset, and that our benchmark is able to compare two alternative approaches dealing with summarizability issues.
    Comment: 15th International Workshop on Data Warehousing and OLAP (DOLAP 2012), Maui : United States (2012

    EAGLE—A Scalable Query Processing Engine for Linked Sensor Data

    Recently, many approaches have been proposed to manage sensor data using semantic web technologies for effective heterogeneous data integration. However, our empirical observations revealed that these solutions primarily focused on semantic relationships and paid less attention to spatio–temporal correlations. Most semantic approaches do not have spatio–temporal support. Some of them have attempted to provide full spatio–temporal support, but perform poorly on complex spatio–temporal aggregate queries. In addition, while the volume of sensor data is rapidly growing, the challenge of querying and managing the massive volumes of data generated by sensing devices remains unsolved. In this article, we introduce EAGLE, a spatio–temporal query engine for querying sensor data based on the linked data model. The ultimate goal of EAGLE is to provide an elastic and scalable system which allows fast searching and analysis with respect to the relationships of space, time and semantics in sensor data. We also extend SPARQL with a set of new query operators in order to support spatio–temporal computing in the linked sensor data context.
    EC/H2020/732679/EU/ACTivating InnoVative IoT smart living environments for AGEing well/ACTIVAGE
    EC/H2020/661180/EU/A Scalable and Elastic Platform for Near-Realtime Analytics for The Graph of Everything/SMARTE

    A framework for effective management of condition based maintenance programs in the context of industrial development of E-Maintenance strategies

    CBM (Condition Based Maintenance) solutions are increasingly present in industrial systems due to two main circumstances: an unprecedented, rapid evolution in the capture and analysis of data, and a significant cost reduction of supporting technologies. CBM programs in industrial systems can become extremely complex, especially when considering the effective introduction of new capabilities provided by PHM (Prognostics and Health Management) and E-maintenance disciplines. In this scenario, any CBM solution involves managing numerous technical aspects that the maintenance manager needs to understand so that the solution can be implemented properly and effectively, in line with the company’s strategy. This paper provides a comprehensive representation of the key components of a generic CBM solution, presented as a framework or supporting structure for the effective management of CBM programs. The concept “symptom of failure”, its corresponding analysis techniques (introduced by ISO 13379-1 and linked with RCM/FMEA analysis), and other international standards for CBM open-software application development (for instance, ISO 13374 and OSA-CBM) are used in the paper for the development of the framework. An original template has been developed, adopting the formal structure of RCM analysis templates, to integrate the information of the PHM techniques used to capture the failure mode behaviour and to manage maintenance. Finally, a case study describes the framework using the referred template.
    Gobierno de Andalucía P11-TEP-7303 M