Search CORE

44,373 research outputs found

Putting Context into Schema Matching

Author: Bohannon Philip
Elnahrawy Eiman
Fan Wenfei
Flaster Michael
Publication venue
Publication date: 01/01/2006
Field of study

i2MapReduce: Incremental MapReduce for Mining Evolving Big Data

Author: Chen Shimin
Wang Qiang
Yu Ge
Zhang Yanfeng
Publication venue
Publication date: 20/01/2015
Field of study

As new data and updates are constantly arriving, the results of data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results. It utilizes previously saved states to avoid the expense of re-computation from scratch. In this paper, we propose i2MapReduce, a novel incremental processing extension to MapReduce, the most widely used framework for mining big data. Compared with the state-of-the-art work on Incoop, i2MapReduce (i) performs key-value pair level incremental processing rather than task level re-computation, (ii) supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications, and (iii) incorporates a set of novel techniques to reduce I/O overhead for accessing preserved fine-grain computation states. We evaluate i2MapReduce using a one-step algorithm and three iterative algorithms with diverse computation characteristics. Experimental results on Amazon EC2 show significant performance improvements of i2MapReduce compared to both plain and iterative MapReduce performing re-computation

arXiv.org e-Print Archive

CiteSeerX

Crossref

Variability Abstractions: Trading Precision for Speed in Family-Based Analyses (Extended Version)

Author: Brabrand Claus
Dimovski Aleksandar S.
Wąsowski Andrzej
Publication venue
Publication date: 16/03/2015
Field of study

Family-based (lifted) data-flow analysis for Software Product Lines (SPLs) is capable of analyzing all valid products (variants) without generating any of them explicitly. It takes as input only the common code base, which encodes all variants of a SPL, and produces analysis results corresponding to all variants. However, the computational cost of the lifted analysis still depends inherently on the number of variants (which is exponential in the number of features, in the worst case). For a large number of features, the lifted analysis may be too costly or even infeasible. In this paper, we introduce variability abstractions defined as Galois connections and use abstract interpretation as a formal method for the calculational-based derivation of approximate (abstracted) lifted analyses of SPL programs, which are sound by construction. Moreover, given an abstraction we define a syntactic transformation that translates any SPL program into an abstracted version of it, such that the analysis of the abstracted SPL coincides with the corresponding abstracted analysis of the original SPL. We implement the transformation in a tool, reconfigurator that works on Object-Oriented Java program families, and evaluate the practicality of this approach on three Java SPL benchmarks.Comment: 50 pages, 10 figure

arXiv.org e-Print Archive

CiteSeerX