44,373 research outputs found
i2MapReduce: Incremental MapReduce for Mining Evolving Big Data
As new data and updates are constantly arriving, the results of data mining
applications become stale and obsolete over time. Incremental processing is a
promising approach to refreshing mining results. It utilizes previously saved
states to avoid the expense of re-computation from scratch.
In this paper, we propose i2MapReduce, a novel incremental processing
extension to MapReduce, the most widely used framework for mining big data.
Compared with the state-of-the-art work on Incoop, i2MapReduce (i) performs
key-value pair level incremental processing rather than task level
re-computation, (ii) supports not only one-step computation but also more
sophisticated iterative computation, which is widely used in data mining
applications, and (iii) incorporates a set of novel techniques to reduce I/O
overhead for accessing preserved fine-grain computation states. We evaluate
i2MapReduce using a one-step algorithm and three iterative algorithms with
diverse computation characteristics. Experimental results on Amazon EC2 show
significant performance improvements of i2MapReduce compared to both plain and
iterative MapReduce performing re-computation
Variability Abstractions: Trading Precision for Speed in Family-Based Analyses (Extended Version)
Family-based (lifted) data-flow analysis for Software Product Lines (SPLs) is
capable of analyzing all valid products (variants) without generating any of
them explicitly. It takes as input only the common code base, which encodes all
variants of a SPL, and produces analysis results corresponding to all variants.
However, the computational cost of the lifted analysis still depends inherently
on the number of variants (which is exponential in the number of features, in
the worst case). For a large number of features, the lifted analysis may be too
costly or even infeasible. In this paper, we introduce variability abstractions
defined as Galois connections and use abstract interpretation as a formal
method for the calculational-based derivation of approximate (abstracted)
lifted analyses of SPL programs, which are sound by construction. Moreover,
given an abstraction we define a syntactic transformation that translates any
SPL program into an abstracted version of it, such that the analysis of the
abstracted SPL coincides with the corresponding abstracted analysis of the
original SPL. We implement the transformation in a tool, reconfigurator that
works on Object-Oriented Java program families, and evaluate the practicality
of this approach on three Java SPL benchmarks.Comment: 50 pages, 10 figure
- …