i2MapReduce: Incremental MapReduce for Mining Evolving Big Data
As new data and updates are constantly arriving, the results of data mining
applications become stale and obsolete over time. Incremental processing is a
promising approach to refreshing mining results. It utilizes previously saved
states to avoid the expense of re-computation from scratch.
In this paper, we propose i2MapReduce, a novel incremental processing
extension to MapReduce, the most widely used framework for mining big data.
Compared with the state-of-the-art work on Incoop, i2MapReduce (i) performs
key-value pair level incremental processing rather than task level
re-computation, (ii) supports not only one-step computation but also more
sophisticated iterative computation, which is widely used in data mining
applications, and (iii) incorporates a set of novel techniques to reduce I/O
overhead for accessing preserved fine-grain computation states. We evaluate
i2MapReduce using a one-step algorithm and three iterative algorithms with
diverse computation characteristics. Experimental results on Amazon EC2 show
significant performance improvements of i2MapReduce compared to both plain and
iterative MapReduce performing re-computation.
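
The idea of refreshing results at the key-value pair level from saved state, rather than recomputing from scratch, can be illustrated with a toy word count. This is only a minimal single-process sketch under stated assumptions: it is not the i2MapReduce API or implementation, and the names map_words, full_count, and apply_delta are hypothetical.

```python
from collections import Counter

def map_words(doc):
    """Map step: emit (word, 1) pairs for one document."""
    return [(word, 1) for word in doc.split()]

def full_count(docs):
    """Baseline: recompute every count from scratch over all documents."""
    counts = Counter()
    for doc in docs.values():
        for word, n in map_words(doc):
            counts[word] += n
    return counts

def apply_delta(counts, old_doc, new_doc):
    """Refresh saved counts in place, touching only the key-value pairs
    produced by the changed document (hypothetical helper)."""
    for word, n in map_words(old_doc):
        counts[word] -= n
        if counts[word] == 0:
            del counts[word]  # drop keys whose count fell to zero
    for word, n in map_words(new_doc):
        counts[word] += n
    return counts

docs = {"d1": "big data mining", "d2": "mining evolving data"}
saved = full_count(docs)  # preserved state from the initial run

# Document d1 changes; only its pairs are re-processed.
updated = apply_delta(saved, docs["d1"], "big graph mining")
docs["d1"] = "big graph mining"
```

After the update, the refreshed counts match what a full recomputation over the modified input would produce, while only the pairs from the changed document were visited.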
Dynamics of social contagions with local trend imitation
Research on social contagion dynamics has not yet included a theoretical
analysis of the ubiquitous local trend imitation (LTI) characteristic. We
propose a social contagion model with a tent-like adoption probability
distribution to investigate the effect of this LTI characteristic on behavior
spreading. We also propose a generalized edge-based compartmental theory to
describe the proposed model. Through extensive numerical simulations and
theoretical analyses, we find a crossover in the phase transition: when the LTI
capacity is strong, the growth of the final behavior adoption size exhibits a
second-order phase transition. When the LTI capacity is weak, we see a
first-order phase transition. For a given behavioral information transmission
probability, there is an optimal LTI capacity that maximizes the final behavior
adoption size. Finally, we find that the above phenomena are not qualitatively
affected by the heterogeneous degree distribution. Our suggested theory agrees
with the simulation results.
Comment: 14 pages, 5 figures
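
A tent-like adoption probability can be sketched as a piecewise-linear function that peaks at an intermediate value. The abstract does not give the exact functional form, so the version below is an assumption made for illustration: x stands for the fraction of a node's neighbors that have adopted the behavior, a is a hypothetical parameter locating the peak, and the function name is invented here.

```python
def tent_adoption_probability(x, a):
    """Piecewise-linear tent: rises from 0 at x = 0 to 1 at x = a,
    then falls back to 0 at x = 1 (assumed form, not taken from the paper)."""
    if not 0.0 < a < 1.0:
        raise ValueError("peak location a must lie strictly between 0 and 1")
    if not 0.0 <= x <= 1.0:
        raise ValueError("fraction x must lie in [0, 1]")
    if x <= a:
        return x / a
    return (1.0 - x) / (1.0 - a)

# Adoption is most likely when the adopted fraction equals the peak location.
peak = tent_adoption_probability(0.5, 0.5)
low = tent_adoption_probability(0.25, 0.5)
high = tent_adoption_probability(0.75, 0.5)
```

Under this assumed form, a node is less likely to adopt both when few neighbors have adopted and when nearly all have, which is the qualitative shape the abstract calls tent-like.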