i2MapReduce: Incremental MapReduce for Mining Evolving Big Data
As new data and updates are constantly arriving, the results of data mining
applications become stale and obsolete over time. Incremental processing is a
promising approach to refreshing mining results. It utilizes previously saved
states to avoid the expense of re-computation from scratch.
In this paper, we propose i2MapReduce, a novel incremental processing
extension to MapReduce, the most widely used framework for mining big data.
Compared with the state-of-the-art work on Incoop, i2MapReduce (i) performs
key-value pair level incremental processing rather than task level
re-computation, (ii) supports not only one-step computation but also more
sophisticated iterative computation, which is widely used in data mining
applications, and (iii) incorporates a set of novel techniques to reduce I/O
overhead for accessing preserved fine-grain computation states. We evaluate
i2MapReduce using a one-step algorithm and three iterative algorithms with
diverse computation characteristics. Experimental results on Amazon EC2 show
significant performance improvements of i2MapReduce compared to both plain and
iterative MapReduce performing re-computation.
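
To make the idea of key-value pair level incremental processing concrete, here is a minimal single-process sketch using word count as the one-step job. It is not i2MapReduce's implementation or API: the class `IncrementalWordCount`, the method `apply_delta`, and the in-memory state layout are all hypothetical names chosen for illustration. When one document changes, only the key-value pairs that document emitted are retracted and re-emitted, and the preserved reduce results for every other key are reused.

```python
from collections import Counter


def map_doc(text):
    """Map step: emit one (word, 1) pair per word in a document."""
    return [(word, 1) for word in text.lower().split()]


class IncrementalWordCount:
    """Toy preserved-state store (hypothetical, not the i2MapReduce API)."""

    def __init__(self):
        self.docs = {}           # doc_id -> last seen text (saved map input)
        self.counts = Counter()  # word -> count (saved reduce output)

    def apply_delta(self, doc_id, new_text):
        """Insert, update, or delete (new_text=None) a single document.

        Returns the set of keys whose reduce result actually changed.
        """
        delta = Counter()
        old_text = self.docs.pop(doc_id, None)
        if old_text is not None:
            # Retract the pairs the old version of the document emitted.
            for word, n in map_doc(old_text):
                delta[word] -= n
        if new_text is not None:
            self.docs[doc_id] = new_text
            # Emit the pairs for the new version.
            for word, n in map_doc(new_text):
                delta[word] += n

        changed = set()
        for word, d in delta.items():
            if d == 0:
                continue  # retraction and re-emission cancel: key untouched
            self.counts[word] += d
            if self.counts[word] == 0:
                del self.counts[word]
            changed.add(word)
        return changed


def recompute(docs):
    """Baseline: recompute every count from scratch over all documents."""
    total = Counter()
    for text in docs.values():
        for word, n in map_doc(text):
            total[word] += n
    return total
```

For example, after loading documents `"big data big"` and `"data mining"`, updating the first to `"big data"` touches only the key `big`; the counts for `data` and `mining` come straight from the saved state, and the result matches `recompute`.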
Robust Source-Free Domain Adaptation for Fundus Image Segmentation
Unsupervised Domain Adaptation (UDA) is a learning technique that transfers
knowledge learned in the source domain from labelled training data to the
target domain with only unlabelled data. It is of significant importance to
medical image segmentation because of the usual lack of labelled training data.
Although extensive efforts have been made to optimize UDA techniques to improve
the accuracy of segmentation models in the target domain, few studies have
addressed the robustness of these models under UDA. In this study, we propose a
two-stage training strategy for robust domain adaptation. In the source
training stage, we utilize adversarial sample augmentation to enhance the
robustness and generalization capability of the source model. In the target
training stage, we propose a novel robust pseudo-label and pseudo-boundary
(PLPB) method, which effectively utilizes unlabelled target data to generate
pseudo labels and pseudo boundaries that enable model self-adaptation without
requiring source data. Extensive experimental results on cross-domain fundus
image segmentation confirm the effectiveness and versatility of our method.
Source code of this study is openly accessible at
https://github.com/LinGrayy/PLPB.
Comment: 10 pages, WACV202
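
As a rough illustration of what pseudo labels and pseudo boundaries can look like, the sketch below binarizes a grid of predicted foreground probabilities and then marks the foreground pixels that touch background. This is a generic construction and not the PLPB method itself: the function names, the fixed threshold of 0.75, and the 4-connectivity boundary rule are all assumptions made for the example. The authors' actual procedure is in the repository linked above.

```python
def pseudo_labels(probs, threshold=0.75):
    """Binarize a 2D grid of foreground probabilities into 0/1 pseudo labels.

    The threshold value is an assumption for illustration only.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def pseudo_boundary(labels):
    """Mark foreground pixels adjacent (4-connectivity) to background or the image edge."""
    h, w = len(labels), len(labels[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if labels[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or labels[ny][nx] == 0:
                    out[y][x] = 1
                    break
    return out
```

On a 5x5 probability map whose central 3x3 block is confidently foreground, `pseudo_labels` returns that block as ones, and `pseudo_boundary` keeps its outer ring while leaving the single interior pixel unmarked.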