6 research outputs found

    Load Balancing in MapReduce Based on Scalable Cardinality Estimates

    Abstract: MapReduce has emerged as a popular tool for distributed and scalable processing of massive data sets and is increasingly being used in e-science applications. Unfortunately, the performance of MapReduce systems strongly depends on an even data distribution, while scientific data sets are often highly skewed. The resulting load imbalance, which raises the processing time, is further amplified by the high runtime complexities of the reducer tasks. An adaptive load balancing strategy is required for appropriate skew handling. In this paper, we address the problem of estimating the cost of the tasks that are distributed to the reducers based on a given cost model. A realistic cost estimation is the basis for adaptive load balancing algorithms and requires gathering statistics from the mappers. This is challenging: (a) Since the statistics from all mappers must be integrated, the mapper statistics must be small. (b) Although each mapper sees only a small fraction of the data, the integrated statistics must capture the global data distribution. (c) The mappers terminate after sending the statistics to the controller, and no second round is possible. Our solution to these challenges consists of two components. First, a monitoring component executed on every mapper captures the local data distribution and identifies its most relevant subset for cost estimation. Second, an integration component aggregates these subsets and approximates the global data distribution. I
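
The two components named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the names `mapper_summary`, `integrate`, and `reducer_cost` are hypothetical, the "most relevant subset" is assumed to be the k most frequent keys per mapper, and the quadratic cost model is an assumption chosen only to show why skew is amplified.

```python
from collections import Counter

def mapper_summary(keys, k):
    """Monitoring step (assumed form): keep only the k most frequent
    local keys plus the total number of tuples seen by this mapper."""
    counts = Counter(keys)
    return dict(counts.most_common(k)), sum(counts.values())

def integrate(summaries):
    """Integration step (assumed form): sum the reported counts per key.
    Tuples whose keys were not reported are returned as one residual."""
    reported = Counter()
    total = 0
    for top_keys, n in summaries:
        reported.update(top_keys)
        total += n
    unreported = total - sum(reported.values())
    return reported, unreported

def reducer_cost(cardinality):
    """Hypothetical cost model: quadratic in the size of a key group."""
    return cardinality ** 2

# Two mappers, each reporting its 2 most frequent keys in a single round.
m1 = mapper_summary(["a"] * 5 + ["b"] * 2 + ["c"], k=2)
m2 = mapper_summary(["a"] * 3 + ["d"] * 4 + ["b"], k=2)
estimate, residual = integrate([m1, m2])
costs = {key: reducer_cost(n) for key, n in estimate.items()}
```

In this sketch the reported counts are lower bounds on the true global counts: key "b" appears three times overall but is estimated at two, because the second mapper did not report it. The residual shows how many tuples the estimate leaves unassigned.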

    HiSbase: Histogram-based P2P Main Memory Data Management

    Many e-science communities, e.g., medicine, climatology, and astrophysics, are overwhelmed by the exponentially growing data volumes that need to be accessible by collaborating researchers. Nowadays, new scientific results are often obtained by explorin

    Defining enthesitis in spondyloarthritis by ultrasound: Results of a Delphi process and of a reliability reading exercise

    Objective: To standardize ultrasound (US) in enthesitis. Methods: An initial Delphi exercise was undertaken to define US-detected enthesitis and its core components. These definitions were subsequently tested on static images taken from spondyloarthritis patients in order to evaluate their reliability. Results: Excellent agreement (>80%) was obtained for including hypoechogenicity, increased thickness of the tendon insertion, calcifications, enthesophytes, erosions, and Doppler activity as core elementary lesions of US-detected enthesitis. US definitions were subsequently obtained for each elementary component. On static images, the intraobserver reliability showed a high degree of variability for the detection of elementary lesions, with kappa coefficients ranging from 0.13 to 1. The interobserver kappa values were variable, with the lowest kappa coefficient for enthesophytes (0.24) and the highest coefficient for Doppler activity at the enthesis (0.63). Conclusion: This is the first consensus-based US definition of enthesitis and its elementary components and the first step performed to ensure a higher degree of homogeneity and comparability of results between studies and in daily clinical work.