155 research outputs found

    The Family of MapReduce and Large Scale Data Processing Systems

    In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architecture and large scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in follow-up works since its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.
    Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other authors
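As a rough illustration of the programming model this abstract describes, the following is a minimal single-process sketch of a word count written as map, shuffle and reduce steps. It is not taken from the surveyed article; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would distribute these steps across machines and handle scheduling and fault tolerance itself.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework in practice)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence on an in-memory list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

The application code is limited to `map_phase` and `reduce_phase`; everything else stands in for what the framework would provide.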

    The perimeter of uniform and geometric words: a probabilistic analysis

    Let a word be a sequence of $n$ i.i.d. integer random variables. The perimeter $P$ of the word is the number of edges of the word, seen as a polyomino. In this paper, we present a probabilistic approach to the computation of the moments of $P$. This is applied to uniform and geometric random variables. We also show that, asymptotically, the distribution of $P$ is Gaussian and, seen as a stochastic process, the perimeter converges in distribution to a Brownian motion.
    Comment: 13 pages, 7 figures
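To make the definition of the perimeter concrete, here is a small sketch under one assumption that the abstract does not spell out: the word is read as a bargraph, i.e. letter number i is a column of that many unit cells standing on a common baseline, and all letters are at least 1. The function name `perimeter` is hypothetical. Under that reading, the boundary consists of one top and one bottom edge per column, the outer sides of the first and last columns, and the height differences between neighbouring columns.

```python
def perimeter(word):
    """Count boundary edges of the bargraph whose column heights are `word`.

    Assumes every letter is an integer >= 1 (an assumption, not stated in the abstract).
    """
    if not word:
        return 0
    if any(height < 1 for height in word):
        raise ValueError("all letters must be integers >= 1")
    # One top edge and one bottom edge per column.
    horizontal = 2 * len(word)
    # Left side of the first column, right side of the last column,
    # plus the exposed vertical edges between adjacent columns.
    vertical = word[0] + word[-1] + sum(abs(a - b) for a, b in zip(word, word[1:]))
    return horizontal + vertical
```

For a single letter 1 this gives the four edges of a unit square, which is a quick sanity check on the formula.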

    MINARET: A Recommendation Framework for Scientific Reviewers

    We are witnessing a continuous growth in the size of scientific communities and the number of scientific publications. This phenomenon requires a continuous effort to ensure the quality of publications and a healthy scientific evaluation process. Peer reviewing is the de facto mechanism to assess the quality of scientific work. For journal editors, managing an efficient and effective manuscript peer review process is not a straightforward task. In particular, a main component of the journal editor's role is, for each submitted manuscript, to select adequate reviewers who need to be: 1) matching in their research interests with the topic of the submission, 2) fair in their evaluation of the submission, i.e., with no conflict of interest with the authors, 3) qualified in terms of various aspects including scientific impact, previous review/authorship experience for the journal, quality of the reviews, etc. Thus, manually selecting and assessing adequate reviewers is becoming a tedious and time-consuming task. We demonstrate MINARET, a recommendation framework for selecting scientific reviewers. The framework facilitates the job of journal editors in conducting an efficient and effective scientific review process. The framework exploits the valuable information available on modern scholarly websites (e.g., Google Scholar, ACM DL, DBLP, Publons) for identifying candidate reviewers relevant to the topic of the manuscript, filtering them (e.g., excluding those with a potential conflict of interest), and ranking them based on several metrics configured by the editor (user). The framework extracts the required information for the recommendation process from the online resources on the fly, which ensures that the output recommendations are dynamic and based on up-to-date information.

    SUPER: Social-based business process management framework

    © Springer International Publishing Switzerland 2015. In this demo paper, we present SUPER, which stands for Social-based bUsiness Process managEment fRamework and leverages social computing principles for the design and development of social business processes (aka business processes 2.0). SUPER identifies task, person, and machine as the core components of a business process. Afterwards, SUPER establishes a set of execution and social relations to illustrate how tasks (also persons and machines) are connected together. The social relations help build a configuration network of tasks, a social network of persons, and a support network of machines that capture the ongoing interactions during business process execution.