The Family of MapReduce and Large Scale Data Processing Systems
In the last two decades, the continuous increase of computational power has
produced an overwhelming flow of data which has called for a paradigm shift in
the computing architecture and large scale data processing mechanisms.
MapReduce is a simple and powerful programming model that enables easy
development of scalable parallel applications to process vast amounts of data
on large clusters of commodity machines. It isolates the application from the
details of running a distributed program such as issues on data distribution,
scheduling and fault tolerance. However, the original implementation of the
MapReduce framework had some limitations that have been tackled by many
research efforts in several follow-up works after its introduction. This article
provides a comprehensive survey of a family of approaches and mechanisms for
large scale data processing that have been implemented based on the
original idea of the MapReduce framework and are currently gaining a lot of
momentum in both research and industrial communities. We also cover a set of
systems that have been implemented to provide declarative
programming interfaces on top of the MapReduce framework. In addition, we
review several large scale data processing systems that resemble some of the
ideas of the MapReduce framework for different purposes and application
scenarios. Finally, we discuss some of the future research directions for
implementing the next generation of MapReduce-like solutions.
Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other authors
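As a minimal illustration of the programming model summarized in the abstract above, the following single-process Python sketch runs a word count through a map phase, a grouping (shuffle) step, and a reduce phase. The names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical and are not taken from the surveyed article or from any MapReduce implementation; a real framework would also handle data distribution, scheduling, and fault tolerance, which this sketch omits.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: list[int]) -> tuple[str, int]:
    """Reduce phase: sum all counts that were grouped under one word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[tuple[str, int]]],
    reducer: Callable[[str, list[int]], tuple[str, int]],
) -> dict[str, int]:
    """Hypothetical in-memory driver: map, group by key, then reduce."""
    groups: dict[str, list[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # stands in for the shuffle step
    return dict(reducer(key, values) for key, values in groups.items())


lines = ["map reduce map", "reduce shuffle"]
word_counts = run_mapreduce(lines, map_words, reduce_counts)
print(word_counts)
```

The application code is confined to the two small functions; everything about how records are grouped and processed lives in the driver, which is the separation of concerns the abstract attributes to the model.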
The perimeter of uniform and geometric words: a probabilistic analysis
Let a word be a sequence of i.i.d. integer random variables. The
perimeter of the word is the number of edges of the word, seen as a
polyomino. In this paper, we present a probabilistic approach to the
computation of the moments of the perimeter. This is applied to uniform and geometric
random variables. We also show that, asymptotically, the distribution of the perimeter is
Gaussian and, seen as a stochastic process, the perimeter converges in
distribution to a Brownian motion.
Comment: 13 pages, 7 figures
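The perimeter definition in the abstract above can be made concrete with a short Python sketch. It assumes (this is an assumption, not stated in the abstract) that each letter is a positive integer giving the height of one unit-width column, so the word forms a bargraph polyomino. Under that assumption the perimeter is the count of top and bottom edges plus the left wall, the right wall, and the height differences between neighbouring columns. The function name `perimeter` and the sampling parameters are illustrative only.

```python
import random


def perimeter(word: list[int]) -> int:
    """Count boundary unit edges of the bargraph with column heights `word`.

    Assumes every height is a positive integer (hypothetical convention).
    """
    if not word:
        return 0
    if any(height < 1 for height in word):
        raise ValueError("column heights must be positive integers")
    # Each column contributes one top edge and one bottom edge.
    horizontal = 2 * len(word)
    # Vertical edges: left wall, right wall, and the exposed steps in between.
    steps = sum(abs(b - a) for a, b in zip(word, word[1:]))
    vertical = word[0] + word[-1] + steps
    return horizontal + vertical


# A uniform random word: n letters drawn independently from {1, ..., k}.
rng = random.Random(0)
n, k = 10, 4
uniform_word = [rng.randint(1, k) for _ in range(n)]
print(uniform_word, perimeter(uniform_word))
```

For example, the word 2, 1, 3 has 6 horizontal edges and 2 + 3 + (1 + 2) = 8 vertical edges, for a perimeter of 14.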
MINARET: A Recommendation Framework for Scientific Reviewers
We are witnessing a continuous growth in the size of scientific communities and the number of scientific publications. This phenomenon requires a continuous effort to ensure the quality of publications and a healthy scientific evaluation process. Peer reviewing is the de facto mechanism for assessing the quality of scientific work. For journal editors, managing an efficient and effective manuscript peer review process is not a straightforward task. In particular, a main component of the journal editor's role is, for each submitted manuscript, to select adequate reviewers who are: 1) matched in their research interests to the topic of the submission, 2) fair in their evaluation of the submission, i.e., have no conflict of interest with the authors, and 3) qualified in terms of various aspects including scientific impact, previous review/authorship experience for the journal, quality of the reviews, etc. Thus, manually selecting and assessing adequate reviewers is becoming a tedious and time-consuming task. We demonstrate MINARET, a recommendation framework for selecting scientific reviewers. The framework helps journal editors conduct an efficient and effective scientific review process. It exploits the valuable information available on modern scholarly websites (e.g., Google Scholar, ACM DL, DBLP, Publons) to identify candidate reviewers relevant to the topic of the manuscript, filter them (e.g., excluding those with a potential conflict of interest), and rank them based on several metrics configured by the editor (user). The framework extracts the information required for the recommendation process from the online resources on the fly, which ensures that the output recommendations are dynamic and based on up-to-date information.
SUPER: Social-based business process management framework
© Springer International Publishing Switzerland 2015. In this demo paper, we present SUPER, standing for Social-based bUsiness Process managEment fRamework, which leverages social computing principles for the design and development of social business processes (aka business processes 2.0). SUPER identifies task, person, and machine as the core components of a business process. Afterwards, SUPER establishes a set of execution and social relations to illustrate how tasks (and likewise persons and machines) are connected together. The social relations help build a configuration network of tasks, a social network of persons, and a support network of machines that capture the ongoing interactions during business process execution.