
571 research outputs found

Deterministic load balancing and dictionaries in the parallel disk model

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2006

Full-speed scalability of the pDomus platform for DHTs

Author: Alves Albano, Exposto José, Lopes Rui Pedro, Pina António, Rufino José
Publication venue: 'ACTA Press'
Publication date: 01/01/2007

Domus is an architecture for Distributed Hash Tables (DHTs) tailored to a shared-all cluster environment. Domus DHTs build on a (dynamic) set of cluster nodes; each node may perform routing and/or storage tasks, for one or more DHTs, as a function of the node's base (static) resources and of its (dynamic) state. Domus DHTs also benefit from a rich set of user-level attributes and operations. pDomus is a prototype of Domus that provides an environment in which to evaluate the architecture's concepts and features. In this paper, we present a set of experiments conducted to obtain figures of merit on the scalability of a specific DHT operation, with several lookup methods and storage technologies. The evaluation also involves a comparison with a database and a P2P-oriented DHT platform. The results are promising, and a motivation for further work. Supported by PRODEP III (grant 5.3/N/199.006/00) and SAPIENS (grant 41739/CHS/2001).

Biblioteca Digital do IPB

Algorithm Libraries for Multi-Core Processors

Author: Singler Johannes
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2010

By providing parallelized versions of established algorithm libraries, we make it easier for programmers to exploit the multiple cores of modern processors. The Multi-Core STL provides basic algorithms for internal memory, while the parallelized STXXL enables multi-core acceleration for algorithms on large data sets stored on disk. Some parallelized geometric algorithms are introduced into CGAL. Further, we design and implement sorting algorithms for huge data in distributed external memory.

KITopen

The Family of MapReduce and Large Scale Data Processing Systems

Author: Anna Liu, Ayman G. Fayoumi, Sherif Sakr
Publication date: 12/02/2013

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architectures and large scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several follow-up works after its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.

Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author
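The programming model named in this abstract can be illustrated with a minimal single-process sketch. It is not taken from the surveyed article or from any MapReduce framework: the names `map_words`, `reduce_counts` and `run_mapreduce` are hypothetical, and the in-memory grouping step only stands in for the distributed shuffle that a real framework would handle, along with scheduling and fault tolerance.

```python
from collections import defaultdict


def map_words(document):
    """Map phase: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce phase: combine all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(inputs, mapper, reducer):
    """Hypothetical driver: map every record, group by key, then reduce.

    The grouping dictionary plays the role of the shuffle step; a real
    framework would partition the keys across many machines instead.
    """
    groups = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_mapreduce(["a b a", "b c"], map_words, reduce_counts))
```

The application code is limited to the two small functions; everything about how records are distributed and grouped lives in the driver, which is the separation the abstract describes.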

arXiv.org e-Print Archive

CiteSeerX

dsmcFoam+: An OpenFOAM based direct simulation Monte Carlo solver

Author: B. John, C. White, D.R. Emerson, J.M. Reese, M.K. Borg, S.M. Longshaw, T.J. Scanlon
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

dsmcFoam+ is a direct simulation Monte Carlo (DSMC) solver for rarefied gas dynamics, implemented within the OpenFOAM software framework, and parallelised with MPI. It is open-source and released under the GNU General Public License in a publicly available software repository that includes detailed documentation and tutorial DSMC gas flow cases. This release of the code includes many features not found in standard dsmcFoam, such as molecular vibrational and electronic energy modes, chemical reactions, and subsonic pressure boundary conditions. Since dsmcFoam+ is designed entirely within OpenFOAM’s C++ object-oriented framework, it benefits from a number of key features: the code emphasises extensibility and flexibility, so it is intended first and foremost as a research tool for DSMC, allowing new models and test cases to be developed and tested rapidly. Setting up a DSMC case is as straightforward as setting up any standard OpenFOAM case, as dsmcFoam+ relies upon the standard OpenFOAM dictionary based directory structure. This ensures that useful pre- and post-processing capabilities provided by OpenFOAM remain available even though the fully Lagrangian nature of a DSMC simulation is not typical of most OpenFOAM applications. We show that dsmcFoam+ compares well to other well-known DSMC codes and to analytical solutions in terms of benchmark results.

Crossref

University of Strathclyde Institutional Repository

Edinburgh Research Explorer

Enlighten

ePubs: the open archive for STFC research publications

MergedTrie: Efficient textual indexing

Author: Ferrández Antonio, Peral Jesús
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019

The accessing and processing of textual information (i.e. the storing and querying of a set of strings) is especially important for many current applications (e.g. information retrieval and social networks), particularly when working in the fields of Big Data or IoT, which require the handling of very large string dictionaries. Typical data structures for textual indexing are Hash Tables and some variants of Tries such as the Double Trie (DT). In this paper, we propose an extension of the DT that we have called MergedTrie. It improves the DT compression by merging both Tries into a single one and by segmenting the indexed term into two fixed length parts in order to balance the new Trie. Thus, a higher overlapping of both prefixes and suffixes is obtained. Moreover, we propose a new implementation of Tries that achieves better compression rates than the Double-Array representation usually chosen for implementing Tries. Our proposal also overcomes the limitation of static implementations that do not allow insertions and updates in their compact representations. Finally, our MergedTrie implementation experimentally improves the efficiency of the Hash Tables, the DTs, the Double-Array, the Crit-bit, the Directed Acyclic Word Graphs (DAWG), and the Acyclic Deterministic Finite Automata (ADFA) data structures, requiring less space than the original text to be indexed. This study has been partially funded by the SEQUOIA-UA (TIN2015-63502-C3-3-R) and the RESCATA (TIN2015-65100-R) projects of the Spanish Ministry of Economy and Competitiveness (MINECO).
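For readers unfamiliar with the baseline structure, the following is a minimal sketch of a plain Trie used as a string dictionary. It is not the MergedTrie, the Double Trie or the Double-Array representation from the paper; the class name `Trie`, the `_END` marker and the `edge_count` helper are assumptions made for illustration. It only shows the prefix sharing that those structures build on.

```python
_END = ""  # hypothetical end-of-term marker; cannot collide with a 1-character key


class Trie:
    """Plain dictionary-of-dictionaries trie (illustrative, not the paper's structure)."""

    def __init__(self):
        self.root = {}

    def insert(self, term):
        """Store a term, reusing nodes for any prefix that is already present."""
        node = self.root
        for ch in term:
            node = node.setdefault(ch, {})
        node[_END] = True

    def __contains__(self, term):
        """Return True only if the exact term was inserted (not just a prefix)."""
        node = self.root
        for ch in term:
            if ch not in node:
                return False
            node = node[ch]
        return _END in node

    def edge_count(self):
        """Count character edges, i.e. characters stored after prefix sharing."""
        def count(node):
            return sum(1 + count(child) for key, child in node.items() if key != _END)
        return count(self.root)


if __name__ == "__main__":
    trie = Trie()
    for term in ["car", "cart", "care"]:
        trie.insert(term)
    print("cart" in trie, "ca" in trie, trie.edge_count())
```

Unlike a static compact representation, this naive form allows insertions at any time, at the cost of much higher memory use per node.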

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Engineering Aggregation Operators for Relational In-Memory Database Systems

Author: Müller Ingo
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2016

In this thesis we study the design and implementation of aggregation operators in the context of relational in-memory database systems. In particular, we identify and address the following challenges: cache-efficiency, CPU-friendliness, parallelism within and across processors, robust handling of skewed data, adaptive processing, processing with constrained memory, and integration with modern database architectures. Our resulting algorithm outperforms the state-of-the-art by up to 3.7x.

KITopen