Search CORE

3,529 research outputs found

Scalable RDF Data Compression using X10

Author: Cheng Long
Kotoulas Spyros
Malik Avinash
Theodoropoulos Georgios
Ward Tomas E
Publication venue
Publication date: 01/01/2014
Field of study

The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are not efficient for the purposes of Semantic Web applications that perform computations over large volumes of information. A typical method for alleviating the impact of this problem is through the use of compression methods that produce more compact representations of the data. The use of dictionary encoding for this purpose is particularly prevalent in Semantic Web database systems. However, centralized implementations present performance bottlenecks, giving rise to the need for scalable, efficient distributed encoding schemes. In this paper, we describe an encoding implementation based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate performance on a cluster of up to 384 cores and datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-art MapReduce algorithm, we demonstrate a speedup of 2.6-7.4x and excellent scalability. These results illustrate the strong potential of the APGAS model for efficient implementation of dictionary encoding and contributes to the engineering of larger scale Semantic Web applications

arXiv.org e-Print Archive

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Energy-efficient distributed password hash computation on heterogeneous embedded system

Author: Guberović Emanuel
Knezović Josip
Pervan Branimir
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2022
Field of study

This paper presents the improved version of our cool Cracker cluster (cCc), a heterogeneous distributed system for parallel and energy-efficient bcrypt password hash computation. The cluster consists of up to 8 computational units (nodes) with different performances measured in bcrypt hash computations per second [H/s]. In the cluster, nodes are low-power heterogeneous embedded systems with programmable logic containing specialized hash computation accelerators. In the experiments, we used a combination of Xilinx Zynq-series SoC boards and ZTEX 1.15y board which was initially used as a bitcoin miner. Zynq based nodes use the improved version of our custom bcrypt accelerator, which executes the most costly parts of the bcrypt hash computation in programmable logic. The cluster was formed around the famous open-source password cracking software package John the Ripper (abbr. JtR). On the communication layer, we used Message Passing Interface (MPI)library with a standard Ethernet network connecting the nodes. To mitigate the different performances among the cluster nodes and to balance the load, we developed and implemented password candidate distribution scheme based on the passwords\u27 probability distribution, i.e. the order of appearance in the dictionary. We tested individual nodes and the cluster as a whole, trying different combinations of nodes and evaluating our distribution scheme for password candidates. We also compared our cluster with various GPU implementations in terms of performance, energy-efficiency, and price-efficiency. We show that our solution outperforms other platforms such as high-end GPUs, by a factor of at least 3 in terms of energy-efficiency and thus producing less overall cost of password attack than other platforms. In terms of the total operational costs, our cluster pays off after 4500 cracked passwords for a bcrypt hash with cost parameter 12, which makes it more appealing for real-world password-based system attacks. We also demonstrate the scalability of our cCc cluster

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Fast and Simple Compact Hashing via Bucketing

Author: Koppl Dominik
Puglisi Simon J.
Raman Rajeev
Publication venue
Publication date: 01/09/2022
Field of study

Compact hash tables store a set S of n key-value pairs, where the keys are from the universe U = {0, ..., u - 1}, and the values are v-bit integers, in close to B(u, n) + nv bits of space, where B(u, n) = log2 ((u)(n)) is the information-theoretic lower bound for representing the set of keys in S, and support operations insert, delete and lookup on S. Compact hash tables have received significant attention in recent years, and approaches dating back to Cleary [IEEE T. Comput, 1984], as well as more recent ones have been implemented and used in a number of applications. However, the wins on space usage of these approaches are outweighed by their slowness relative to conventional hash tables. In this paper, we demonstrate that compact hash tables based upon a simple idea of bucketing practically outperform existing compact hash table implementations in terms of memory usage and construction time, and existing fast hash table implementations in terms of memory usage (and sometimes also in terms of construction time), while having competitive query times. A related notion is that of a compact hash ID map, which stores a set (S) over cap of n keys from U, and implicitly associates each key in (S) over cap with a unique value (its ID), chosen by the data structure itself, which is an integer of magnitude O(n), and supports inserts and lookups on S, while using space close to B(u, n) bits. One of our approaches is suitable for use as a compact hash ID map.Peer reviewe

Helsingin yliopiston digitaalinen arkisto