3,529 research outputs found
Scalable RDF Data Compression using X10
The Semantic Web comprises enormous volumes of semi-structured data elements.
For interoperability, these elements are represented by long strings. Such
representations are not efficient for the purposes of Semantic Web applications
that perform computations over large volumes of information. A typical method
for alleviating the impact of this problem is through the use of compression
methods that produce more compact representations of the data. The use of
dictionary encoding for this purpose is particularly prevalent in Semantic Web
database systems. However, centralized implementations present performance
bottlenecks, giving rise to the need for scalable, efficient distributed
encoding schemes. In this paper, we describe an encoding implementation based
on the asynchronous partitioned global address space (APGAS) parallel
programming model. We evaluate performance on a cluster of up to 384 cores and
datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-art
MapReduce algorithm, we demonstrate a speedup of 2.6-7.4x and excellent
scalability. These results illustrate the strong potential of the APGAS model
for efficient implementation of dictionary encoding and contributes to the
engineering of larger scale Semantic Web applications
Energy-efficient distributed password hash computation on heterogeneous embedded system
This paper presents the improved version of our cool Cracker cluster (cCc), a heterogeneous distributed system for parallel and energy-efficient bcrypt password hash computation. The cluster consists of up to 8 computational units (nodes) with different performances measured in bcrypt hash computations per second [H/s]. In the cluster, nodes are low-power heterogeneous embedded systems with programmable logic containing specialized hash computation accelerators. In the experiments, we used a combination of Xilinx Zynq-series SoC boards and ZTEX 1.15y board which was initially used as a bitcoin miner. Zynq based nodes use the improved version of our custom bcrypt accelerator, which executes the most costly parts of the bcrypt hash computation in programmable logic. The cluster was formed around the famous open-source password cracking software package John the Ripper (abbr. JtR). On the communication layer, we used Message Passing Interface (MPI)library with a standard Ethernet network connecting the nodes. To mitigate the different performances among the cluster nodes and to balance the load, we developed and implemented password candidate distribution scheme based on the passwords\u27 probability distribution, i.e. the order of appearance in the dictionary. We tested individual nodes and the cluster as a whole, trying different combinations of nodes and evaluating our distribution scheme for password candidates. We also compared our cluster with various GPU implementations in terms of performance, energy-efficiency, and price-efficiency. We show that our solution outperforms other platforms such as high-end GPUs, by a factor of at least 3 in terms of energy-efficiency and thus producing less overall cost of password attack than other platforms. In terms of the total operational costs, our cluster pays off after 4500 cracked passwords for a bcrypt hash with cost parameter 12, which makes it more appealing for real-world password-based system attacks. We also demonstrate the scalability of our cCc cluster
Fast and Simple Compact Hashing via Bucketing
Compact hash tables store a set S of n key-value pairs, where the keys are from the universe U = {0, ..., u - 1}, and the values are v-bit integers, in close to B(u, n) + nv bits of space, where B(u, n) = log2 ((u)(n)) is the information-theoretic lower bound for representing the set of keys in S, and support operations insert, delete and lookup on S. Compact hash tables have received significant attention in recent years, and approaches dating back to Cleary [IEEE T. Comput, 1984], as well as more recent ones have been implemented and used in a number of applications. However, the wins on space usage of these approaches are outweighed by their slowness relative to conventional hash tables. In this paper, we demonstrate that compact hash tables based upon a simple idea of bucketing practically outperform existing compact hash table implementations in terms of memory usage and construction time, and existing fast hash table implementations in terms of memory usage (and sometimes also in terms of construction time), while having competitive query times. A related notion is that of a compact hash ID map, which stores a set (S) over cap of n keys from U, and implicitly associates each key in (S) over cap with a unique value (its ID), chosen by the data structure itself, which is an integer of magnitude O(n), and supports inserts and lookups on S, while using space close to B(u, n) bits. One of our approaches is suitable for use as a compact hash ID map.Peer reviewe
- ā¦