266 research outputs found
The Modeling of the ERP Systems within Parallel Calculus
The basic characteristics of ERP systems have been well known for some years: modular design, a central common database, integration of the modules, automatic data transfer between modules, system complexity and flexible configuration. Because of this, a parallel approach is a natural fit for designing and implementing them, drawing on parallel algorithms, parallel calculus and distributed databases. This paper aims to support these assertions and to outline a model of what an ERP system based on parallel computing and algorithms could look like. Keywords: ERP Systems, Modeling, Parallel Calculus, Incremental Model
SaLoBa: Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs
Sequence alignment forms an important backbone in many sequencing
applications. A commonly used strategy for sequence alignment is an approximate
string matching with a two-dimensional dynamic programming approach. Although
some prior work has been conducted on GPU acceleration of a sequence alignment,
we identify several shortcomings that limit exploiting the full computational
capability of modern GPUs. This paper presents SaLoBa, a GPU-accelerated
sequence alignment library focused on seed extension. Based on the analysis of
previous work with real-world sequencing data, we propose techniques to exploit
the data locality and improve workload balancing. The experimental results
reveal that SaLoBa significantly improves the seed extension kernel compared to
state-of-the-art GPU-based methods. Comment: Published at IPDPS'2
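As a rough illustration of the two-dimensional dynamic programming approach the abstract above mentions, the following is a minimal CPU sketch in Python. It is not SaLoBa's GPU kernel: the function name `local_alignment_score`, the scoring values, and the linear gap penalty are all assumptions chosen to keep the example short.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b.

    Smith-Waterman-style recurrence with a linear gap penalty
    (an assumed simplification). Only two rows of the DP table
    are kept in memory at any time.
    """
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment along the diagonal (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Otherwise open a gap in one sequence, or restart at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That dependency pattern is what makes this kind of recurrence a candidate for parallel execution.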
Controlling Disk Contention for Parallel Query Processing in Shared Disk Database Systems
Shared Disk database systems offer high flexibility for parallel transaction and query processing, because each node has access to the entire database and can therefore process any transaction, query or subquery. Compared to Shared Nothing, this is particularly advantageous for scan queries for which the degree of intra-query parallelism as well as the scan processors themselves can dynamically be chosen. On the other hand, there is the danger of disk contention between subqueries, in particular for index scans. We present a detailed simulation study to analyze the effectiveness of parallel scan processing in Shared Disk database systems. In particular, we investigate the relationship between the degree of declustering and the degree of scan parallelism for relation scans, clustered index scans, and non-clustered index scans. Furthermore, we study the usefulness of disk caches and prefetching for limiting disk contention. Finally, we show the importance of dynamically choosing the degree of scan parallelism to control disk contention in multi-user mode.
Analysis of parallel scan processing in Shared Disk database systems
Shared Disk database systems offer high flexibility for parallel transaction and query processing, because each node has access to the entire database and can therefore process any transaction, query or subquery. Compared to Shared Nothing database systems, this is particularly advantageous for scan queries for which the degree of intra-query parallelism as well as the scan processors themselves can dynamically be chosen. On the other hand, there is the danger of disk contention between subqueries, in particular for index scans. We present a detailed simulation study to analyze the effectiveness of parallel scan processing in Shared Disk database systems. In particular, we investigate the relationship between the degree of declustering and the degree of scan parallelism for relation scans, clustered index scans, and non-clustered index scans. Furthermore, we study the usefulness of disk caches and prefetching for limiting disk contention. Finally, we show that disk contention in multi-user mode can be limited for Shared Disk database systems by dynamically choosing the degree of scan parallelism.
Adaptive query parallelization in multi-core column stores
With the rise of multi-core CPU platforms, their optimal utilization
for in-memory OLAP workloads using column store databases has
become one of the biggest challenges. Some of the inherent limitations
in the achievable query parallelism are due to the degree of
parallelism dependency on the data skew, the overheads incurred by
thread coordination, and the hardware resource limits. Finding the
right balance between the degree of parallelism and the multi-core
utilizati
A comparative analysis of leading relational database management systems
http://deepblue.lib.umich.edu/bitstream/2027.42/96903/1/MBA_JayaramanS_1996Final.pd
Integrating Scale Out and Fault Tolerance in Stream Processing using Operator State Management
As users of big data applications expect fresh results, we witness a new breed of stream processing systems (SPS) that are designed to scale to large numbers of cloud-hosted machines. Such systems face new challenges: (i) to benefit from the pay-as-you-go model of cloud computing, they must scale out on demand, acquiring additional virtual machines (VMs) and parallelising operators when the workload increases; (ii) failures are common with deployments on hundreds of VMs - systems must be fault-tolerant with fast recovery times, yet low per-machine overheads. An open question is how to achieve these two goals when stream queries include stateful operators, which must be scaled out and recovered without affecting query results. Our key idea is to expose internal operator state explicitly to the SPS through a set of state management primitives. Based on them, we describe an integrated approach for dynamic scale out and recovery of stateful operators. Externalised operator state is checkpointed periodically by the SPS and backed up to upstream VMs. The SPS identifies individual operator bottlenecks and automatically scales them out by allocating new VMs and partitioning the checkpointed state. At any point, failed operators are recovered by restoring checkpointed state on a new VM and replaying unprocessed tuples. We evaluate this approach with the Linear Road Benchmark on the Amazon EC2 cloud platform and show that it can scale automatically to a load factor of L=350 with 50 VMs, while recovering quickly from failures. Copyright © 2013 ACM
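The checkpoint, partition, and replay steps described in the abstract above can be sketched in a few lines of Python. This is a toy single-process illustration under stated assumptions, not the paper's system or API: `CountingOperator`, `recover`, and `partition_state` are hypothetical names, tuples are assumed to carry a monotonically increasing sequence number, and key-based partitioning with `zlib.crc32` is an assumed choice.

```python
import zlib
from collections import Counter


class CountingOperator:
    """A stateful operator that counts tuples per key (hypothetical example)."""

    def __init__(self):
        self.counts = Counter()
        self.last_seq = -1  # sequence number of the last processed tuple

    def process(self, seq, key):
        if seq <= self.last_seq:
            return  # already reflected in the state; skip during replay
        self.counts[key] += 1
        self.last_seq = seq

    def checkpoint(self):
        """Externalise the operator state as a plain dictionary."""
        return {"counts": dict(self.counts), "last_seq": self.last_seq}

    @classmethod
    def restore(cls, snapshot):
        op = cls()
        op.counts = Counter(snapshot["counts"])
        op.last_seq = snapshot["last_seq"]
        return op


def recover(snapshot, upstream_buffer):
    """Restore a checkpoint, then replay tuples buffered upstream."""
    op = CountingOperator.restore(snapshot)
    for seq, key in upstream_buffer:
        op.process(seq, key)
    return op


def partition_state(snapshot, n):
    """Split checkpointed state across n new operator instances by key hash."""
    parts = [{"counts": {}, "last_seq": snapshot["last_seq"]} for _ in range(n)]
    for key, value in snapshot["counts"].items():
        parts[zlib.crc32(key.encode()) % n]["counts"][key] = value
    return parts
```

Because the sequence number travels with the checkpoint, replaying the whole upstream buffer is safe here: tuples already reflected in the restored state are skipped, so the recovered counts match those of an operator that never failed.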
Parallel computing in information retrieval - An updated review
The progress of parallel computing in Information Retrieval (IR) is reviewed. In particular we stress the importance of the motivation in using parallel computing for Text Retrieval. We analyse parallel IR systems using a classification due to Rasmussen [1] and describe some parallel IR systems. We give a description of the retrieval models used in parallel Information Processing. We describe areas of research which we believe are needed.
Accessing very high dimensional spaces in parallel
Access methods are a fundamental tool in Information Retrieval. However,
most of these methods suffer from the problem known as the curse of dimensionality when
they are applied to objects with very high-dimensional representation spaces, such
as text documents. In this paper we introduce a new parallel access method that uses
several graphs as a distributed index structure and a kNN search algorithm. Two parallel
versions of the search method are presented, one based on a master-slave scheme and
the other based on a pipeline. A thorough experimental analysis on different datasets
shows that our method can efficiently process large flows of queries, compete with
other parallel algorithms and obtain at the same time very high quality results. This research has been supported by the CICYT project TIN2014-53495-R of the
Ministerio de Economía y Competitividad