2,610 research outputs found
Optimization of Analytic Window Functions
Analytic functions represent the state-of-the-art way of performing complex
data analysis within a single SQL statement. In particular, an important class
of analytic functions that has been frequently used in commercial systems to
support OLAP and decision support applications is the class of window
functions. A window function returns for each input tuple a value derived from
applying a function over a window of neighboring tuples. However, existing
window function evaluation approaches are based on a naive sorting scheme. In
this paper, we study the problem of optimizing the evaluation of window
functions. We propose several efficient techniques, and identify optimization
opportunities that allow us to optimize the evaluation of a set of window
functions. We have integrated our scheme into PostgreSQL. Our comprehensive
experimental study on the TPC-DS datasets as well as synthetic datasets and
queries demonstrate significant speedup over existing approaches.Comment: VLDB201
Recommended from our members
Improving Performance of M-to-N Processing and Data Redistribution in In Transit Analysis and Visualization
In an in transit setting, a parallel data producer, such as a numerical simulation, runs on one set of ranks M, while a data consumer, such as a parallel visualization application, runs on a different set of ranks N. One of the central challenges in this in transit setting is to determine the mapping of data from the set of M producer ranks to the set of N consumer ranks. This is a challenging problem for several reasons, such as the producer and consumer codes potentially having different scaling characteristics and different data models. The resulting mapping from M to N ranks can have a significant impact on aggregate application performance. In this work, we present an approach for performing this M-to-N mapping in a way that has broad applicability across a diversity of data producer and consumer applications. We evaluate its design and performance with
a study that runs at high concurrency on a modern HPC platform. By leveraging design characteristics, which facilitate an “intelligent” mapping from M-to-N, we observe significant performance gains are possible in terms of several different metrics, including time-to-solution and amount of data moved
Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU
An optimized cost-based data allocation model for heterogeneous distributed computing systems
Continuous attempts have been made to improve the flexibility and effectiveness of distributed computing systems. Extensive effort in the fields of connectivity technologies, network programs, high processing components, and storage helps to improvise results. However, concerns such as slowness in response, long execution time, and long completion time have been identified as stumbling blocks that hinder performance and require additional attention. These defects increased the total system cost and made the data allocation procedure for a geographically dispersed setup difficult. The load-based architectural model has been strengthened to improve data allocation performance. To do this, an abstract job model is employed, and a data query file containing input data is processed on a directed acyclic graph. The jobs are executed on the processing engine with the lowest execution cost, and the system's total cost is calculated. The total cost is computed by summing the costs of communication, computation, and network. The total cost of the system will be reduced using a Swarm intelligence algorithm. In heterogeneous distributed computing systems, the suggested approach attempts to reduce the system's total cost and improve data distribution. According to simulation results, the technique efficiently lowers total system cost and optimizes partitioned data allocation
Cooperating intelligent systems
Some of the issues connected to the development of a bureaucratic system are discussed. Emphasis is on a layer multiagent approach to distributed artificial intelligence (DAI). The division of labor in a bureaucracy is considered. The bureaucratic model seems to be a fertile model for further examination since it allows for the growth and change of system components and system protocols and rules. The first part of implementing the system would be the construction of a frame based reasoner and the appropriate B-agents and E-agents. The agents themselves should act as objects and the E-objects in particular should have the capability of taking on a different role. No effort was made to address the problems of automated failure recovery, problem decomposition, or implementation. Instead what has been achieved is a framework that can be developed in several distinct ways, and which provides a core set of metaphors and issues for further research
- …