Optimization of Analytic Window Functions

Author: Cao Yu
Chan Chee-Yong
Li Jie
Tan Kian-Lee
Publication date: 01/01/2012

Analytic functions represent the state-of-the-art way of performing complex data analysis within a single SQL statement. In particular, an important class of analytic functions that has been frequently used in commercial systems to support OLAP and decision support applications is the class of window functions. A window function returns, for each input tuple, a value derived from applying a function over a window of neighboring tuples. However, existing window function evaluation approaches are based on a naive sorting scheme. In this paper, we study the problem of optimizing the evaluation of window functions. We propose several efficient techniques, and identify optimization opportunities that allow us to optimize the evaluation of a set of window functions. We have integrated our scheme into PostgreSQL. Our comprehensive experimental study on the TPC-DS datasets as well as synthetic datasets and queries demonstrates significant speedup over existing approaches.
Comment: VLDB201
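To make the abstract's definition of a window function concrete, here is a minimal Python sketch of a sort-based evaluation of a running sum per partition, roughly what SQL expresses as SUM(amount) OVER (PARTITION BY store ORDER BY day). The function name `running_sum`, the column names, and the sample rows are assumptions made for illustration; this is not the paper's optimized scheme or PostgreSQL's implementation.

```python
from itertools import groupby
from operator import itemgetter


def running_sum(rows, partition_key, order_key, value_key):
    """Sort-based evaluation of a running sum within each partition.

    Hypothetical helper: every input row yields one output row that
    carries the sum of value_key over the rows up to and including it
    in its partition, in order_key order.
    """
    # One sort on (partition, order) puts each partition's rows together
    # and in window order.
    ordered = sorted(rows, key=itemgetter(partition_key, order_key))
    out = []
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        total = 0
        for row in group:
            total += row[value_key]
            out.append({**row, "running_sum": total})
    return out


# Illustrative data only.
sales = [
    {"store": "A", "day": 2, "amount": 5},
    {"store": "B", "day": 1, "amount": 7},
    {"store": "A", "day": 1, "amount": 3},
]
result = running_sum(sales, "store", "day", "amount")
```

With several window functions in one query, each distinct (partition, order) pair would need its own sort under this naive approach, which is the kind of cost the abstract says the paper sets out to reduce.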

arXiv.org e-Print Archive

ScholarBank@NUS

Improving Performance of M-to-N Processing and Data Redistribution in In Transit Analysis and Visualization

Author: Bethel E Wes
Ferrier Nicola
Gu Junmin
Kress James
Logan Jeremey
Loring Burlen
Rizzi Silvio
Shudler Sergei
Wolf Matthew
Publication venue: eScholarship, University of California
Publication date: 25/05/2020

In an in transit setting, a parallel data producer, such as a numerical simulation, runs on one set of ranks M, while a data consumer, such as a parallel visualization application, runs on a different set of ranks N. One of the central challenges in this in transit setting is to determine the mapping of data from the set of M producer ranks to the set of N consumer ranks. This is a challenging problem for several reasons, such as the producer and consumer codes potentially having different scaling characteristics and different data models. The resulting mapping from M to N ranks can have a significant impact on aggregate application performance. In this work, we present an approach for performing this M-to-N mapping in a way that has broad applicability across a diversity of data producer and consumer applications. We evaluate its design and performance with a study that runs at high concurrency on a modern HPC platform. By leveraging design characteristics that facilitate an “intelligent” mapping from M to N, we observe that significant performance gains are possible in terms of several different metrics, including time-to-solution and amount of data moved.
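As a point of reference for what an M-to-N mapping is, the following Python sketch assigns producer ranks to consumer ranks in contiguous blocks. The function `block_map` and the block policy are assumptions chosen for illustration; the abstract does not describe the mapping the authors actually use, and this simple baseline ignores data models and data volume.

```python
def block_map(m, n):
    """Map producer ranks 0..m-1 onto consumer ranks 0..n-1 in blocks.

    Hypothetical baseline: producer p is sent to consumer p * n // m,
    so neighboring producer ranks land on the same consumer.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive")
    mapping = {consumer: [] for consumer in range(n)}
    for producer in range(m):
        mapping[producer * n // m].append(producer)
    return mapping


# Eight producer ranks feeding three consumer ranks.
assignment = block_map(8, 3)
```

When M is smaller than N, some consumers receive no producer under this policy, which is one reason a more careful mapping may matter.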

eScholarship - University of California

Gunrock: GPU Graph Analytics

Author: Davidson Andrew
Liu Weitang
Osama Muhammad
Owens John D.
Pan Yuechao
Riffel Andy T.
Wang Leyuan
Wang Yangzihao
Wu Yuduo
Yang Carl
Yuan Chenshan
Publication date: 04/01/2017

For large-scale graph analytics on the GPU, the irregularity of data access and control flow, and the complexity of programming GPUs, have presented two significant challenges to developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures on a wide range of graph primitives that span from traversal-based algorithms and ranking algorithms to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library.
Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU"
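The frontier-centric, bulk-synchronous style the abstract describes can be illustrated with a breadth-first search written as repeated operations on a vertex frontier. This is a sequential Python sketch for intuition only: `bfs_depths` and the adjacency-dictionary format are assumptions, and nothing here is Gunrock's API or GPU code.

```python
def bfs_depths(adjacency, source):
    """Breadth-first search expressed as steps over a vertex frontier.

    Hypothetical sketch: each loop iteration is one bulk-synchronous
    step that expands the whole current frontier and keeps only the
    vertices not seen before as the next frontier.
    """
    depth = {source: 0}
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        for vertex in frontier:
            # Expand: look at every neighbor of a frontier vertex.
            for neighbor in adjacency.get(vertex, []):
                # Prune: keep only vertices that are newly discovered.
                if neighbor not in depth:
                    depth[neighbor] = level
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return depth


# Small illustrative graph as an adjacency dictionary.
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
depths = bfs_depths(graph, 0)
```

On a GPU, the inner loops over the frontier are the part that would run in parallel, with a synchronization point between steps.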

arXiv.org e-Print Archive

eScholarship - University of California

FigShare

The Design and Implementation of Modern Column-Oriented Database Systems

Author: Abadi D.
Boncz P.A. (Peter)
Harizopoulos S.
Idreos S. (Stratos)
Madden S. (Samuel)
Publication venue: 'Now Publishers'
Publication date: 01/12/2013

CWI's Institutional Repository

An optimized cost-based data allocation model for heterogeneous distributed computing systems

Author: Batth Ranbir Singh
Kaur Sukhpreet
Kumar Dubey Mithilesh
Tarun Sashi
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/12/2022

Continuous attempts have been made to improve the flexibility and effectiveness of distributed computing systems. Extensive effort in the fields of connectivity technologies, network programs, high processing components, and storage has helped to improve results. However, concerns such as slow response, long execution time, and long completion time have been identified as stumbling blocks that hinder performance and require additional attention. These defects increase the total system cost and make the data allocation procedure for a geographically dispersed setup difficult. The load-based architectural model has been strengthened to improve data allocation performance. To do this, an abstract job model is employed, and a data query file containing input data is processed on a directed acyclic graph. The jobs are executed on the processing engine with the lowest execution cost, and the system's total cost is calculated. The total cost is computed by summing the costs of communication, computation, and network. The total cost of the system is then reduced using a swarm intelligence algorithm. In heterogeneous distributed computing systems, the suggested approach attempts to reduce the system's total cost and improve data distribution. According to simulation results, the technique efficiently lowers total system cost and optimizes partitioned data allocation.
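The two rules stated in the abstract, running each job on the engine with the lowest execution cost and summing communication, computation, and network costs, can be written down directly. The Python sketch below assumes hypothetical names (`cheapest_engine`, `total_cost`) and made-up cost values; it does not include the swarm intelligence step, which the abstract does not specify.

```python
def cheapest_engine(execution_costs):
    """Return the engine name with the lowest execution cost.

    Hypothetical helper: execution_costs maps engine name -> cost.
    """
    if not execution_costs:
        raise ValueError("at least one engine is required")
    return min(execution_costs, key=execution_costs.get)


def total_cost(communication, computation, network):
    """Total system cost as the plain sum of its three components."""
    return communication + computation + network


# Illustrative values only, not taken from the paper.
engine = cheapest_engine({"e1": 4.0, "e2": 2.5, "e3": 3.0})
cost = total_cost(communication=1.0, computation=2.5, network=0.5)
```

An optimizer such as a swarm-based search would then look for the allocation that minimizes this total across all jobs.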

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

Cooperating intelligent systems

Author: Rochowiak Daniel

Some of the issues connected to the development of a bureaucratic system are discussed. Emphasis is on a layered multiagent approach to distributed artificial intelligence (DAI). The division of labor in a bureaucracy is considered. The bureaucratic model seems to be a fertile model for further examination since it allows for the growth and change of system components and system protocols and rules. The first part of implementing the system would be the construction of a frame-based reasoner and the appropriate B-agents and E-agents. The agents themselves should act as objects, and the E-objects in particular should have the capability of taking on a different role. No effort was made to address the problems of automated failure recovery, problem decomposition, or implementation. Instead, what has been achieved is a framework that can be developed in several distinct ways, and which provides a core set of metaphors and issues for further research.

NASA Technical Reports Server