Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management

Authors: Bendre Mangesh; Chang Kevin; Parameswaran Aditya; Venkataraman Vipul; Zhou Xinyan
Publication date: 05/10/2017

Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread, to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets, and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make a first step towards this vision: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine for PDM. We develop a natural set of mechanisms for flexibly representing spreadsheet data and demonstrate that identifying the optimal representation is NP-Hard; however, we develop an efficient approach to identify the optimal representation from an important and intuitive subclass of representations. We extend our mechanisms with positional access mechanisms that don't suffer from cascading update issues, leading to constant time access and modification performance. We evaluate these representations on a workload of typical spreadsheets and spreadsheet operations, providing up to 20% reduction in storage, and up to 50% reduction in formula evaluation time

arXiv.org e-Print Archive

Crossref

SWAP Version 3.2. Theory description and user manual

Authors: Dam J.C., van; Groenendijk P.; Hendriks R.F.A.; Jacobs C.M.J.; Kroes J.G.
Publication venue: Alterra

SWAP 3.2 simulates transport of water, solutes and heat in the vadose zone. It describes a domain from the top of canopy into the groundwater which may be in interaction with a surface water system. The program has been developed by Alterra and Wageningen University, and is designed to simulate transport processes at field scale and during whole growing seasons. This is a new release with special emphasis on numerical stability, macro pore flow, and options for detailed meteorological input and linkage to other models. This manual describes the theoretical background, model use, input requirements and output tables.

Wageningen University & Research Publications

Dynamic data transformation for low latency querying in big data systems

Authors: De Turck Filip; Ordonez Ante Leandro; Van Seghbroeck Gregory; Vanhove Thomas; Volckaert Bruno; Wauters Tim
Publication date: 01/01/2017

Ghent University Academic Bibliography

Ecology and co-existence of two endemic day gecko (Phelsuma) species in Seychelles native palm forest

Authors: Bassett; Crawford; Edwards; Fleischmann; Gerlach; Hansen; Hansen; Harmon; Kaiser-Bunbury; Kier; Losos; MacArthur; Myers; Olesen; Pianka; Procter; Procter; Proctor; Radtkey; Raxworthy; Rocha; Rodda; Savage; Schoener; Schoener; Thorpe; Vesey-Fitzgerald; Watson; Whitaker; Whittaker; Williams
Publication venue: Wiley
Publication date: 01/01/2011

In island ecosystems, reptiles play diverse ecological roles as a result of niche broadening, which increases potential niche overlap between species. Ecological niche partitioning is a means of reducing direct competition between coexisting species and differences in habitat use among island gecko species have been suggested as a by-product of specialization to feeding on certain resources. Here, we examine modes and drivers of niche partitioning of two endemic species of Phelsuma gecko (Phelsuma sundbergi and Phelsuma astriata) in relict native palm forest in the Seychelles to further understanding of congeneric reptile co-existence in native habitats. Phelsuma abundance, microhabitat use and habitat composition were quantified in different macrohabitat types. P. sundbergi showed a clear preference for habitat dominated by the coco de mer palm, Lodoicea maldivica and a strong association with male individuals of this dioecious species. P. astriata density increased significantly with arboreal biodiversity but did not display a relationship with a specific tree type. High levels of resource segregation were determined along the microhabitat axis, based on differential tree preference. Our results suggest that P. sundbergi and P. astriata may have evolved to co-exist in this habitat type through partitioning of microhabitat as members of a divergent specialist/generalist assemblage determined by consumption of L. maldivica pollen by P. sundbergi. Our findings concur with the hypothesis that differences in habitat use among island reptiles are a by-product of trophic specialization and support the conservation of native habitat for maintenance of reptile diversity

TUbiblio

Crossref

University of East Anglia digital repository

Continuous Nearest Neighbor Queries over Sliding Windows

Authors: Mouratidis Kyriakos; Papadias Dimitris
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2007

This paper studies continuous monitoring of nearest neighbor (NN) queries over sliding window streams. According to this model, data points continuously stream in the system, and they are considered valid only while they belong to a sliding window that contains 1) the W most recent arrivals (count-based) or 2) the arrivals within a fixed interval W covering the most recent time stamps (time-based). The task of the query processor is to constantly maintain the result of long-running NN queries among the valid data. We present two processing techniques that apply to both count-based and time-based windows. The first one adapts conceptual partitioning, the best existing method for continuous NN monitoring over update streams, to the sliding window model. The second technique reduces the problem to skyline maintenance in the distance-time space and precomputes the future changes in the NN set. We analyze the performance of both algorithms and extend them to variations of NN search. Finally, we compare their efficiency through a comprehensive experimental evaluation. The skyline-based algorithm achieves lower CPU cost, at the expense of slightly larger space overhead.
Index Terms: Location-dependent and sensitive, spatial databases, query processing, nearest neighbors, data streams, sliding windows.
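
The skyline idea mentioned in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a single fixed query point, a count-based window, and a 1-NN query only, and the class name `SlidingWindowNN` is made up for illustration. Under those assumptions, a point is "dominated" when a later arrival is at least as close to the query, because the later arrival will outlive it and always be preferred. The surviving points then form a queue ordered by arrival time with increasing distance, so the current nearest neighbor sits at the front and its successor is already known when it expires.

```python
from collections import deque
import math


class SlidingWindowNN:
    """Toy 1-NN monitor over a count-based sliding window (illustrative only)."""

    def __init__(self, query, window_size):
        self.query = query
        self.window_size = window_size
        self.arrivals = 0          # number of points seen so far
        self.skyline = deque()     # entries: (arrival_index, distance, point)

    def insert(self, point):
        d = math.dist(self.query, point)
        # Older points that are no closer than the new one are dominated:
        # they expire earlier and can never become the nearest neighbor.
        while self.skyline and self.skyline[-1][1] >= d:
            self.skyline.pop()
        self.skyline.append((self.arrivals, d, point))
        self.arrivals += 1
        # Expire points that fell out of the window of the W latest arrivals.
        oldest_valid = self.arrivals - self.window_size
        while self.skyline[0][0] < oldest_valid:
            self.skyline.popleft()

    def nearest(self):
        # The front entry is the oldest surviving point and also the closest.
        return self.skyline[0][2] if self.skyline else None


if __name__ == "__main__":
    nn = SlidingWindowNN(query=(0.0, 0.0), window_size=3)
    for p in [(5.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0), (6.0, 0.0)]:
        nn.insert(p)
        print(p, "->", nn.nearest())
```

Each insertion does amortized constant work, since every point is appended and removed at most once. Handling k-NN, time-based windows, or many concurrent queries, as the paper does, needs more machinery than this sketch shows.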

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Hong Kong University of Science and Technology Institutional Repository

Adaptive inferential sensors based on evolving fuzzy models

Authors: Angelov Plamen; Kordon Arthur
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/04/2010

A new technique to the design and use of inferential sensors in the process industry is proposed in this paper, which is based on the recently introduced concept of evolving fuzzy models (EFMs). They address the challenge that the modern process industry faces today, namely, to develop such adaptive and self-calibrating online inferential sensors that reduce the maintenance costs while keeping the high precision and interpretability/transparency. The proposed new methodology makes possible inferential sensors to recalibrate automatically, which reduces significantly the life-cycle efforts for their maintenance. This is achieved by the adaptive and flexible open-structure EFM used. The novelty of this paper lies in the following: (1) the overall concept of inferential sensors with evolving and self-developing structure from the data streams; (2) the new methodology for online automatic selection of input variables that are most relevant for the prediction; (3) the technique to detect automatically a shift in the data pattern using the age of the clusters (and fuzzy rules); (4) the online standardization technique used by the learning procedure of the evolving model; and (5) the application of this innovative approach to several real-life industrial processes from the chemical industry (evolving inferential sensors, namely, eSensors, were used for predicting the chemical properties of different products in The Dow Chemical Company, Freeport, TX). It should be noted, however, that the methodology and conclusions of this paper are valid for the broader area of chemical and process industries in general. The results demonstrate that well-interpretable and with-simple-structure inferential sensors can automatically be designed from the data stream in real time, which predict various process variables of interest. 
The proposed approach can be used as a basis for the development of a new generation of adaptive and evolving inferential sensors that can address the challenges of the modern advanced process industry.
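
The abstract lists online standardization as one of its contributions but does not say how it is computed. As a rough illustration of the general idea only, the sketch below standardizes a data stream with a running mean and variance (Welford's recursion). Whether the paper uses this particular recursion is an assumption, and the class name `OnlineStandardizer` is hypothetical.

```python
import math


class OnlineStandardizer:
    """Running mean/variance for one input variable (Welford's recursion)."""

    def __init__(self):
        self.n = 0        # samples seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        # Update the statistics with one new sample; no past data is stored.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def standardize(self, x):
        # Return the z-score of x under the current statistics.
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return (x - self.mean) / std if std > 0 else 0.0


if __name__ == "__main__":
    s = OnlineStandardizer()
    for x in [2.0, 4.0, 6.0]:
        s.update(x)
    print(s.mean, s.standardize(6.0))
```

Because the statistics are updated one sample at a time, the scaling can follow the stream without a separate offline calibration pass, which is the property an adaptive sensor needs.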

Lancaster E-Prints

Building Efficient Query Engines in a High-Level Language

Authors: Klonatos Yannis; Koch Christoph; Shaikhha Amir
Publication date: 16/12/2016

Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance, instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain and extend. In this article, we realize this vision in the domain of analytical query processing. We present LegoBase, a query engine written in the high-level language Scala. The key technique to regain efficiency is to apply generative programming: LegoBase performs source-to-source compilation and optimizes the entire query engine by converting the high-level Scala code to specialized, low-level C code. We show how generative programming allows to easily implement a wide spectrum of optimizations, such as introducing data partitioning or switching from a row to a column data layout, which are difficult to achieve with existing low-level query compilers that handle only queries. We demonstrate that sufficiently powerful abstractions are essential for dealing with the complexity of the optimization effort, shielding developers from compiler internals and decoupling individual optimizations from each other. We evaluate our approach with the TPC-H benchmark and show that: (a) With all optimizations enabled, LegoBase significantly outperforms a commercial database and an existing query compiler. (b) Programmers need to provide just a few hundred lines of high-level code for implementing the optimizations, instead of complicated low-level code that is required by existing query compilation approaches. (c) The compilation overhead is low compared to the overall execution time, thus making our approach usable in practice for compiling query engines

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne