DataSpread: Unifying Databases and Spreadsheets.

Author: Aditya Parameswaran
Bofan Sun
Ding Zhang
Kevin Chang
Mangesh Bendre
Shy-yauer Lin
Xinyan Zhou
Publication venue: eScholarship, University of California
Publication date: 01/08/2015

Spreadsheet software is often the tool of choice for ad-hoc tabular data management, processing, and visualization, especially on tiny data sets. On the other hand, relational database systems offer significant power, expressivity, and efficiency over spreadsheet software for data management, while lacking in ease of use and ad-hoc analysis capabilities. We demonstrate DataSpread, a data exploration tool that holistically unifies databases and spreadsheets. It continues to offer a Microsoft Excel-based spreadsheet front-end, while in parallel managing all the data in a back-end database, specifically PostgreSQL. DataSpread retains all the advantages of spreadsheets, including ease of use, ad-hoc analysis and visualization capabilities, and a schema-free nature, while also adding the advantages of traditional relational databases, such as scalability and the ability to use arbitrary SQL to import, filter, or join external or internal tables and have the results appear in the spreadsheet. DataSpread needs to reason about and reconcile differences in the notions of schema, addressing of cells and tuples, and the current pane (which exists in spreadsheets but not in traditional databases), and support data modifications at both the front-end and the back-end. Our demonstration will center on our first, early prototype of DataSpread, and will give attendees a sense of the enormous data exploration capabilities offered by unifying spreadsheets and databases.

CiteSeerX

eScholarship - University of California

Time-Aware Probabilistic Knowledge Graphs

Author: Chekol Melisachew Wudage
Stuckenschmidt Heiner
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 26th International Symposium on Temporal Representation and Reasoning (TIME 2019)
Publication date: 01/01/2019

The emergence of open information extraction as a tool for constructing and expanding knowledge graphs has aided the growth of temporal data, for instance, YAGO, NELL and Wikidata. While YAGO and Wikidata maintain the valid time of facts, NELL records the time point at which a fact is retrieved from some Web corpora. Collectively, these knowledge graphs (KGs) store facts extracted from Wikipedia and other sources. Due to the imprecise nature of the extraction tools that are used to build and expand KGs such as NELL, the facts in the KG are weighted (each weight is a confidence value representing the correctness of a fact). Additionally, NELL can be considered a transaction time KG because every fact is associated with its extraction date. On the other hand, YAGO and Wikidata use the valid time model because they maintain facts together with their validity time (temporal scope). In this paper, we propose a bitemporal model (one that combines the transaction and valid time models) for maintaining and querying bitemporal probabilistic knowledge graphs. We study coalescing and the scalability of marginal and MAP inference. Moreover, we show that the complexity of reasoning tasks in atemporal probabilistic KGs carries over to the bitemporal setting. Finally, we report our evaluation results for the proposed model.

Dagstuhl Research Online Publication Server

MonetDB/XQuery - Consistent & Efficient Updates on the Pre/Post Plane

Author: Boncz Peter
Flokstra Jan
Grust Torsten
Keulen Maurice van
Manegold Stefan
Mullender Sjoerd
Rittinger Jan
Teubner Jens
Publication venue: Springer
Publication date: 01/01/2006

Relational XQuery processors aim at leveraging mature relational DBMS query processing technology to provide scalability and efficiency. To achieve this goal, various storage schemes have been proposed to encode the tree structure of XML documents in flat relational tables. Basically, two classes can be identified: (1) encodings using fixed-length surrogates, like the preorder ranks in the pre/post encoding [5] or the equivalent pre/size/level encoding [8], and (2) encodings using variable-length surrogates, such as ORDPATH [9] or P-PBiTree [12]. Recent research [1] showed a clear advantage of the former for efficient evaluation of XPath location steps, exploiting techniques like cheap node order tests, positional lookup, and node skipping in staircase join [7]. However, once updates are involved, variable-length surrogates are often considered the better choice, mainly because a straightforward implementation of structural XML updates using fixed-length surrogates faces two performance bottlenecks: (i) high physical cost (the preorder ranks of all nodes following the update position must be modified, on average 50% of the document), and (ii) low transaction concurrency (updating the size of all ancestor nodes causes lock contention on the document root).
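
The update cost described in the abstract above can be illustrated with a minimal sketch. This is not the MonetDB/XQuery implementation; the names `Node`, `encode`, and `is_descendant` and the five-node sample tree are hypothetical, chosen only to show how pre/post ranks support a cheap structural test and why an insertion shifts the preorder rank of every node that follows the update position.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory tree node standing in for an XML element."""
    name: str
    children: list = field(default_factory=list)


def encode(root):
    """Return {name: (pre, post)} ranks from a single depth-first traversal."""
    ranks = {}
    pre_counter = 0
    post_counter = 0

    def visit(node):
        nonlocal pre_counter, post_counter
        pre = pre_counter          # rank assigned when the node is first entered
        pre_counter += 1
        for child in node.children:
            visit(child)
        ranks[node.name] = (pre, post_counter)  # post rank assigned on exit
        post_counter += 1

    visit(root)
    return ranks


def is_descendant(ranks, d, a):
    """d is a descendant of a iff pre(a) < pre(d) and post(d) < post(a)."""
    pre_d, post_d = ranks[d]
    pre_a, post_a = ranks[a]
    return pre_a < pre_d and post_d < post_a


# Sample tree (an assumption for illustration): a(b(c, d), e)
tree = Node("a", [Node("b", [Node("c"), Node("d")]), Node("e")])
before = encode(tree)

# Structural update: insert a new node x as the first child of b.
tree.children[0].children.insert(0, Node("x"))
after = encode(tree)

# Nodes whose preorder rank had to change because of the insertion.
shifted = sorted(n for n in before if before[n][0] != after[n][0])
print(shifted)
```

In this sketch every node after the insertion point in document order (`c`, `d`, and `e`) receives a new preorder rank, which is the physical cost that fixed-length surrogates incur on a structural update.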

CiteSeerX

University of Twente Research Information

Efficient Processing of Exact Top-k Queries over Disk-Resident Sorted Lists

Author: A. Marian
A. Silberschatz
A. Spink
B. Arai
B. Bloom
Baihua Zheng
D.D. Lewis
F. Korn
G. Adomavicius
H.P. Hung
HweeHwa Pang
K. Yi
L. Zhu
M. Hua
M. Theobald
M.A. Soliman
M.L. Yiu
N. Bruno
N. Mamoulis
R. Baeza-Yates
R. Fagin
S. Brin
S. Chaudhuri
S. Hwang
Xuhua Ding
Y. Tao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/06/2010