Search CORE

4,595 research outputs found

Software Engineering and Complexity in Effective Algebraic Geometry

Author: Heintz Joos
Kuijpers Bart
Paredes Andres Rojas
Publication venue
Publication date: 25/04/2012
Field of study

We introduce the notion of a robust parameterized arithmetic circuit for the evaluation of algebraic families of multivariate polynomials. Based on this notion, we present a computation model, adapted to Scientific Computing, which captures all known branching parsimonious symbolic algorithms in effective Algebraic Geometry. We justify this model by arguments from Software Engineering. Finally we exhibit a class of simple elimination problems of effective Algebraic Geometry which require exponential time to be solved by branching parsimonious algorithms of our computation model.Comment: 70 pages. arXiv admin note: substantial text overlap with arXiv:1201.434

arXiv.org e-Print Archive

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Document Server@UHasselt (Universiteit Hasselt)

On the Complexity of $t$ -Closeness Anonymization and Related Problems

Author: D. Rebollo-Monedero
E. Anshelevich
J. Blocki
J. Cao
L. Sweeney
N. Li
P. Bonizzoni
P. Samarati
P.A. Evans
R. Bredereck
Y. Rubner
Publication venue
Publication date: 01/01/2013
Field of study

An important issue in releasing individual data is to protect the sensitive information from being leaked and maliciously utilized. Famous privacy preserving principles that aim to ensure both data privacy and data integrity, such as

k

-anonymity and

l

-diversity, have been extensively studied both theoretically and empirically. Nonetheless, these widely-adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. The

t

-closeness principle has been proposed to fix this, which also has the benefit of supporting numerical sensitive attributes. However, in contrast to

k

-anonymity and

l

-diversity, the theoretical aspect of

t

-closeness has not been well investigated. We initiate the first systematic theoretical study on the

t

-closeness principle under the commonly-used attribute suppression model. We prove that for every constant

t

such that

0\leq t<1

, it is NP-hard to find an optimal

t

-closeness generalization of a given table. The proof consists of several reductions each of which works for different values of

t

, which together cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of

k

-anonymity and

l

-diversity left in the literature.Comment: An extended abstract to appear in DASFAA 201

arXiv.org e-Print Archive

Crossref

Algebraic optimization of recursive queries

Author: Apers Peter M.G.
Houtsma M.A.W.
Houtsma Maurice A.W.
Publication venue: North Holland
Publication date: 01/01/1992
Field of study

Over the past few years, much attention has been paid to deductive databases. They offer a logic-based interface, and allow formulation of complex recursive queries. However, they do not offer appropriate update facilities, and do not support existing applications. To overcome these problems an SQL-like interface is required besides a logic-based interface.\ud \ud In the PRISMA project we have developed a tightly-coupled distributed database, on a multiprocessor machine, with two user interfaces: SQL and PRISMAlog. Query optimization is localized in one component: the relational query optimizer. Therefore, we have defined an eXtended Relational Algebra that allows recursive query formulation and can also be used for expressing executable schedules, and we have developed algebraic optimization strategies for recursive queries. In this paper we describe an optimization strategy that rewrites regular (in the context of formal grammars) mutually recursive queries into standard Relational Algebra and transitive closure operations. We also describe how to push selections into the resulting transitive closure operations.\ud \ud The reason we focus on algebraic optimization is that, in our opinion, the new generation of advanced database systems will be built starting from existing state-of-the-art relational technology, instead of building a completely new class of systems

CiteSeerX

University of Twente Research Information

Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management

Author: Bendre Mangesh
Chang Kevin
Parameswaran Aditya
Venkataraman Vipul
Zhou Xinyan
Publication venue
Publication date: 05/10/2017
Field of study

Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread, to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets, and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make a first step towards this vision: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine for PDM. We develop a natural set of mechanisms for flexibly representing spreadsheet data and demonstrate that identifying the optimal representation is NP-Hard; however, we develop an efficient approach to identify the optimal representation from an important and intuitive subclass of representations. We extend our mechanisms with positional access mechanisms that don't suffer from cascading update issues, leading to constant time access and modification performance. We evaluate these representations on a workload of typical spreadsheets and spreadsheet operations, providing up to 20% reduction in storage, and up to 50% reduction in formula evaluation time

arXiv.org e-Print Archive

Crossref

Elaboration in Dependent Type Theory

Author: Avigad Jeremy
de Moura Leonardo
Kong Soonho
Roux Cody
Publication venue
Publication date: 17/12/2015
Field of study

To be usable in practice, interactive theorem provers need to provide convenient and efficient means of writing expressions, definitions, and proofs. This involves inferring information that is often left implicit in an ordinary mathematical text, and resolving ambiguities in mathematical expressions. We refer to the process of passing from a quasi-formal and partially-specified expression to a completely precise formal one as elaboration. We describe an elaboration algorithm for dependent type theory that has been implemented in the Lean theorem prover. Lean's elaborator supports higher-order unification, type class inference, ad hoc overloading, insertion of coercions, the use of tactics, and the computational reduction of terms. The interactions between these components are subtle and complex, and the elaboration algorithm has been carefully designed to balance efficiency and usability. We describe the central design goals, and the means by which they are achieved

arXiv.org e-Print Archive

CiteSeerX

Making Queries Tractable on Big Data with Preprocessing

Author: Fan Wenfei
Geerts Floris
Neven Frank
Publication venue
Publication date: 01/01/2013
Field of study

A query class is traditionally considered tractable if there exists a polynomial-time (PTIME) algorithm to answer its queries. When it comes to big data, however, PTIME al-gorithms often become infeasible in practice. A traditional and effective approach to coping with this is to preprocess data off-line, so that queries in the class can be subsequently evaluated on the data efficiently. This paper aims to pro-vide a formal foundation for this approach in terms of com-putational complexity. (1) We propose a set of Π-tractable queries, denoted by ΠT0Q, to characterize classes of queries that can be answered in parallel poly-logarithmic time (NC) after PTIME preprocessing. (2) We show that several natu-ral query classes are Π-tractable and are feasible on big data. (3) We also study a set ΠTQ of query classes that can be ef-fectively converted to Π-tractable queries by re-factorizing its data and queries for preprocessing. We introduce a form of NC reductions to characterize such conversions. (4) We show that a natural query class is complete for ΠTQ. (5) We also show that ΠT0Q ⊂ P unless P = NC, i.e., the set ΠT0Q of all Π-tractable queries is properly contained in the set P of all PTIME queries. Nonetheless, ΠTQ = P, i.e., all PTIME query classes can be made Π-tractable via proper re-factorizations. This work is a step towards understanding the tractability of queries in the context of big data. 1

CiteSeerX

Edinburgh Research Explorer

Algorithms for Provisioning Queries and Analytics

Author: Assadi Sepehr
Khanna Sanjeev
Li Yang
Tannen Val
Publication venue
Publication date: 18/12/2015
Field of study

Provisioning is a technique for avoiding repeated expensive computations in what-if analysis. Given a query, an analyst formulates

k

hypotheticals, each retaining some of the tuples of a database instance, possibly overlapping, and she wishes to answer the query under scenarios, where a scenario is defined by a subset of the hypotheticals that are "turned on". We say that a query admits compact provisioning if given any database instance and any

k

hypotheticals, one can create a poly-size (in

k

) sketch that can then be used to answer the query under any of the

2^{k}

possible scenarios without accessing the original instance. In this paper, we focus on provisioning complex queries that combine relational algebra (the logical component), grouping, and statistics/analytics (the numerical component). We first show that queries that compute quantiles or linear regression (as well as simpler queries that compute count and sum/average of positive values) can be compactly provisioned to provide (multiplicative) approximate answers to an arbitrary precision. In contrast, exact provisioning for each of these statistics requires the sketch size to be exponential in

k

. We then establish that for any complex query whose logical component is a positive relational algebra query, as long as the numerical component can be compactly provisioned, the complex query itself can be compactly provisioned. On the other hand, introducing negation or recursion in the logical component again requires the sketch size to be exponential in

k

. While our positive results use algorithms that do not access the original instance after a scenario is known, we prove our lower bounds even for the case when, knowing the scenario, limited access to the instance is allowed

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server