19 research outputs found

Characterization of XML Functional Dependencies and their Interaction with DTDs

Author: Kot Lucja
White Walker
Publication venue: 'SAGE Publications'
Publication date: 31/07/2006

With the rise of XML as a standard model of data exchange, XML functional dependencies (XFDs) have become important to areas such as key analysis, document normalization, and data integrity. XFDs are more complicated than relational functional dependencies because the set of XFDs satisfied by an XML document depends not only on the document values, but also on the tree structure and corresponding DTD. In particular, constraints imposed by DTDs may alter the implications from a base set of XFDs, and may even be inconsistent with a set of XFDs. In this paper we examine the interaction between XFDs and DTDs. We present a sound and complete axiomatization for XFDs, both alone and in the presence of certain classes of DTDs. We show that these DTD classes form an axiomatic hierarchy, with the axioms at each level a proper superset of the previous. Furthermore, we show that consistency checking with respect to a set of XFDs is feasible for these same classes.

eCommons@Cornell

The Complexity of Social Coordination

Author: Gehrke Johannes
Kot Lucja
Mamouras Konstantinos
Oren Sigal
Seeman Lior
Publication venue
Publication date: 01/01/2012

Coordination is a challenging everyday task; just think of the last time you organized a party or a meeting involving several people. As a growing part of our social and professional life goes online, an opportunity for an improved coordination process arises. Recently, Gupta et al. proposed entangled queries as a declarative abstraction for data-driven coordination, where the difficulty of the coordination task is shifted from the user to the database. Unfortunately, evaluating entangled queries is very hard, and thus previous work considered only a restricted class of queries that satisfy safety (the coordination partners are fixed) and uniqueness (all queries need to be satisfied). In this paper we significantly extend the class of feasible entangled queries beyond uniqueness and safety. First, we show that we can simply drop uniqueness and still efficiently evaluate a set of safe entangled queries. Second, we show that as long as all users coordinate on the same set of attributes, we can give an efficient algorithm for coordination even if the set of queries does not satisfy safety. In an experimental evaluation we show that our algorithms are feasible for a wide spectrum of coordination scenarios. Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis

Author: Bender Gabriel
Ding Bailu
Foster Nate
Gehrke Johannes
Hojjat Hossein
Koch Christoph
Kot Lucja
Roy Sudip
Publication venue
Publication date: 19/01/2015

Datastores today rely on distribution and replication to achieve improved performance and fault-tolerance. But correctness of many applications depends on strong consistency properties, something that can impose substantial overheads, since it requires coordinating the behavior of multiple nodes. This paper describes a new approach to achieving strong consistency in distributed systems while minimizing communication between nodes. The key insight is to allow the state of the system to be inconsistent during execution, as long as this inconsistency is bounded and does not affect transaction correctness. In contrast to previous work, our approach uses program analysis to extract semantic information about permissible levels of inconsistency and is fully automated. We then employ a novel homeostasis protocol to allow sites to operate independently, without communicating, as long as any inconsistency is governed by appropriate treaties between the nodes. We discuss mechanisms for optimizing treaties based on workload characteristics to minimize communication, as well as a prototype implementation and experiments that demonstrate the benefits of our approach on common transactional benchmarks.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Fine-grained disclosure control for app ecosystems

Author: Bender Gabriel
Gehrke Johannes
Koch Christoph
Kot Lucja
Publication venue: ACM Press
Publication date: 01/01/2013

The modern computing landscape contains an increasing number of app ecosystems, where users store personal data on platforms such as Facebook or smartphones. APIs enable third-party applications (apps) to utilize that data. A key concern associated with app ecosystems is the confidentiality of user data. In this paper, we develop a new model of disclosure in app ecosystems. In contrast with previous solutions, our model is data-derived and semantically meaningful. Information disclosure is modeled in terms of a set of distinguished security views. Each query is labeled with the precise set of security views that is needed to answer it, and these labels drive policy decisions. We explain how our disclosure model can be used in practice and provide algorithms for labeling conjunctive queries for the case of single-atom security views. We show that our approach is useful by demonstrating the scalability of our algorithms and by applying it to the real-world disclosure control system used by Facebook.

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Entangled queries: enabling declarative data-driven coordination

Author: Bender Gabriel
Gehrke Johannes
Gupta Nitin
Koch Christoph
Kot Lucja
Roy Sudip
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/06/2011

Many data-driven social and Web applications involve collaboration and coordination. The vision of Declarative Data-Driven Coordination (D3C), proposed in Kot et al. [2010], is to support coordination in the spirit of data management: to make it data-centric and to specify it using convenient declarative languages. This article introduces entangled queries, a language that extends SQL by constraints that allow for the coordinated choice of result tuples across queries originating from different users or applications. It is nontrivial to define a declarative coordination formalism without arriving at the general (NP-complete) Constraint Satisfaction Problem from AI. In this article, we propose an efficiently enforceable syntactic safety condition that we argue is at the sweet spot where interesting declarative power meets applicability in large-scale data management systems and applications. The key computational problem of D3C is to match entangled queries to achieve coordination. We present an efficient matching algorithm which statically analyzes query workloads and merges coordinating entangled queries into compound SQL queries. These can be sent to a standard database system and return only coordinated results. We present the overall architecture of an implemented system that contains our evaluation algorithm. We also describe a proof-of-concept Facebook application we have built on top of this system to allow friends to coordinate flight plans. Finally, we evaluate the performance of the matching algorithm experimentally on realistic coordination workloads.

Infoscience - École polytechnique fédérale de Lausanne

Coordination through querying in the Youtopia system

Author: Bender Gabriel
Gehrke Johannes
Gupta Nitin
Koch Christoph
Kot Lucja
Roy Sudip
Publication venue
Publication date: 01/01/2011

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Kleene Algebra and Bytecode Verification

Author: Kot Lucja
Kozen Dexter
Publication venue: 'SAGE Publications'
Publication date: 20/12/2004

Most standard approaches to the static analysis of programs, such as the popular worklist method, are first-order methods that inductively annotate program points with abstract values. In a recent paper we introduced a second-order approach based on Kleene algebra. In this approach, the primary objects of interest are not the abstract data values, but the transfer functions that manipulate them. These elements form a Kleene algebra. The dataflow labeling is not achieved by inductively labeling the program with abstract values, but rather by computing the star (Kleene closure) of a matrix of transfer functions. In this paper we show how this general framework applies to the problem of Java bytecode verification. We show how to specify transfer functions arising in Java bytecode verification in such a way that the Kleene algebra operations (join, composition, star) can be computed efficiently. We also give a hybrid dataflow analysis algorithm that computes the closure of a matrix on a cutset of the control flow graph, thereby avoiding the recalculation of dataflow information along long paths. This method could potentially improve the performance over the standard worklist algorithm when a small cutset can be found.
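
As a rough illustration of what "computing the star of a matrix" means, the sketch below takes the Kleene closure of a matrix over the Boolean semiring, where join is `or`, composition is `and`, and the star of a matrix is the reflexive-transitive closure of the graph it encodes. This is a simplified stand-in and not the paper's algorithm: the paper's matrix entries are transfer functions, not Booleans, and the name `matrix_star` and the three-node graph are assumptions made for the example.

```python
def matrix_star(a):
    """Kleene closure A* = I + A + A^2 + ... of a square Boolean matrix.

    Assumption: Boolean entries stand in for the transfer functions
    described in the abstract; join is `or`, composition is `and`.
    """
    n = len(a)
    # Start from I + A: every node reaches itself, plus the direct edges.
    star = [[a[i][j] or i == j for j in range(n)] for i in range(n)]
    # Warshall-style elimination: admit paths through intermediate node k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                star[i][j] = star[i][j] or (star[i][k] and star[k][j])
    return star


# Hypothetical control-flow graph: 0 -> 1 -> 2, with a back edge 2 -> 1.
edges = [
    [False, True, False],
    [False, False, True],
    [False, True, False],
]
closure = matrix_star(edges)
```

Entry `closure[i][j]` is true exactly when node `j` is reachable from node `i` along zero or more edges. With richer entries, the same elimination scheme would also need the star of each diagonal element, which is trivially true in the Boolean case.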

Elsevier - Publisher Connector

eCommons@Cornell

Second-Order Abstract Interpretation via Kleene Algebra

Author: Kot Lucja
Kozen Dexter
Publication venue: 'SAGE Publications'
Publication date: 20/12/2004

eCommons@Cornell

Youtopia: A Community Database Management System

Author: Kot Lucja
Publication venue
Publication date: 09/04/2010

This thesis introduces Youtopia, a system for collaborative management of relational data. In the age of Web 2.0, the sharing of relational data by communities is an increasingly important phenomenon. As a data management setting, it poses unique technical challenges which warrant dedicated solutions. We focus on two key aspects of Youtopia functionality: data cleanliness maintenance and handling interference between user tasks – or transactions – that access the same data. We present a set of operations for maintaining data cleanliness in a collaborative manner. A unique feature of our operation set is the mechanism for enforcing tuple-generating dependencies (a formalism for constraint maintenance). In Youtopia, these are enforced through a process based on the classical chase that involves both automated and human-assisted steps. Next, we examine the issue of transaction interference. We provide a suitable definition of serializability for the Youtopia transaction model and present an algorithm framework that can be used to enforce it. We illustrate our framework with an extended case study that includes experimental results from the implementation of our algorithms. Finally, we begin the investigation of concurrency control notions other than serializability. We present several isolation levels for transactions which are less restrictive than full serializability, but can be implemented with much more lightweight mechanisms and thus enable the system to achieve better throughput and latency in the processing of user operations.

eCommons@Cornell