On Independence Atoms and Keys
Uniqueness and independence are two fundamental properties of data. Enforcing
them in database systems can lead to higher-quality data, faster data service
response times, better data-driven decision making, and knowledge discovery
from data. These applications can be unlocked effectively by providing
efficient solutions to the underlying implication problems of keys and
independence atoms. Indeed, for the sole class of keys and the sole class of
independence atoms, the associated finite and general implication problems
coincide and enjoy simple axiomatizations. However, the situation changes
drastically when keys and independence atoms are combined. We show that the
finite and the general implication problems already differ for keys and
unary independence atoms. Furthermore, we establish a finite axiomatization for
the general implication problem, and show that the finite implication problem
does not enjoy a k-ary axiomatization for any k.
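To make the two constraint classes concrete, the following sketch (illustrative Python, not drawn from the paper) checks a key and an independence atom on a small relation represented as a list of attribute-value dictionaries. It also exhibits a relation in which the independence atom A ⊥ B holds while A is not a key.

```python
from itertools import product

def satisfies_key(rows, key):
    """A key X holds if no two distinct tuples agree on all attributes in X."""
    seen = set()
    for t in rows:
        proj = tuple(t[a] for a in key)
        if proj in seen:
            return False
        seen.add(proj)
    return True

def satisfies_independence_atom(rows, x, y):
    """X ⊥ Y holds if every X-value occurring in the relation appears
    together with every Y-value occurring in the relation."""
    xs = {tuple(t[a] for a in x) for t in rows}
    ys = {tuple(t[a] for a in y) for t in rows}
    pairs = {(tuple(t[a] for a in x), tuple(t[a] for a in y)) for t in rows}
    return all((vx, vy) in pairs for vx, vy in product(xs, ys))

# A relation over attributes A, B in which A ⊥ B holds but A is not a key:
r = [{"A": 0, "B": 0}, {"A": 0, "B": 1}, {"A": 1, "B": 0}, {"A": 1, "B": 1}]
print(satisfies_independence_atom(r, ["A"], ["B"]))  # True
print(satisfies_key(r, ["A"]))                       # False
```

The finite relation r makes the tension visible: independence forces value combinations to repeat, while a key forbids repetition on its attributes.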
On Shapley Value in Data Assemblage Under Independent Utility
In many applications, an organization may want to acquire data from many data
owners. Data marketplaces allow data owners to produce the data assemblages
needed by data buyers through coalition. To encourage coalitions to produce
data, it is critical to allocate revenue to data owners in a fair manner
according to their contributions. Although Shapley fairness and its
alternatives have been well explored in the literature to facilitate revenue
allocation in data assemblage, computing the exact Shapley value for many data
owners and large assembled data sets remains challenging due to the
combinatorial nature of the Shapley value. In this paper, we explore the
decomposability of utility in data assemblage by formulating the independent
utility assumption. We argue that independent utility enjoys many
applications. Moreover, we identify interesting properties of independent
utility and develop fast computation techniques for the exact Shapley value
under independent utility. Our experimental results on a series of benchmark
data sets show that our new approach not only guarantees the exactness of the
Shapley value, but also achieves faster computation by orders of magnitude.
Comment: Accepted by VLDB 202
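The combinatorial bottleneck mentioned in the abstract is visible in the very definition of the Shapley value. The sketch below is a brute-force baseline over all orderings of the owners, not the paper's algorithm; the toy utility u is a hypothetical decomposable utility, used only to illustrate that when utility is a sum of independent parts, each owner's Shapley value can be read off part by part.

```python
from itertools import permutations

def shapley(players, utility):
    """Exact Shapley value by averaging each player's marginal contribution
    over all |N|! orderings -- exponential, hence the bottleneck."""
    values = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            before = utility(frozenset(coalition))
            coalition.add(p)
            values[p] += utility(frozenset(coalition)) - before
    n = len(orders)
    return {p: v / n for p, v in values.items()}

# Hypothetical utility that decomposes into independent parts: owners 1 and 2
# jointly supply one data instance worth 6; owner 3 alone supplies one worth 4.
def u(s):
    return (6 if {1, 2} <= s else 0) + (4 if 3 in s else 0)

print(shapley([1, 2, 3], u))  # {1: 3.0, 2: 3.0, 3: 4.0}
```

Because the Shapley value is additive across independent games, owners 1 and 2 split the value of their joint instance, while owner 3 receives the full value of the instance only they contribute; exploiting such decompositions is what avoids the factorial enumeration.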
Time dimension in the relational model
Call number: LD2668 .T4 CMSC 1987 C52. Master of Science, Computing and Information Science
Object-oriented data modeling
The object-oriented paradigm models local behavior and, to a lesser extent, the structure of a problem. Semantic data models describe structure and semantics. This thesis unifies the behavioral focus of the object-oriented paradigm with the structural and semantic focus of semantic data models. The approach contains expressive abstractions to model static and derived data, semantics, and behavior. The abstractions keep the data model closer to the problem domain, and can be translated into a relational (or other) implementation. The thesis makes six contributions. First, a comprehensive set of data structuring abstractions is described. Second, the abstractions are compared to the entity-relationship and relational models. Third, semantic information inherent in the functional representation of the abstractions is identified. Fourth, a set of behavioral abstractions is described. Fifth, an algorithm that describes the dynamics between mathematically derived attributes of cooperating objects is presented. Sixth, weaknesses of object-oriented programming languages are identified.
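The distinction between static and derived data, and the dependence of one object's derived attributes on those of cooperating objects, can be sketched as follows (the class names and attributes are hypothetical illustrations, not taken from the thesis):

```python
class LineItem:
    """Static data (quantity, unit_price) plus a derived attribute (cost)."""
    def __init__(self, quantity, unit_price):
        self.quantity = quantity        # static, stored
        self.unit_price = unit_price    # static, stored

    @property
    def cost(self):
        # derived: mathematically computed from stored data on demand
        return self.quantity * self.unit_price

class Order:
    """A cooperating object whose derived attribute depends on its parts."""
    def __init__(self, items):
        self.items = items

    @property
    def total(self):
        # derived from the derived attributes of cooperating LineItem objects
        return sum(item.cost for item in self.items)

order = Order([LineItem(2, 10.0), LineItem(1, 5.0)])
print(order.total)  # 25.0
```

Updating any stored attribute automatically changes the derived values on the next read, which is the kind of dynamics between derived attributes that the thesis's algorithm addresses.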
The impact of the organization of electronic documents on the interpretation of organic and recorded information in a decentralized context
Talk given at the colloquium "Le numérique : impact sur le cycle de vie du document", organized at the Université de Montréal by EBSI and ENSSIB, October 13-15, 2004. In a decentralized records-management context, the organization of electronic documents is under the direct control of employees. It is well established that employees organize these electronic documents according to highly personal criteria that are most often incomprehensible to their colleagues, making the documents difficult to locate and interpret. This talk examines in more depth the link between the way electronic documents are organized and their interpretation, before concluding with the resulting research needs and the implications of the findings for managing the document life cycle.
Histogram techniques for cost estimation in query optimization.
Yu Xiaohui. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 98-115). Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 2 --- Related Work --- p.6
Chapter 2.1 --- Query Optimization --- p.6
Chapter 2.2 --- Query Rewriting --- p.8
Chapter 2.2.1 --- Optimizing Multi-Block Queries --- p.8
Chapter 2.2.2 --- Semantic Query Optimization --- p.13
Chapter 2.2.3 --- Query Rewriting in Starburst --- p.15
Chapter 2.3 --- Plan Generation --- p.16
Chapter 2.3.1 --- Dynamic Programming Approach --- p.16
Chapter 2.3.2 --- Join Query Processing --- p.17
Chapter 2.3.3 --- Queries with Aggregates --- p.23
Chapter 2.4 --- Statistics and Cost Estimation --- p.24
Chapter 2.5 --- Histogram Techniques --- p.27
Chapter 2.5.1 --- Definitions --- p.28
Chapter 2.5.2 --- Trivial Histograms --- p.29
Chapter 2.5.3 --- Heuristic-based Histograms --- p.29
Chapter 2.5.4 --- V-Optimal Histograms --- p.32
Chapter 2.5.5 --- Wavelet-based Histograms --- p.35
Chapter 2.5.6 --- Multidimensional Histograms --- p.35
Chapter 2.5.7 --- Global Histograms --- p.37
Chapter 3 --- New Histogram Techniques --- p.39
Chapter 3.1 --- Piecewise Linear Histograms --- p.39
Chapter 3.1.1 --- Construction --- p.41
Chapter 3.1.2 --- Usage --- p.43
Chapter 3.1.3 --- Error Measures --- p.43
Chapter 3.1.4 --- Experiments --- p.45
Chapter 3.1.5 --- Conclusion --- p.51
Chapter 3.2 --- A-Optimal Histograms --- p.54
Chapter 3.2.1 --- A-Optimal(mean) Histograms --- p.56
Chapter 3.2.2 --- A-Optimal(median) Histograms --- p.58
Chapter 3.2.3 --- A-Optimal(median-cf) Histograms --- p.59
Chapter 3.2.4 --- Experiments --- p.60
Chapter 4 --- Global Histograms --- p.64
Chapter 4.1 --- Wavelet-based Global Histograms --- p.65
Chapter 4.1.1 --- Wavelet-based Global Histograms I --- p.66
Chapter 4.1.2 --- Wavelet-based Global Histograms II --- p.68
Chapter 4.2 --- Piecewise Linear Global Histograms --- p.70
Chapter 4.3 --- A-Optimal Global Histograms --- p.72
Chapter 4.3.1 --- Experiments --- p.74
Chapter 5 --- Dynamic Maintenance --- p.81
Chapter 5.1 --- Problem Definition --- p.83
Chapter 5.2 --- Refining Bucket Coefficients --- p.84
Chapter 5.3 --- Restructuring --- p.86
Chapter 5.4 --- Experiments --- p.91
Chapter 6 --- Conclusions --- p.95
Bibliography --- p.9
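As background to the histogram families catalogued in this thesis, a minimal equi-depth histogram and the uniform-within-bucket selectivity estimate it supports can be sketched as follows (illustrative Python over integer data, not the thesis's algorithms):

```python
def equi_depth_histogram(values, buckets):
    """Build an equi-depth histogram: each bucket covers roughly
    n/buckets values. Returns per-bucket (low, high) bounds and counts."""
    data = sorted(values)
    n = len(data)
    bounds, counts = [], []
    for i in range(buckets):
        lo = i * n // buckets
        hi = (i + 1) * n // buckets
        bounds.append((data[lo], data[hi - 1]))
        counts.append(hi - lo)
    return bounds, counts

def estimate_selectivity(bounds, counts, pred_hi, n):
    """Estimate the selectivity of 'value <= pred_hi', assuming a uniform
    distribution of integer values within each bucket (the standard
    histogram assumption behind cost estimation)."""
    sel = 0.0
    for (lo, hi), c in zip(bounds, counts):
        if pred_hi >= hi:
            sel += c                                   # bucket fully covered
        elif pred_hi >= lo:
            sel += c * (pred_hi - lo + 1) / (hi - lo + 1)  # partial bucket
    return sel / n

data = list(range(1, 101))               # 100 distinct values 1..100
bounds, counts = equi_depth_histogram(data, 4)
print(estimate_selectivity(bounds, counts, 50, len(data)))  # 0.5
```

The optimizer multiplies such per-predicate selectivities into cardinality estimates, which is why the choice of bucketing scheme (equi-depth, V-optimal, wavelet-based, and so on) directly affects plan quality.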
Databases and Artificial Intelligence
This chapter presents some noteworthy works which show the links between Databases and Artificial Intelligence. More precisely, after an introduction, Sect. 2 presents the seminal work on "logic and databases", which opened a wide research field at the intersection of databases and artificial intelligence. The main results concern the use of logic for database modeling. Then, in Sect. 3, we present different problems raised by integrity constraints and the way logic contributed to formalizing and solving them. In Sect. 4, we sum up some works related to queries with preferences. Section 5 finally focuses on the problem of database integration.
Using crowdsourced geospatial data to aid in nuclear proliferation monitoring
In 2014, a Defense Science Board Task Force was convened to assess and explore new technologies that would aid in nuclear proliferation monitoring. One of its recommendations was for the Director of National Intelligence to explore ways that crowdsourced geospatial imagery technologies could aid existing governmental efforts. Our research builds directly on this recommendation and provides feedback on some of the most successful examples of crowdsourced geospatial data (CGD). As of 2016, Special Operations Command (SOCOM) has assumed the new role of primary U.S. agency responsible for counter-proliferation. Historically, this institution has always relied upon other organizations for the execution of its myriad mission sets. SOCOM's unique ability to build relationships makes it particularly suited to the task of harnessing CGD technologies and employing them in the capacity that our research recommends. Furthermore, CGD is a low-cost, high-impact tool that is already being employed by commercial companies and non-profit groups around the world. By employing CGD, a wider whole-of-government effort can be created that provides a long-term, cohesive engagement plan for facilitating a multi-faceted nuclear proliferation monitoring process.
http://archive.org/details/usingcrowdsource1094551570
Major, United States Army
Approved for public release; distribution is unlimited
Disjunctively incomplete information in relational databases: modeling and related issues
In this dissertation, the issues related to information incompleteness in relational databases are explored. The dissertation can be divided into two parts. The first part extends the relational natural join operator and the update operations of insertion and deletion to I-tables, an extended relational model representing inclusively indefinite and maybe information, in a semantically correct manner. Rudimentary or naive algorithms for computing natural joins on I-tables require an exponential number of pair-up operations, and a number of block accesses proportional to the size of the I-tables, due to the combinatorial nature of natural joins on I-tables; the problem thus becomes intractable for large I-tables. An algorithm for computing natural joins under the extended model is proposed that reduces the number of pair-up operations to a linear order of complexity in general, and to a polynomial order of complexity in the worst case, with respect to the size of the I-tables. In addition, this algorithm also reduces the number of block accesses to a linear order of complexity with respect to the size of the I-tables.
The second part concerns the modeling aspect of incomplete databases. An extended relational model, called the E-table, is proposed. The E-table is capable of representing exclusively disjunctive information, that is, disjunctions of the form P1 ‖ P2 ‖ ··· ‖ Pn, where ‖ denotes a generalized logical exclusive or indicating that exactly one of the Pi's can be true. The information content of an E-table is precisely defined, and the relational operators of selection, projection, difference, union, intersection, and Cartesian product are extended to E-tables in a semantically correct manner. Conditions under which redundancies could arise due to the presence of exclusively disjunctive information are characterized, and a procedure for resolving such redundancies is presented. Finally, the dissertation concludes with a discussion of directions for further research in the area of incomplete information modeling. In particular, a sketch of a relational model, the IE-table (Inclusive and Exclusive table), for representing both inclusively and exclusively disjunctive information is provided.
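The possible-worlds reading of exclusive disjunction behind the E-table can be sketched as follows (illustrative Python, not the dissertation's formalism): each disjunctive tuple contributes exactly one of its alternatives to a world, and a fact is a certain answer only if it holds in every world.

```python
from itertools import product

def possible_worlds(e_table):
    """An E-table is modeled here as a list of alternative-sets; exactly one
    alternative per set is true, so each world picks one from every set."""
    return [set(choice) for choice in product(*e_table)]

def certain_answers(e_table, predicate):
    """Tuples that satisfy the predicate in EVERY possible world."""
    worlds = possible_worlds(e_table)
    return set.intersection(*({t for t in w if predicate(t)} for w in worlds))

# Alice's office is 101 or 102 (exclusively one); Bob's office is known.
e = [
    {("Alice", 101), ("Alice", 102)},
    {("Bob", 103)},
]
print(sorted(certain_answers(e, lambda t: t[1] >= 100)))  # [('Bob', 103)]
```

Although every world contains some fact about Alice, no single Alice tuple holds in all worlds, so only Bob's tuple is certain; extending the relational operators to E-tables amounts to computing such answers without materializing the exponentially many worlds.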