Complexity of coalition structure generation
We revisit the coalition structure generation problem in which the goal is to
partition the players into exhaustive and disjoint coalitions so as to maximize
the social welfare. One of our key results is a general polynomial-time
algorithm to solve the problem for all coalitional games provided that player
types are known and the number of player types is bounded by a constant. As a
corollary, we obtain a polynomial-time algorithm to compute an optimal
partition for weighted voting games with a constant number of weight values and
for coalitional skill games with a constant number of skills. We also consider
well-studied and well-motivated coalitional games defined compactly on
combinatorial domains. For these games, we characterize the complexity of
computing an optimal coalition structure by presenting polynomial-time
algorithms, approximation algorithms, or NP-hardness and inapproximability
lower bounds.
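As a rough illustration of the problem these algorithms address (not the paper's polynomial-time method, which exploits bounded player types), the sketch below enumerates every coalition structure of a small player set by brute force and returns one with maximum social welfare. The characteristic function v is a toy example; the enumeration is exponential in the number of players.

```python
def partitions(players):
    """Enumerate all coalition structures (set partitions) of the players."""
    players = list(players)
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    for sub in partitions(rest):
        # Put `first` into each existing coalition in turn...
        for i, coalition in enumerate(sub):
            yield sub[:i] + [coalition | {first}] + sub[i + 1:]
        # ...or give `first` a singleton coalition of its own.
        yield sub + [{first}]

def optimal_structure(players, v):
    """Return a coalition structure maximizing the social welfare sum of v(C)."""
    return max(partitions(players), key=lambda cs: sum(v(c) for c in cs))

# Toy characteristic function: pairs are worth 3, any other coalition its size.
v = lambda c: 3 if len(c) == 2 else len(c)
best = optimal_structure({1, 2, 3, 4}, v)
print(best, sum(v(c) for c in best))  # two pairs, welfare 6
```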
Shapley Meets Shapley
This paper concerns the analysis of the Shapley value in matching games.
Matching games constitute a fundamental class of cooperative games that help us
understand and model auctions and assignments. In a matching game, the value of
a coalition of vertices is the weight of the maximum size matching in the
subgraph induced by the coalition. The Shapley value is one of the most
important solution concepts in cooperative game theory.
After establishing some general insights, we show that the Shapley value of
matching games can be computed in polynomial time for some special cases:
graphs with maximum degree two, and graphs that have a small modular
decomposition into cliques or cocliques (complete k-partite graphs are a
notable special case of this). The latter result extends to various other
well-known classes of graph-based cooperative games.
We continue by showing that computing the Shapley value of unweighted
matching games is #P-complete in general. Finally, a fully polynomial-time
randomized approximation scheme (FPRAS) is presented. This FPRAS can be
considered the best positive result conceivable, in view of the #P-completeness
result.
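The sketch below illustrates the sampling idea behind such randomized approximation schemes: estimate a vertex's Shapley value by averaging its marginal contribution over random vertex orderings, where a coalition's value is the size of a maximum matching in the induced subgraph. This is a generic Monte Carlo estimator, not necessarily the paper's FPRAS, and it assumes the third-party networkx library for computing maximum matchings.

```python
import random
import networkx as nx

def matching_value(G, coalition):
    """Value of a coalition: size of a maximum matching in the induced subgraph."""
    return len(nx.max_weight_matching(G.subgraph(coalition), maxcardinality=True))

def shapley_estimate(G, player, samples=2000):
    """Monte Carlo estimate of a vertex's Shapley value in the matching game:
    average the player's marginal contribution over random vertex orderings."""
    vertices = list(G.nodes)
    total = 0.0
    for _ in range(samples):
        order = random.sample(vertices, len(vertices))  # random permutation
        before = order[:order.index(player)]
        total += matching_value(G, before + [player]) - matching_value(G, before)
    return total / samples

# Path on three vertices (edges 0-1 and 1-2): the middle vertex is pivotal
# in four of the six orderings, so its Shapley value is 4/6.
G = nx.path_graph(3)
print(round(shapley_estimate(G, 1), 2))  # ≈ 0.67
```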
Qualitative Effects of Knowledge Rules in Probabilistic Data Integration
One of the problems in data integration is data overlap: the fact that different data sources contain data on the same real-world entities. Much development time in data integration projects is devoted to entity resolution. Advanced similarity measurement techniques are often used to remove semantic duplicates from the integration result or to resolve other semantic conflicts, but it proves impossible to get rid of all semantic problems in data integration. An often-used rule of thumb states that about 90% of the development effort is devoted to solving the remaining 10% of hard cases. In an attempt to significantly decrease human effort at data integration time, we have proposed an approach that stores any remaining semantic uncertainty and conflicts in a probabilistic database, so that the integration result can already be meaningfully used. The main development effort in our approach is devoted to defining and tuning knowledge rules and thresholds. Rules and thresholds directly impact the size and quality of the integration result. We measure integration quality indirectly by measuring the quality of answers to queries on the integrated data set in an information-retrieval-like way. The main contribution of this report is an experimental investigation of the effects and sensitivity of rule definition and threshold tuning on the integration quality. The results show that our approach indeed reduces development effort, rather than merely shifting it to rule definition and threshold tuning: setting rough safe thresholds and defining only a few rules suffices to produce a 'good enough' integration that can be meaningfully used.
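A minimal sketch of what threshold-based knowledge rules might look like: above an upper threshold two records are merged as a certain match, below a lower threshold they are kept apart, and the grey zone in between is stored as a probabilistic alternative. The thresholds T_SURE and T_DROP and the string-similarity measure are hypothetical stand-ins, not the report's actual rules.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds: above T_SURE we merge, below T_DROP we keep apart,
# and the grey zone in between enters the probabilistic database as two worlds.
T_SURE, T_DROP = 0.9, 0.5

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def integrate(rec_a, rec_b):
    """Decide how two records enter the integrated (probabilistic) database."""
    s = similarity(rec_a, rec_b)
    if s >= T_SURE:
        return [("same entity", 1.0)]
    if s < T_DROP:
        return [("distinct entities", 1.0)]
    # Grey zone: keep both alternatives, weighted by the similarity score.
    return [("same entity", s), ("distinct entities", 1.0 - s)]

print(integrate("Paris Hilton", "paris hilton"))  # certain match
print(integrate("P. Hilton", "Paris Hilton"))     # both alternatives kept
print(integrate("Paris", "Amsterdam"))            # certain non-match
```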
IMPrECISE: Good-is-good-enough data integration
IMPrECISE is an XQuery module that adds probabilistic XML functionality to an existing XML DBMS, in our case MonetDB/XQuery. We demonstrate the probabilistic XML and data integration functionality of IMPrECISE. The prototype is configurable with domain knowledge such that the amount of uncertainty arising during data integration is reduced to an acceptable level, thus obtaining a "good is good enough" data integration with minimal human effort.
Rule-based information integration
In this report, we describe the process of information integration and specifically discuss the language used for it. We show that integration consists of two phases: the schema mapping phase and the data integration phase. We formally define transformation rules, conversion, evolution, and versioning, and we further discuss the integration process from a data point of view.
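A minimal sketch of a transformation rule in the spirit described above: each rule maps a source field to a target field via a conversion function. The field names and conversion functions are hypothetical, not the report's formal rule language.

```python
# Each rule: (source field, target field, conversion function).
rules = [
    ("fullname", "name",  str.title),
    ("dob",      "birth", lambda s: s.replace("/", "-")),
]

def transform(source_record, rules):
    """Apply transformation rules to map a source tuple into the target schema."""
    return {target: convert(source_record[src]) for src, target, convert in rules}

print(transform({"fullname": "jan de vries", "dob": "01/02/1980"}, rules))
# -> {'name': 'Jan De Vries', 'birth': '01-02-1980'}
```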
Quality Measures in Uncertain Data Management
Many applications deal with uncertain data; examples include applications processing sensor information, data integration applications, and healthcare applications. Rather than leaving each application to handle this uncertainty itself, it should be the responsibility of the DBMS to manage all data, including uncertain data. Several projects conduct research on this topic. In this paper, we introduce four measures for assessing and comparing important characteristics of data and systems.
A probabilistic database extension
Data exchange between embedded systems and other small or large computing devices is increasing. Since data in different data sources may refer to the same real-world objects, the data cannot simply be merged. Furthermore, in many situations, conflicts in data about the same real-world objects need to be resolved without interference from a user. In this report, we describe an attempt to make an RDBMS probabilistic, i.e., the data in a relation represents all possible views on the real world, in order to achieve unattended data integration. We define a probabilistic relational data model and review standard SQL query primitives in the light of probabilistic data. It appears that thinking in terms of 'possible worlds' is powerful in determining the proper semantics of these query primitives.
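The possible-worlds view can be made concrete with a small sketch (an illustration, not the report's actual data model): each real-world object carries mutually exclusive tuple alternatives, a world picks one alternative per object, and a selection query is answered by summing the probabilities of the worlds in which it holds.

```python
from itertools import product

# A probabilistic relation: each real-world object has mutually exclusive
# alternative tuples, each with a probability (alternatives sum to 1).
person = {
    "o1": [({"name": "Alice", "city": "Delft"},  0.7),
           ({"name": "Alice", "city": "Leiden"}, 0.3)],
    "o2": [({"name": "Bob",   "city": "Delft"},  1.0)],
}

def possible_worlds(rel):
    """Each world picks one alternative per object; its probability is the product."""
    for choice in product(*rel.values()):
        tuples = [t for t, _ in choice]
        p = 1.0
        for _, q in choice:
            p *= q
        yield tuples, p

def answer_probability(rel, predicate):
    """P(some tuple satisfies the predicate), i.e. the possible-worlds
    semantics of a SQL SELECT ... WHERE over the probabilistic relation."""
    return sum(p for world, p in possible_worlds(rel)
               if any(predicate(t) for t in world))

print(answer_probability(person, lambda t: t["city"] == "Leiden"))  # 0.3
```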
Information Integration - the process of integration, evolution and versioning
At present, many information sources are available wherever you are, and most of the time the information needed is spread across several of them. Gathering this information is a tedious and time-consuming job, so automating the process would assist the user in this task. Integration of the information sources provides a global information source in which all the information needed is present. All of these information sources also change over time, and with each change of an information source its schema can change as well. The data contained in the information source, however, cannot be converted with every change, due to the huge amount of data that would have to be transformed to conform to the most recent schema.
In this report, we describe the current methods for information integration, evolution, and versioning. We distinguish between integration of schemas and integration of the actual data, and we show some key issues in integrating XML data sources.
User Feedback in Probabilistic XML
Data integration is a challenging problem in many application areas. Most approaches attempt to resolve semantic uncertainty and conflicts between information sources as part of the data integration process. In some application areas this is impractical or even prohibitive, for example, in an ambient environment where devices have to exchange information autonomously on an ad hoc basis. We have proposed a probabilistic XML approach that allows data integration without user involvement by storing semantic uncertainty and conflicts in the integrated XML data. As a consequence, the integrated information source represents all possible appearances of objects in the real world, the so-called possible worlds.

In this paper, we show how user feedback on query results can resolve semantic uncertainty and conflicts in the integrated data. Hence, user involvement is effectively postponed to query time, when a user is already interacting actively with the system. The technique relates positive and negative statements on query answers to the possible worlds of the information source, thereby either reinforcing, penalizing, or eliminating possible worlds. We show that after repeated user feedback, an integrated information source better resembles the real world and may converge towards a non-probabilistic information source.
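A minimal sketch of how such feedback might rescale possible-world probabilities: positive feedback on an answer eliminates the worlds that lack it, negative feedback down-weights (or eliminates) the worlds that contain it, and probabilities are renormalized. The penalty scheme is an illustrative assumption, not the paper's exact update rule.

```python
def apply_feedback(worlds, answer, positive, penalty=0.5):
    """Rescale possible-world probabilities after feedback on one query answer.

    `worlds` is a list of (set_of_tuples, probability) pairs. Positive
    feedback keeps only worlds containing the answer; negative feedback
    multiplies worlds containing it by `penalty` (0 eliminates them).
    """
    updated = []
    for tuples, p in worlds:
        contains = answer in tuples
        if positive:
            w = p if contains else 0.0          # reinforce consistent worlds
        else:
            w = p * penalty if contains else p  # penalize rejected worlds
        if w > 0:
            updated.append((tuples, w))
    total = sum(w for _, w in updated)
    return [(t, w / total) for t, w in updated]

worlds = [({"Alice@Delft"}, 0.7), ({"Alice@Leiden"}, 0.3)]
print(apply_feedback(worlds, "Alice@Leiden", positive=False, penalty=0.0))
# -> [({'Alice@Delft'}, 1.0)]: repeated feedback converges to a single,
# non-probabilistic world, as the abstract describes.
```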