Complexity of coalition structure generation
We revisit the coalition structure generation problem in which the goal is to
partition the players into exhaustive and disjoint coalitions so as to maximize
the social welfare. One of our key results is a general polynomial-time
algorithm to solve the problem for all coalitional games provided that player
types are known and the number of player types is bounded by a constant. As a
corollary, we obtain a polynomial-time algorithm to compute an optimal
partition for weighted voting games with a constant number of weight values and
for coalitional skill games with a constant number of skills. We also consider
well-studied and well-motivated coalitional games defined compactly on
combinatorial domains. For these games, we characterize the complexity of
computing an optimal coalition structure by presenting polynomial-time
algorithms, approximation algorithms, or NP-hardness and inapproximability
lower bounds.
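As a rough illustration of the problem these algorithms address (not the paper's polynomial-time method, which exploits bounded player types), the sketch below enumerates every coalition structure of a small player set by brute force and returns one with maximum social welfare. The characteristic function v is a toy example; the enumeration is exponential in the number of players.

```python
def partitions(players):
    """Enumerate all coalition structures (set partitions) of the players."""
    players = list(players)
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    for sub in partitions(rest):
        # Put `first` into each existing coalition in turn...
        for i, coalition in enumerate(sub):
            yield sub[:i] + [coalition | {first}] + sub[i + 1:]
        # ...or give `first` a singleton coalition of its own.
        yield sub + [{first}]

def optimal_structure(players, v):
    """Return a coalition structure maximizing the social welfare sum of v(C)."""
    return max(partitions(players), key=lambda cs: sum(v(c) for c in cs))

# Toy characteristic function: pairs are worth 3, any other coalition its size.
v = lambda c: 3 if len(c) == 2 else len(c)
best = optimal_structure({1, 2, 3, 4}, v)
print(best, sum(v(c) for c in best))  # two pairs, welfare 6
```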
Shapley Meets Shapley
This paper concerns the analysis of the Shapley value in matching games.
Matching games constitute a fundamental class of cooperative games that help us
understand and model auctions and assignments. In a matching game, the value of
a coalition of vertices is the weight of the maximum size matching in the
subgraph induced by the coalition. The Shapley value is one of the most
important solution concepts in cooperative game theory.
After establishing some general insights, we show that the Shapley value of
matching games can be computed in polynomial time for some special cases:
graphs with maximum degree two, and graphs that have a small modular
decomposition into cliques or cocliques (complete k-partite graphs are a
notable special case of this). The latter result extends to various other
well-known classes of graph-based cooperative games.
We continue by showing that computing the Shapley value of unweighted
matching games is #P-complete in general. Finally, a fully polynomial-time
randomized approximation scheme (FPRAS) is presented. This FPRAS can be
considered the best positive result conceivable, in view of the #P-completeness
result.
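The sketch below illustrates the sampling idea behind such randomized approximation schemes: estimate a vertex's Shapley value by averaging its marginal contribution over random vertex orderings, where a coalition's value is the size of a maximum matching in the induced subgraph. This is a generic Monte Carlo estimator, not necessarily the paper's FPRAS, and it assumes the third-party networkx library for computing maximum matchings.

```python
import random
import networkx as nx

def matching_value(G, coalition):
    """Value of a coalition: size of a maximum matching in the induced subgraph."""
    return len(nx.max_weight_matching(G.subgraph(coalition), maxcardinality=True))

def shapley_estimate(G, player, samples=2000):
    """Monte Carlo estimate of a vertex's Shapley value in the matching game:
    average the player's marginal contribution over random vertex orderings."""
    vertices = list(G.nodes)
    total = 0.0
    for _ in range(samples):
        order = random.sample(vertices, len(vertices))  # random permutation
        before = order[:order.index(player)]
        total += matching_value(G, before + [player]) - matching_value(G, before)
    return total / samples

# Path on three vertices (edges 0-1 and 1-2): the middle vertex is pivotal
# in four of the six orderings, so its Shapley value is 4/6.
G = nx.path_graph(3)
print(round(shapley_estimate(G, 1), 2))  # ≈ 0.67
```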
Qualitative Effects of Knowledge Rules in Probabilistic Data Integration
One of the problems in data integration is data overlap: the fact that different data sources contain data on the same real-world entities. Much development time in data integration projects is devoted to entity resolution. Advanced similarity measurement techniques are often used to remove semantic duplicates from the integration result or to resolve other semantic conflicts, but it proves impossible to get rid of all semantic problems in data integration. An often-used rule of thumb states that about 90% of the development effort is devoted to solving the remaining 10% of hard cases. In an attempt to significantly decrease human effort at data integration time, we have proposed an approach that stores any remaining semantic uncertainty and conflicts in a probabilistic database, so that the integration result can already be meaningfully used. The main development effort in our approach is devoted to defining and tuning knowledge rules and thresholds. Rules and thresholds directly impact the size and quality of the integration result. We measure integration quality indirectly by measuring the quality of answers to queries on the integrated data set in an information-retrieval-like way. The main contribution of this report is an experimental investigation of the effects and sensitivity of rule definition and threshold tuning on the integration quality. The results show that our approach indeed reduces development effort, rather than merely shifting it to rule definition and threshold tuning: setting rough safe thresholds and defining only a few rules suffices to produce a 'good enough' integration that can be meaningfully used.
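A minimal sketch of what threshold-based knowledge rules might look like: above an upper threshold two records are merged as a certain match, below a lower threshold they are kept apart, and the grey zone in between is stored as a probabilistic alternative. The thresholds T_SURE and T_DROP and the string-similarity measure are hypothetical stand-ins, not the report's actual rules.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds: above T_SURE we merge, below T_DROP we keep apart,
# and the grey zone in between enters the probabilistic database as two worlds.
T_SURE, T_DROP = 0.9, 0.5

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def integrate(rec_a, rec_b):
    """Decide how two records enter the integrated (probabilistic) database."""
    s = similarity(rec_a, rec_b)
    if s >= T_SURE:
        return [("same entity", 1.0)]
    if s < T_DROP:
        return [("distinct entities", 1.0)]
    # Grey zone: keep both alternatives, weighted by the similarity score.
    return [("same entity", s), ("distinct entities", 1.0 - s)]

print(integrate("Paris Hilton", "paris hilton"))  # certain match
print(integrate("P. Hilton", "Paris Hilton"))     # both alternatives kept
print(integrate("Paris", "Amsterdam"))            # certain non-match
```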
IMPrECISE: Good-is-good-enough data integration
IMPrECISE is an XQuery module that adds probabilistic XML functionality to an existing XML DBMS, in our case MonetDB/XQuery. We demonstrate the probabilistic XML and data integration functionality of IMPrECISE. The prototype is configurable with domain knowledge such that the amount of uncertainty arising during data integration is reduced to an acceptable level, thus obtaining a "good is good enough" data integration with minimal human effort.
Rule-based information integration
In this report, we describe the process of information integration and specifically discuss the language used for it. We show that integration consists of two phases: the schema mapping phase and the data integration phase. We formally define transformation rules, conversion, evolution, and versioning, and we further discuss the integration process from a data point of view.
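A minimal sketch of a transformation rule in the spirit described above: each rule maps a source field to a target field via a conversion function. The field names and conversion functions are hypothetical, not the report's formal rule language.

```python
# Each rule: (source field, target field, conversion function).
rules = [
    ("fullname", "name",  str.title),
    ("dob",      "birth", lambda s: s.replace("/", "-")),
]

def transform(source_record, rules):
    """Apply transformation rules to map a source tuple into the target schema."""
    return {target: convert(source_record[src]) for src, target, convert in rules}

print(transform({"fullname": "jan de vries", "dob": "01/02/1980"}, rules))
# -> {'name': 'Jan De Vries', 'birth': '01-02-1980'}
```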
Quality Measures in Uncertain Data Management
Many applications deal with uncertain data; examples include applications processing sensor information, data integration applications, and healthcare applications. Rather than leaving each application to handle this uncertainty itself, it should be the responsibility of the DBMS to manage all data, including uncertain data. Several projects conduct research on this topic. In this paper, we introduce four measures for assessing and comparing important characteristics of data and systems.
A probabilistic database extension
Data exchange between embedded systems and other small or large computing devices is increasing. Since data in different data sources may refer to the same real-world objects, the data cannot simply be merged. Furthermore, in many situations, conflicts in data about the same real-world objects need to be resolved without interference from a user. In this report, we describe an attempt to make an RDBMS probabilistic, i.e., the data in a relation represents all possible views on the real world, in order to achieve unattended data integration. We define a probabilistic relational data model and review standard SQL query primitives in the light of probabilistic data. It appears that thinking in terms of 'possible worlds' is powerful in determining the proper semantics of these query primitives.
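The possible-worlds view can be made concrete with a small sketch (an illustration, not the report's actual data model): each real-world object carries mutually exclusive tuple alternatives, a world picks one alternative per object, and a selection query is answered by summing the probabilities of the worlds in which it holds.

```python
from itertools import product

# A probabilistic relation: each real-world object has mutually exclusive
# alternative tuples, each with a probability (alternatives sum to 1).
person = {
    "o1": [({"name": "Alice", "city": "Delft"},  0.7),
           ({"name": "Alice", "city": "Leiden"}, 0.3)],
    "o2": [({"name": "Bob",   "city": "Delft"},  1.0)],
}

def possible_worlds(rel):
    """Each world picks one alternative per object; its probability is the product."""
    for choice in product(*rel.values()):
        tuples = [t for t, _ in choice]
        p = 1.0
        for _, q in choice:
            p *= q
        yield tuples, p

def answer_probability(rel, predicate):
    """P(some tuple satisfies the predicate), i.e. the possible-worlds
    semantics of a SQL SELECT ... WHERE over the probabilistic relation."""
    return sum(p for world, p in possible_worlds(rel)
               if any(predicate(t) for t in world))

print(answer_probability(person, lambda t: t["city"] == "Leiden"))  # 0.3
```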
Information Integration - the process of integration, evolution and versioning
At present, many information sources are available wherever you are, and most of the time the information needed is spread across several of them. Gathering this information is a tedious and time-consuming job, so automating the process would assist the user in this task. Integration of the information sources provides a global information source in which all the information needed is present. All of these information sources also change over time, and with each change of an information source its schema can change as well. The data contained in the information source, however, cannot be converted with every change, due to the huge amount of data that would have to be transformed to conform to the most recent schema.
In this report, we describe the current methods for information integration, evolution, and versioning. We distinguish between integration of schemas and integration of the actual data, and we show some key issues in integrating XML data sources.
User Feedback in Probabilistic XML
Data integration is a challenging problem in many application areas. Most approaches attempt to resolve semantic uncertainty and conflicts between information sources as part of the data integration process. In some application areas this is impractical or even prohibitive, for example, in an ambient environment where devices have to exchange information autonomously on an ad hoc basis. We have proposed a probabilistic XML approach that allows data integration without user involvement by storing semantic uncertainty and conflicts in the integrated XML data. As a consequence, the integrated information source represents all possible appearances of objects in the real world, the so-called possible worlds.

In this paper, we show how user feedback on query results can resolve semantic uncertainty and conflicts in the integrated data. Hence, user involvement is effectively postponed to query time, when a user is already interacting actively with the system. The technique relates positive and negative statements on query answers to the possible worlds of the information source, thereby either reinforcing, penalizing, or eliminating possible worlds. We show that after repeated user feedback, an integrated information source better resembles the real world and may converge towards a non-probabilistic information source.
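A minimal sketch of how such feedback might rescale possible-world probabilities: positive feedback on an answer eliminates the worlds that lack it, negative feedback down-weights (or eliminates) the worlds that contain it, and probabilities are renormalized. The penalty scheme is an illustrative assumption, not the paper's exact update rule.

```python
def apply_feedback(worlds, answer, positive, penalty=0.5):
    """Rescale possible-world probabilities after feedback on one query answer.

    `worlds` is a list of (set_of_tuples, probability) pairs. Positive
    feedback keeps only worlds containing the answer; negative feedback
    multiplies worlds containing it by `penalty` (0 eliminates them).
    """
    updated = []
    for tuples, p in worlds:
        contains = answer in tuples
        if positive:
            w = p if contains else 0.0          # reinforce consistent worlds
        else:
            w = p * penalty if contains else p  # penalize rejected worlds
        if w > 0:
            updated.append((tuples, w))
    total = sum(w for _, w in updated)
    return [(t, w / total) for t, w in updated]

worlds = [({"Alice@Delft"}, 0.7), ({"Alice@Leiden"}, 0.3)]
print(apply_feedback(worlds, "Alice@Leiden", positive=False, penalty=0.0))
# -> [({'Alice@Delft'}, 1.0)]: repeated feedback converges to a single,
# non-probabilistic world, as the abstract describes.
```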