User Feedback in Probabilistic XML
Data integration is a challenging problem in many application areas. Approaches mostly attempt to resolve semantic uncertainty and conflicts between information sources as part of the data integration process. In some application areas, this is impractical or even prohibitive, for example, in an ambient environment where devices on an ad hoc basis have to exchange information autonomously. We have proposed a probabilistic XML approach that allows data integration without user involvement by storing semantic uncertainty and conflicts in the integrated XML data. As a consequence, the integrated information source represents all possible appearances of objects in the real world, the so-called possible worlds.

In this paper, we show how user feedback on query results can resolve semantic uncertainty and conflicts in the integrated data. Hence, user involvement is effectively postponed to query time, when a user is already interacting actively with the system. The technique relates positive and negative statements on query answers to the possible worlds of the information source, thereby either reinforcing, penalizing, or eliminating possible worlds. We show that after repeated user feedback, an integrated information source better resembles the real world and may converge towards a non-probabilistic information source.
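As a rough illustration of how feedback on a query answer can eliminate possible worlds, the sketch below conditions a small set of worlds on one user statement and renormalizes the remaining probabilities. It is not the authors' algorithm: the representation of a world as a set of facts, the function name `apply_feedback`, and the example data are all assumptions made for this sketch, and it covers only hard elimination, not the reinforcing or penalizing of worlds mentioned in the abstract.

```python
from fractions import Fraction


def apply_feedback(worlds, answer, positive):
    """Condition possible worlds on one user statement (hypothetical helper).

    worlds   -- dict mapping a frozenset of facts to its probability
    answer   -- a fact the user commented on
    positive -- True if the user confirmed the fact, False if rejected
    """
    # Keep only the worlds that agree with the user's statement.
    kept = {w: p for w, p in worlds.items() if (answer in w) == positive}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("feedback contradicts every possible world")
    # Renormalize so the surviving probabilities sum to 1 again.
    return {w: p / total for w, p in kept.items()}


# Illustrative data: three possible worlds about one person's phone number.
worlds = {
    frozenset({"John:1111"}): Fraction(1, 2),
    frozenset({"John:2222"}): Fraction(3, 10),
    frozenset({"John:1111", "John:2222"}): Fraction(1, 5),
}

# The user rejects "John:2222"; only the first world survives.
after_negative = apply_feedback(worlds, "John:2222", positive=False)

# Alternatively, the user confirms "John:1111"; two worlds survive.
after_positive = apply_feedback(worlds, "John:1111", positive=True)
```

In this toy case a single negative statement leaves one world with probability 1, which corresponds to the convergence towards a non-probabilistic source described above.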
Fast and Simple Relational Processing of Uncertain Data
This paper introduces U-relations, a succinct and purely relational
representation system for uncertain databases. U-relations support
attribute-level uncertainty using vertical partitioning. If we consider
positive relational algebra extended by an operation for computing possible
answers, a query on the logical level can be translated into, and evaluated as,
a single relational algebra query on the U-relation representation. The
translation scheme essentially preserves the size of the query in terms of
number of operations and, in particular, number of joins. Standard techniques
employed in off-the-shelf relational database management systems are effective
for optimizing and processing queries on U-relations. In our experiments we
show that query evaluation on U-relations scales to large amounts of data with
high degrees of uncertainty.
Comment: 12 pages, 14 figures
Term-Specific Eigenvector-Centrality in Multi-Relation Networks
Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, eigenvectors representing the distribution of a term after propagating term weights between related data items are computed. The result is an index which takes the document structure into account and can be used with standard document retrieval techniques. As the scheme takes the shape of an index transformation, all necessary calculations are performed during index time.
Comparison of Gaussian ARTMAP and the EM Algorithm
Gaussian ARTMAP (GAM) is a supervised-learning adaptive resonance theory (ART) network that uses Gaussian-defined receptive fields. Like other ART networks, GAM incrementally learns and constructs a representation of sufficient complexity to solve a problem it is trained on. GAM's representation is a Gaussian mixture model of the input space, with learned mappings from the mixture components to output classes. We show a close relationship between GAM and the well-known Expectation-Maximization (EM) approach to mixture-modeling. GAM outperforms an EM classification algorithm on a classification benchmark, thereby demonstrating the advantage of the ART match criterion for regulating learning, and the ARTMAP match tracking operation for incorporating environmental feedback in supervised learning situations.
Office of Naval Research (N00014-95-1-0409)
Evolutionary constraints on the complexity of genetic regulatory networks allow predictions of the total number of genetic interactions
Genetic regulatory networks (GRNs) have been widely studied, yet there is a
lack of understanding with regard to the final size and properties of these
networks, mainly because no network is currently complete. In this study, we
analyzed the distribution of GRN structural properties across a large set of
distinct prokaryotic organisms and found a set of constrained characteristics
such as network density and number of regulators. Our results allowed us to
estimate the number of interactions that complete networks would have, a
valuable insight that could aid in the daunting task of network curation,
prediction, and validation. Using state-of-the-art statistical approaches, we
also provided new evidence to settle a previously stated controversy that
raised the possibility of complete biological networks being random and
therefore attributing the observed scale-free properties to an artifact
emerging from the sampling process during network discovery. Furthermore, we
identified a set of properties that enabled us to assess the consistency of the
connectivity distribution for various GRNs against different alternative
statistical distributions. Our results favor the hypothesis that highly
connected nodes (hubs) are not a consequence of network incompleteness.
Finally, an interaction coverage computed for the GRNs as a proxy for
completeness revealed that high-throughput based reconstructions of GRNs could
yield biased networks with a low average clustering coefficient, showing that
classical targeted discovery of interactions is still needed.
Comment: 28 pages, 5 figures, 12 pages supplementary information