481 research outputs found

Qualitative Effects of Knowledge Rules in Probabilistic Data Integration

Author: Keijzer A. de
Keulen M. van
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2008

One of the problems in data integration is data overlap: the fact that different data sources have data on the same real-world entities. Much development time in data integration projects is devoted to entity resolution. Advanced similarity measurement techniques are often used to remove semantic duplicates from the integration result or to solve other semantic conflicts, but it proves impossible to get rid of all semantic problems in data integration. An often-used rule of thumb states that about 90% of the development effort is devoted to solving the remaining 10% of hard cases. In an attempt to significantly decrease human effort at data integration time, we have proposed an approach that stores any remaining semantic uncertainty and conflicts in a probabilistic database, so that the integration result can already be used meaningfully. The main development effort in our approach is devoted to defining and tuning knowledge rules and thresholds. Rules and thresholds directly impact the size and quality of the integration result. We measure integration quality indirectly by measuring the quality of answers to queries on the integrated data set in an information retrieval-like way. The main contribution of this report is an experimental investigation of the effects and sensitivity of rule definition and threshold tuning on the integration quality. This shows that our approach indeed reduces development effort, rather than merely shifting it to rule definition and threshold tuning: setting rough safe thresholds and defining only a few rules suffices to produce a ‘good enough’ integration that can be meaningfully used.

CiteSeerX

University of Twente Research Information

Quality Measures in Uncertain Data Management

Author: Keijzer A. de
Keulen M. van
Publication venue: Springer Verlag
Publication date: 01/01/2007

Many applications deal with data that is uncertain. Some examples are applications dealing with sensor information, data integration applications and healthcare applications. Instead of these applications having to deal with the uncertainty, it should be the responsibility of the DBMS to manage all data including uncertain data. Several projects do research on this topic. In this paper, we introduce four measures to be used to assess and compare important characteristics of data and systems

University of Twente Research Information

User Feedback in Probabilistic XML

Author: Keijzer A. de
Keulen M. van
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2007

Data integration is a challenging problem in many application areas. Approaches mostly attempt to resolve semantic uncertainty and conflicts between information sources as part of the data integration process. In some application areas, this is impractical or even prohibitive, for example, in an ambient environment where devices on an ad hoc basis have to exchange information autonomously. We have proposed a probabilistic XML approach that allows data integration without user involvement by storing semantic uncertainty and conflicts in the integrated XML data. As a consequence, the integrated information source represents all possible appearances of objects in the real world, the so-called possible worlds. In this paper, we show how user feedback on query results can resolve semantic uncertainty and conflicts in the integrated data. Hence, user involvement is effectively postponed to query time, when a user is already interacting actively with the system. The technique relates positive and negative statements on query answers to the possible worlds of the information source, thereby either reinforcing, penalizing, or eliminating possible worlds. We show that after repeated user feedback, an integrated information source better resembles the real world and may converge towards a non-probabilistic information source.
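
The feedback idea in this abstract can be illustrated with a minimal sketch. It is not the authors' technique: it assumes each possible world is simply a probability paired with the set of answers a fixed query returns in that world, and it implements only hard elimination followed by renormalization (the abstract's softer reinforcing and penalizing are not reproduced). The function name `apply_feedback` and the sample data are hypothetical.

```python
from fractions import Fraction


def apply_feedback(worlds, answer, is_correct):
    """Condition a set of possible worlds on one user statement.

    worlds     -- list of (probability, set_of_query_answers) pairs (assumed model)
    answer     -- the query answer the user commented on
    is_correct -- True for positive feedback, False for negative feedback

    Worlds that contradict the statement are eliminated; the probabilities
    of the remaining worlds are renormalized so they sum to 1.
    """
    kept = [(p, answers) for p, answers in worlds
            if (answer in answers) == is_correct]
    total = sum(p for p, _ in kept)
    if total == 0:
        raise ValueError("feedback contradicts every possible world")
    return [(p / total, answers) for p, answers in kept]


# Hypothetical integration result: are "John" and "Jon" one person or two?
worlds = [
    (Fraction(1, 2), {"John", "Jon"}),   # two distinct persons
    (Fraction(3, 10), {"John"}),         # one person, spelled "John"
    (Fraction(1, 5), {"Jon"}),           # one person, spelled "Jon"
]

after_positive = apply_feedback(worlds, "John", True)
after_negative = apply_feedback(after_positive, "Jon", False)
print(after_positive)
print(after_negative)
```

In this toy run, the positive statement removes one world and the negative statement leaves a single world with probability 1, which mirrors the convergence towards a non-probabilistic source that the abstract describes.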

University of Twente Research Information

How Noisy Data Affects Geometric Semantic Genetic Programming

Author: Falco I. De
Keijzer M.
Kharin Yu.
Koza J. R.
Sáez J. A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/07/2017

Noise is a consequence of acquiring and pre-processing data from the environment, and shows fluctuations from different sources, for example from sensors, signal processing technology, or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensitivity to noisy data. However, there is no systematic study on this matter. This paper performs a deep analysis of GSGP performance in the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results obtained with those of a canonical GP. The results show that, as we increase the percentage of noisy instances, the generalization performance degradation is more pronounced in GSGP than in GP. However, in general, GSGP is more robust to noise than GP in the presence of up to 10% of noise, and presents no statistical difference for values higher than that in the test bed. Comment: 8 pages, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2017), Berlin, Germany

arXiv.org e-Print Archive

Crossref

Effects of flywheel training on strength-related variables in female populations. A systematic review

Author: Beato M.
Bishop C.
de Keijzer K.
Raya-González J.
Publication venue: Taylor and Francis (Routledge)
Publication date: 01/01/2022

This study aimed to evaluate the effect of flywheel training on female populations, report practical recommendations for practitioners based on the currently available evidence, underline the limitations of current literature, and establish future research directions. Studies were searched through the electronic databases (PubMed, SPORTDiscus, and Web of Science) following the preferred reporting items for systematic reviews and meta-analysis statement guidelines. The methodological quality of the seven studies included in this review ranged from 10 to 19 points (good to excellent), with an average score of 14 points (good). These studies were carried out between 2004 and 2019 and comprised a total of 100 female participants. The training duration ranged from 5 weeks to 24 weeks, with volume ranging from 1 to 4 sets and 7 to 12 repetitions, and frequency ranging from 1 to 3 times a week. The contemporary literature suggests that flywheel training is a safe and time-effective strategy to enhance physical outcomes in young and elderly females. With this information, practitioners may be inclined to prescribe flywheel training as an effective countermeasure for injuries or falls and as a potent stimulus for physical enhancement.

Middlesex University Research Repository

Trio-One: Layering Uncertainty and Lineage on a Conventional DBMS

Author: Agrawal P.
Benjelloun O.
Das Sarma A.
Keijzer A. de
Murthy R.
Mutsuzaki M.
Sugihara T.
Theobald M.
Widom J.
Publication date: 01/01/2007

Trio is a new kind of database system that supports data, uncertainty, and lineage in a fully integrated manner. The first Trio prototype, dubbed Trio-One, is built on top of a conventional DBMS using data and query translation techniques together with a small number of stored procedures. This paper describes Trio-One's translation scheme and system architecture, showing how it efficiently and easily supports the Trio data model and query language

CiteSeerX

DBIS EPub

University of Twente Research Information

MPG.PuRe

Efficient Equilibria in Polymatrix Coordination Games

Author: A Bogomolnaia
B Keijzer de
D Monderer
E Koutsoupias
G Christodoulou
I Caragiannis
KR Apt
M Yannakakis
RJ Aumann
Y Bachrach
Publication date: 01/01/2015

We consider polymatrix coordination games with individual preferences where every player corresponds to a node in a graph who plays with each neighbor a separate bimatrix game with non-negative symmetric payoffs. In this paper, we study $\alpha$-approximate $k$-equilibria of these games, i.e., outcomes where no group of at most $k$ players can deviate such that each member increases his payoff by at least a factor $\alpha$. We prove that for $\alpha \ge 2$ these games have the finite coalitional improvement property (and thus $\alpha$-approximate $k$-equilibria exist), while for $\alpha < 2$ this property does not hold. Further, we derive an almost tight bound of $2\alpha(n-1)/(k-1)$ on the price of anarchy, where $n$ is the number of players; in particular, it scales from unbounded for pure Nash equilibria ($k = 1$) to $2\alpha$ for strong equilibria ($k = n$). We also settle the complexity of several problems related to the verification and existence of these equilibria. Finally, we investigate natural means to reduce the inefficiency of Nash equilibria. Most promisingly, we show that by fixing the strategies of $k$ players the price of anarchy can be reduced to $n/k$ (and this bound is tight).
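
To make the equilibrium notion in this abstract concrete, the following is a brute-force sketch for very small games. It rests on assumptions that are not taken from the paper: a player's payoff is modelled as an individual preference value plus the sum of the shared edge payoffs with each neighbor, and a deviation counts as profitable only if every coalition member's payoff becomes strictly greater than $\alpha$ times the old payoff (the abstract's "at least" could also be read as non-strict). All names (`payoff`, `is_alpha_k_equilibrium`, `edges`, `prefs`) are hypothetical, and the search is exponential in the number of players.

```python
from itertools import combinations, product


def payoff(i, s, edges, prefs):
    """Payoff of player i under joint strategy s (assumed payoff model)."""
    total = prefs[i][s[i]]
    for (u, v), table in edges.items():
        if i in (u, v):
            total += table[(s[u], s[v])]   # both endpoints receive the same value
    return total


def is_alpha_k_equilibrium(s, strategies, edges, prefs, alpha, k):
    """Return True if no coalition of at most k players has a joint deviation
    that raises every member's payoff above alpha times its current payoff."""
    n = len(s)
    base = [payoff(i, s, edges, prefs) for i in range(n)]
    for size in range(1, k + 1):
        for coalition in combinations(range(n), size):
            for dev in product(*(strategies[i] for i in coalition)):
                # Every coalition member must actually change strategy.
                if any(choice == s[i] for i, choice in zip(coalition, dev)):
                    continue
                t = list(s)
                for i, choice in zip(coalition, dev):
                    t[i] = choice
                if all(payoff(i, t, edges, prefs) > alpha * base[i]
                       for i in coalition):
                    return False
    return True


# Two players on one edge; coordinating on "b" pays 3, on "a" pays 1.
strategies = [["a", "b"], ["a", "b"]]
edges = {(0, 1): {("a", "a"): 1, ("b", "b"): 3, ("a", "b"): 0, ("b", "a"): 0}}
prefs = [{"a": 0, "b": 0}, {"a": 0, "b": 0}]

nash = is_alpha_k_equilibrium(("a", "a"), strategies, edges, prefs, alpha=1, k=1)
strong = is_alpha_k_equilibrium(("a", "a"), strategies, edges, prefs, alpha=1, k=2)
approx = is_alpha_k_equilibrium(("a", "a"), strategies, edges, prefs, alpha=3, k=2)
print(nash, strong, approx)
```

In this toy game the outcome ("a", "a") survives unilateral deviations but not a joint move to ("b", "b"), unless the required improvement factor is raised to 3.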

arXiv.org e-Print Archive

Crossref

VU Research Portal

CWI's Institutional Repository

Schrijven over cassatierechtspraak

Author: Graaff R. (Ruben) de
Keijzer T.A. (Titiaan)
Kluiver C.C. (Charlotte) de
Samadi M. (Mojan)
Publication date: 01/01/2016

Coherent privaatrech

EUR Research Repository

Leiden University Scholarly Publications

Erasmus University Digital Repository

Hernieuwde aandacht voor het tuchtrecht

Author: Graaff R. (Ruben) de
Keijzer T.A. (Titiaan)
Kluiver C.C. (Charlotte) de
Samadi M. (Mojan)
Publication date: 01/08/2016

Erasmus University Digital Repository

Nearly optimal solutions for the Chow Parameters Problem and low-weight approximation of halfspaces

Author: Anindya De
Aziz H.
Banzhaf J.
Cheraghchi M.
de Keijzer B.
Dertouzos M.
Feldman V.
Felsenthal D.
Freixas J.
Ilias Diakonikolas
Muroga S.
Rocco A. Servedio
Takamiya K.
Tannenbaum M.
Vitaly Feldman
Winder R. O.
Publication date: 05/06/2012

The Chow parameters of a Boolean function $f: \{-1,1\}^n \to \{-1,1\}$ are its $n+1$ degree-0 and degree-1 Fourier coefficients. It has been known since 1961 (Chow, Tannenbaum) that the (exact values of the) Chow parameters of any linear threshold function $f$ uniquely specify $f$ within the space of all Boolean functions, but until recently (O'Donnell and Servedio) nothing was known about efficient algorithms for reconstructing $f$ (exactly or approximately) from exact or approximate values of its Chow parameters. We refer to this reconstruction problem as the Chow Parameters Problem. Our main result is a new algorithm for the Chow Parameters Problem which, given (sufficiently accurate approximations to) the Chow parameters of any linear threshold function $f$, runs in time $\tilde{O}(n^2) \cdot (1/\epsilon)^{O(\log^2(1/\epsilon))}$ and with high probability outputs a representation of an LTF $f'$ that is $\epsilon$-close to $f$. The only previous algorithm (O'Donnell and Servedio) had running time $\mathrm{poly}(n) \cdot 2^{2^{\tilde{O}(1/\epsilon^2)}}$. As a byproduct of our approach, we show that for any linear threshold function $f$ over $\{-1,1\}^n$, there is a linear threshold function $f'$ which is $\epsilon$-close to $f$ and has all weights that are integers at most $\sqrt{n} \cdot (1/\epsilon)^{O(\log^2(1/\epsilon))}$. This significantly improves the best previous result of Diakonikolas and Servedio which gave a $\mathrm{poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2/3})}$ weight bound, and is close to the known lower bound of $\max\{\sqrt{n}, (1/\epsilon)^{\Omega(\log \log (1/\epsilon))}\}$ (Goldberg, Servedio). Our techniques also yield improved algorithms for related problems in learning theory.
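
The definition in the first sentence of this abstract can be made concrete with a small sketch. It only computes the Chow parameters by brute force under the uniform distribution on $\{-1,1\}^n$, using the standard formulas $\hat{f}(\emptyset) = \mathbf{E}[f(x)]$ and $\hat{f}(i) = \mathbf{E}[f(x)\,x_i]$; it is not the reconstruction algorithm the paper describes, and it takes time exponential in $n$. The function names `chow_parameters` and `majority3` are hypothetical.

```python
from itertools import product


def chow_parameters(f, n):
    """Return [f_hat(empty), f_hat(1), ..., f_hat(n)] for f: {-1,1}^n -> {-1,1}.

    Brute-force enumeration of all 2**n inputs under the uniform
    distribution; intended only for small n.
    """
    points = list(product((-1, 1), repeat=n))
    total = len(points)
    degree0 = sum(f(x) for x in points) / total
    degree1 = [sum(f(x) * x[i] for x in points) / total for i in range(n)]
    return [degree0] + degree1


def majority3(x):
    """Majority of three +/-1 bits, a simple linear threshold function."""
    return 1 if sum(x) > 0 else -1


params = chow_parameters(majority3, 3)
print(params)  # [0.0, 0.5, 0.5, 0.5]
```

For the three-bit majority function the degree-0 coefficient is 0 because the function is balanced, and each degree-1 coefficient is 1/2 because the output agrees with any single input bit on three quarters of the inputs.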

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer