Differential Privacy in Metric Spaces: Numerical, Categorical and Functional Data Under the One Roof
We study Differential Privacy in the abstract setting of Probability on
metric spaces. Numerical, categorical and functional data can be handled in a
uniform manner in this setting. We demonstrate how mechanisms based on data
sanitisation and those that rely on adding noise to query responses fit within
this framework. We prove that once the sanitisation is differentially private,
then so is the query response for any query. We show how to construct
sanitisations for high-dimensional databases using simple 1-dimensional
mechanisms. We also provide lower bounds on the expected error for
differentially private sanitisations in the general metric space setting.
Finally, we consider the question of sufficient sets for differential privacy
and show that, for relaxed differential privacy, any algebra generating the
Borel σ-algebra is a sufficient set. Comment: 18 Page
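
As a concrete illustration of a mechanism that adds noise to query responses, the following minimal sketch applies Laplace noise to a counting query. The abstract does not name a specific mechanism, so the choice of the Laplace mechanism, the function names (`laplace_noise`, `private_count`), the toy `ages` data and the value of `epsilon` are all assumptions made for illustration. It is not a hardened implementation; floating-point subtleties are ignored.

```python
import random


def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy `predicate`.

    Assumption: neighbouring databases differ in one record, so the
    count changes by at most 1 (sensitivity 1) and the noise scale
    is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical toy database: one age per individual.
ages = [23, 35, 41, 52, 29, 64, 47, 38]
rng = random.Random(0)  # seeded only to make the sketch reproducible
noisy_answer = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
```

Smaller values of `epsilon` give a larger noise scale and therefore stronger privacy at the cost of a less accurate answer.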
Ontological Foundations of Data Modeling in Information Systems
In this paper we present propositions which we have argued elsewhere concerning ontology and data models. Additionally, we present evidence relating to our propositions. We have found that Chisholm’s ontology has the potential to be a unifying theory for data models. In addition, our research has led us to the position that ontologies founded in the philosophical tradition of realism seem to serve the purpose of a unifying framework for data models. Further, we have seen the realistic ontologies by Mario Bunge and Roderick Chisholm used in information systems. We believe that realistic ontologies have a role to play in understanding information systems.
Context-dependent feature analysis with random forests
Feature selection is often more complicated than identifying a
single subset of input variables that would together explain the output. There
may be interactions that depend on contextual information, i.e., variables that
prove relevant only in some specific circumstances. In this setting, the
contribution of this paper is to extend the random forest variable importances
framework in order (i) to identify variables whose relevance is
context-dependent and (ii) to characterize as precisely as possible the effect
of contextual information on these variables. The usage and the relevance of
our framework for highlighting context-dependent variables are illustrated on
both artificial and real datasets. Comment: Accepted for presentation at UAI 201
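
To make the notion of context-dependent relevance concrete, the sketch below scores a variable separately within each context. It does not use random forests or the paper's importance measure: empirical mutual information stands in as a simple relevance score, and the synthetic data, the variable roles (`c`, `x`, `z`, `y`) and the function names are all hypothetical.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(pairs):
    """Empirical mutual information, in bits, between the two components of `pairs`."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum(
        (k / n) * math.log2(k * n / (left[a] * right[b]))
        for (a, b), k in joint.items()
    )


def relevance_by_context(rows):
    """Score the relevance of x for y separately within each value of the context c."""
    scores = {}
    for context in sorted({c for c, _, _, _ in rows}):
        subset = [(x, y) for c, x, _, y in rows if c == context]
        scores[context] = mutual_information(subset)
    return scores


# Synthetic rows (c, x, z, y): the output copies x when c == 0 and z when c == 1.
rows = [(c, x, z, x if c == 0 else z) for c, x, z in product((0, 1), repeat=3)]

global_score = mutual_information([(x, y) for _, x, _, y in rows])
context_scores = relevance_by_context(rows)
```

On this toy data, `x` carries one bit of information about the output when `c == 0` and none when `c == 1`, while the global score falls in between and hides that dependence on context.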