
69 research outputs found

(α, k)-anonymous data publishing

Author: A. Hundepool
Ada Fu
D. Agrawal
I. Holyer
Jiuyong Li
K. Wang
Ke Wang
L. Cox
L. Sweeney
P. Samarati
R. Agrawal
Raymond Wong
U. M. Fayyad
V. S. Verykios
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Mining probabilistic automata: a statistical view of sequential pattern mining

Author: A. S. Reber
A. V. Evfimievski
C. de la Higuera
E. M. Gold
E. M. Newton
François Jacquenet
G. I. Webb
H. Mannila
J. Ayres
J. Borges
J. Han
J. Pei
J. Shaffer
K. Pearson
L. G. Valiant
L. Sweeney
M. Garofalakis
M. J. Zaki
M. Klemettinen
M. Spiliopoulou
Marc Sebban
P. Dupont
P. Laur
R. A. Fisher
R. Agrawal
R. C. Carrasco
R. J. Bayardo
R. Kosala
R. Srikant
S. Holm
Stéphanie Jacquemont
V. S. Verykios
W. Hoeffding
Y. Benjamini
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Cost Optimal Record/Entity Matching

Author: Elmagarmid Ahmed K.
Moustakides G. V.
Verykios V. S.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/07/2001

Purdue E-Pubs

A generalized cost optimal decision model for record matching

Author: Moustakides G. V.
Verykios V. S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2004

Record (or entity) matching, also called record linkage, is the process of identifying records in one or more data sources that refer to the same real-world entity or object. In record linkage, the ultimate goal of a decision model is to give the decision maker a tool for deciding the actual matching status of a pair of records (e.g., documents, events, persons, or cases). Existing models of record linkage rely on decision rules that minimize the probability of subjecting a case to clerical review, conditional on the probabilities of erroneous matches and erroneous non-matches. In practice, though, (a) the value of an erroneous match is, in many applications, quite different from the value of an erroneous non-match, and (b) the cost and the probability of a misclassification associated with the clerical review are ignored in this way. In this paper, we present a decision model that is optimal with respect to the cost of the record linkage operation and general enough to accommodate multi-class or multi-decision case studies. We also present an example along with the results from applying the proposed model to large comparison spaces. © 2004 ACM
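
As a rough illustration of the cost-based idea in the abstract above, the sketch below picks, for one record pair, the decision with the lowest expected cost. It is a generic minimum-expected-cost rule, not the model from the paper; the cost table, the decision names, and the functions `expected_cost` and `cheapest_decision` are all hypothetical, and the match probability is assumed to come from some upstream comparison step.

```python
from typing import Dict

# Hypothetical cost table: COSTS[decision][true_status].
# "M" = the pair is a true match, "U" = the pair is a true non-match.
# All numbers are illustrative assumptions, not values from the paper.
COSTS: Dict[str, Dict[str, float]] = {
    "match":     {"M": 0.0, "U": 10.0},  # a false match is expensive here
    "non_match": {"M": 4.0, "U": 0.0},   # a missed match is cheaper
    "review":    {"M": 1.0, "U": 1.0},   # clerical review has a flat cost
}


def expected_cost(decision: str, p_match: float,
                  costs: Dict[str, Dict[str, float]] = COSTS) -> float:
    """Expected cost of one decision, given the probability of a true match."""
    return p_match * costs[decision]["M"] + (1.0 - p_match) * costs[decision]["U"]


def cheapest_decision(p_match: float,
                      costs: Dict[str, Dict[str, float]] = COSTS) -> str:
    """Return the decision whose expected cost is lowest for this pair."""
    if not 0.0 <= p_match <= 1.0:
        raise ValueError("p_match must be a probability in [0, 1]")
    return min(costs, key=lambda d: expected_cost(d, p_match, costs))


if __name__ == "__main__":
    for p in (0.95, 0.5, 0.05):
        print(p, cheapest_decision(p))
```

Because the two error costs differ, the thresholds between "match", "review", and "non_match" are asymmetric, which is the situation point (a) of the abstract describes.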

University of Thessaly Institutional Repository

Record Matching: Past, Present and Future

Author: Cochinwala M.
Dalal S.
Elmagarmid Ahmed K.
Verykios V. S.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/07/2001

Purdue E-Pubs

A max-min approach for hiding frequent itemsets

Author: Moustakides G. V.
Verykios V. S.
Publication date: 01/01/2006

In this paper we propose a new algorithmic approach for sanitizing raw data of sensitive knowledge in the context of association rule mining. The new approach (a) relies on the max-min criterion, a decision-theoretic method for maximizing the minimum gain, and (b) builds upon the border theory of frequent itemsets. © 2006 IEEE
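
The max-min criterion named in the abstract above can be illustrated in isolation with a minimal sketch: among several options, choose the one whose worst-case gain is largest. This shows only the decision-theoretic rule, not the paper's hiding algorithm; the function name `maxmin_choice` and the gain values are hypothetical.

```python
from typing import Dict, List


def maxmin_choice(gains: Dict[str, List[float]]) -> str:
    """Return the option whose minimum gain across scenarios is largest."""
    if not gains or any(len(g) == 0 for g in gains.values()):
        raise ValueError("every option needs at least one gain value")
    return max(gains, key=lambda option: min(gains[option]))


if __name__ == "__main__":
    # Illustrative gains for three options under two scenarios.
    example = {"a": [3.0, 9.0], "b": [5.0, 6.0], "c": [1.0, 20.0]}
    print(maxmin_choice(example))  # worst cases are 3, 5 and 1
```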

University of Thessaly Institutional Repository

Reference table based k-anonymous private blocking

Author: Karakasidis A.
Verykios V. S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

Privacy Preserving Record Linkage is an emerging field of research that addresses the classical linkage problem from a privacy-preserving point of view. In this paper we propose a novel approach for performing Privacy Preserving Blocking in order to minimize the computational cost of Privacy Preserving Record Linkage. We achieve this without compromising privacy by using Nearest Neighbors clustering, a well-known clustering algorithm, together with a reference table. A reference table is a publicly known table whose contents are used as intermediate references. The combination of Nearest Neighbors and a reference table gives our approach k-anonymity characteristics. © 2012 ACM

Crossref

University of Thessaly Institutional Repository

Privacy preserving record linkage using phonetic codes

Author: Karakasidis A.
Verykios V. S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

Phonetic codes such as Soundex and Metaphone have been used in the past to address the Record Linkage Problem. However, to the best of our knowledge, no particular effort has been made within this context towards privacy assurance during the matching process. Phonetic codes have an interesting feature that can be a cornerstone for providing privacy: they are mappings of strings that do not exhibit the one-to-one property. In this paper, we present a novel protocol for achieving privacy preserving record linkage using phonetics, provide a proof of correctness for our approach, and finally illustrate experimental results concerning performance and matching accuracy. The proposed protocol can be applied equally well, with comparable results, to codes other than phonetic ones that do not exhibit the one-to-one property, such as hash tables. © 2009 IEEE
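
To make the "not one-to-one" property mentioned in the abstract above concrete, here is a minimal sketch of the commonly described American Soundex rules. It only shows that distinct names can share one code; it is not the protocol proposed in the paper, and the function name `soundex` and the example names are illustrative choices.

```python
# Digit assigned to each consonant group in American Soundex.
CODES = {}
for digit, group in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in group:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name ("" if it has no letters)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    out = [first]
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of equal codes
        code = CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return ("".join(out) + "000")[:4]  # pad with zeros, truncate to 4 characters


if __name__ == "__main__":
    for n in ("Robert", "Rupert", "Smith", "Smyth"):
        print(n, soundex(n))
```

"Robert" and "Rupert" map to the same code, as do "Smith" and "Smyth", so a party that sees only the code cannot tell which of several names produced it.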

University of Thessaly Institutional Repository

Exact knowledge hiding in transactional databases

Author: Gkoulalas-Divanis A.
Verykios V. S.
Publication date: 01/01/2009

The hiding of sensitive knowledge in the form of frequent itemsets has gained increasing attention in recent years. This paper highlights the process of border revision, which is essential for identifying hiding solutions that bear no side-effects, and provides efficient algorithms for computing the revised positive and revised negative borders. By utilizing border revision, we unify the theory behind two exact hiding algorithms that guarantee optimal solutions in terms of both database distortion and side-effects introduced by the hiding process. We then propose a novel extension to one of the hiding algorithms that allows it to identify exact hiding solutions to a much wider range of problems than its original counterpart. Through experimentation, we compare the exact hiding schemes against two state-of-the-art heuristic algorithms and demonstrate their ability to consistently provide higher-quality solutions to a wide variety of hiding problems.

University of Thessaly Institutional Repository