Privacy-preserving friend recommendations in online social networks
Online social networks, such as Facebook and Google+, have emerged as a communication service that lets users stay in touch and share information with family members and friends over the Internet. Because users generate huge amounts of data on social network sites, an interesting question is how to mine this data to retrieve useful information. Along this direction, social network analysis has emerged as an important tool for many business intelligence applications, such as identifying potential customers and promoting items based on their interests. In particular, since users are often interested in making new friends, a friend recommendation application provides a medium for users to expand their social connections and share information of interest with more friends. It also helps to enhance the development of the entire network structure. Existing friend recommendation methods utilize social network structure and/or user profile information. However, these methods are no longer applicable if the privacy of users is taken into consideration. This work introduces a set of privacy-preserving friend recommendation protocols based on different existing similarity metrics in the literature. Briefly, depending on the underlying similarity metric used, the proposed protocols guarantee the privacy of a user's personal information such as friend lists. These protocols are the first to make the friend recommendation process possible in privacy-enhanced social networking environments. This work also considers the case of outsourced social networks, where users' profile data are encrypted and outsourced to third-party cloud providers who provide social networking services to the users. Under such an environment, this work proposes novel protocols for the cloud to do friend recommendations in a privacy-preserving manner. (Abstract, page iii)
New Fundamental Technologies in Data Mining
The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.
A Scalable Blocking Framework for Multidatabase Privacy-preserving Record Linkage
Today many application domains, such as national statistics,
healthcare, business analytics, fraud detection, and national
security, require data to be integrated from multiple databases.
Record linkage (RL) is a process used in data integration which
links multiple databases to identify matching records that belong
to the same entity. RL enriches the usefulness of data by
removing duplicates, errors, and inconsistencies, which improves
the effectiveness of decision making in data analytics
applications.
Often, organisations are not willing or authorised to share the
sensitive information in their databases with any other party due
to privacy and confidentiality regulations. The linkage of
databases of different organisations is an emerging research area
known as privacy-preserving record linkage (PPRL). PPRL
facilitates the linkage of databases by ensuring the privacy of
the entities in these databases.
In a multidatabase (MD) context, PPRL is significantly challenged
by the intrinsic exponential growth in the number of potential
record pair comparisons. Such linkage often requires significant
time and computational resources to produce the resulting
matching sets of records. Due to the increased risk of collusion,
preserving the privacy of the data becomes more problematic as
the number of parties involved in the linkage process increases.
Blocking is commonly used to scale the linkage of large
databases. The aim of blocking is to remove those record pairs
that correspond to non-matches (that is, pairs that refer to
different entities).
Many techniques have been proposed for RL and PPRL for blocking
two databases. However, many of these techniques are not suitable
for blocking multiple databases. This creates a need to develop
blocking techniques for the multidatabase linkage context as
real-world applications increasingly require more than two
databases.
This thesis is the first to conduct extensive research on
blocking for multidatabase privacy-preserved record linkage
(MD-PPRL). We consider several research problems in blocking of
MD-PPRL. First, we start with a broad review of the background
literature on PPRL. This allows us to identify the main research
gaps that need
to be investigated in MD-PPRL. Second, we introduce a blocking
framework for MD-PPRL which provides more flexibility and control
to database owners in the block generation process. Third, we
propose different techniques that are used in our framework for
(1) blocking of multiple databases, (2) identifying blocks that
need to be compared across subgroups of these databases, and (3)
filtering redundant record pair comparisons by the efficient
scheduling of block comparisons to improve the scalability of
MD-PPRL. Each of these techniques covers an important aspect of
blocking in real-world MD-PPRL applications. Finally, this thesis
reports on an extensive evaluation of the combined application of
these methods with real datasets, which illustrates that they
outperform existing approaches in terms of scalability, accuracy,
and privacy.
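
To illustrate the general idea of blocking described in the abstract above, the following is a minimal Python sketch. It is not the framework proposed in the thesis: the blocking key (first letter of surname plus year of birth), the record fields, and the function names are all hypothetical, and the sketch works on plaintext values, whereas a PPRL setting would use encoded or masked keys. It also only forms candidate sets across all databases at once, not across subgroups of databases.

```python
from collections import defaultdict
from itertools import product


def blocking_key(record):
    # Hypothetical blocking key: surname initial plus year of birth.
    return record["surname"][0].upper() + str(record["yob"])


def build_blocks(database):
    # Group the records of one database by their blocking key.
    blocks = defaultdict(list)
    for record in database:
        blocks[blocking_key(record)].append(record)
    return blocks


def candidate_tuples(databases):
    # Only records that share a blocking key in every database
    # are kept as candidates for detailed comparison.
    all_blocks = [build_blocks(db) for db in databases]
    shared_keys = set(all_blocks[0]).intersection(*all_blocks[1:])
    for key in sorted(shared_keys):
        yield from product(*(blocks[key] for blocks in all_blocks))


# Made-up example records, for illustration only.
db_a = [{"id": "a1", "surname": "Smith", "yob": 1980},
        {"id": "a2", "surname": "Jones", "yob": 1975}]
db_b = [{"id": "b1", "surname": "Smyth", "yob": 1980},
        {"id": "b2", "surname": "Brown", "yob": 1990}]
db_c = [{"id": "c1", "surname": "Smith", "yob": 1980},
        {"id": "c2", "surname": "Jones", "yob": 1975}]

candidates = list(candidate_tuples([db_a, db_b, db_c]))
full_comparisons = len(db_a) * len(db_b) * len(db_c)
print(len(candidates), "candidate tuples instead of", full_comparisons)
```

Without blocking, the number of record tuples to compare is the product of the database sizes, which is why the comparison space grows so quickly as more databases are added. With blocking, only records that share a key are compared, at the cost of missing true matches whose keys differ.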