8,744 research outputs found
Augmented Rotation-Based Transformation for Privacy-Preserving Data Clustering
Multiple rotation-based transformation (MRBT) was introduced recently for
mitigating the apriori-knowledge independent component analysis (AK-ICA) attack
on rotation-based transformation (RBT), which is used for privacy-preserving
data clustering. MRBT is shown to mitigate the AK-ICA attack but at the expense
of data utility by not enabling conventional clustering. In this paper, we
extend the MRBT scheme and introduce an augmented rotation-based transformation
(ARBT) scheme that utilizes linearity of transformation and that both mitigates
the AK-ICA attack and enables conventional clustering on data subsets
transformed using the MRBT. In order to demonstrate the computational
feasibility aspect of ARBT along with RBT and MRBT, we develop a toolkit and
use it to empirically compare the different schemes of privacy-preserving data
clustering based on data transformation in terms of their overhead and privacy. Comment: 11 pages, 11 figures, and 6 tables
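The abstract above relies on a standard property of rotation-based transformation: multiplying the data by an orthogonal matrix leaves pairwise Euclidean distances unchanged, which is why distance-based clustering still works on the transformed data. The following is a minimal sketch of plain RBT only, not of MRBT or ARBT, and not the authors' toolkit. The function names, the random data, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np


def random_rotation(d, rng):
    """Draw a d x d rotation matrix (orthogonal, determinant +1)."""
    # The Q factor of a QR decomposition is orthogonal.
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    # Negating one column keeps Q orthogonal and flips the determinant sign.
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


rng = np.random.default_rng(0)
data = rng.standard_normal((6, 3))   # hypothetical records, one per row
rotation = random_rotation(3, rng)
transformed = data @ rotation.T      # the rotated (published) version

# Distances are preserved, so clustering on `transformed` matches clustering on `data`.
assert np.allclose(pairwise_distances(data), pairwise_distances(transformed))
```

The same distance preservation is what makes a single fixed rotation a target for attacks that try to recover it, which is the weakness the abstract says MRBT and ARBT aim to mitigate.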
Privacy in Social Media: Identification, Mitigation and Applications
The increasing popularity of social media has attracted a huge number of
people to participate in numerous activities on a daily basis. This results in
tremendous amounts of rich user-generated data. This data provides
opportunities for researchers and service providers to study and better
understand users' behaviors and further improve the quality of the personalized
services. Publishing user-generated data risks exposing individuals' privacy.
Users' privacy in social media is an emerging research topic that has attracted increasing
attention in recent years. Existing work studies privacy issues in social media
from two different points of view: identification of vulnerabilities and
mitigation of privacy risks. Recent research has shown the vulnerability of
user-generated data against the two general types of attacks, identity
disclosure and attribute disclosure. These privacy issues mandate social media
data publishers to protect users' privacy by sanitizing user-generated data
before publishing it. Consequently, various protection techniques have been
proposed to anonymize user-generated social media data. There is a vast
literature on privacy of users in social media from many perspectives. In this
survey, we review the key achievements of user privacy in social media. In
particular, we review and compare the state-of-the-art algorithms in terms of
the privacy leakage attacks and anonymization algorithms. We overview the
privacy risks from different aspects of social media and categorize the
relevant works into five groups: 1) graph data anonymization and
de-anonymization, 2) author identification, 3) profile attribute disclosure, 4)
user location and privacy, and 5) recommender systems and privacy issues. We
also discuss open problems and future research directions for user privacy
issues in social media. Comment: This survey is currently under review
A Novel Framework using Elliptic Curve Cryptography for Extremely Secure Transmission in Distributed Privacy Preserving Data Mining
Privacy-preserving data mining is a method that ensures the privacy of
individual information during mining. The most important task involves retrieving
information from multiple distributed databases. Once the data is in the
data warehouse, it can be used by mining algorithms to retrieve confidential
information. The proposed framework has two major tasks: secure transmission
and privacy of confidential information during mining. Secure transmission is
handled using elliptic curve cryptography, and data distortion is used for privacy
preservation, ensuring a highly secure environment. Comment: 8 pages
A Unified Framework for Clustering Constrained Data without Locality Property
In this paper, we consider a class of constrained clustering problems of
points in $\mathbb{R}^{d}$, where $d$ could be rather high. A common feature of
these problems is that their optimal clusterings no longer have the locality
property (due to the additional constraints), which is a key property required
by many algorithms for their unconstrained counterparts. To overcome the
difficulty caused by the loss of locality, we present in this paper a unified
framework, called {\em Peeling-and-Enclosing (PnE)}, to iteratively solve two
variants of the constrained clustering problems, {\em constrained $k$-means
clustering} ($k$-CMeans) and {\em constrained $k$-median clustering}
($k$-CMedian). Our framework is based on two standalone geometric techniques,
called {\em Simplex Lemma} and {\em Weaker Simplex Lemma}, for $k$-CMeans and
$k$-CMedian, respectively. The simplex lemma (or weaker simplex lemma) enables
us to efficiently approximate the mean (or median) point of an unknown set of
points by searching a small-size grid, independent of the dimensionality of the
space, in a simplex (or the surrounding region of a simplex), and thus can be
used to handle high dimensional data. If $k$ and the approximation parameter are fixed
numbers, our framework generates, in time nearly linear in the number of points $n$, $k$-tuple candidates for the mean or median
points, and one of them induces an approximation for $k$-CMeans
or $k$-CMedian. Combining this unified
framework with a problem-specific selection algorithm (which determines the
best $k$-tuple candidate), we obtain an approximation for each of
the constrained clustering problems. We expect that our technique will be
applicable to other constrained clustering problems without locality
Secure Mining of Association Rules in Horizontally Distributed Databases
We propose a protocol for secure mining of association rules in horizontally
distributed databases. The current leading protocol is that of Kantarcioglu and
Clifton (TKDE 2004). Our protocol, like theirs, is based on the Fast
Distributed Mining (FDM) algorithm of Cheung et al. (PDIS 1996), which is an
unsecured distributed version of the Apriori algorithm. The main ingredients in
our protocol are two novel secure multi-party algorithms --- one that computes
the union of private subsets that each of the interacting players hold, and
another that tests the inclusion of an element held by one player in a subset
held by another. Our protocol offers enhanced privacy with respect to the
protocol of Kantarcioglu and Clifton. In addition, it is simpler and is
significantly more efficient in terms of communication rounds, communication
cost and computational cost
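The abstract above concerns association-rule mining over horizontally distributed databases, where each site holds different transactions over the same items. As a rough illustration of the quantity being computed, the sketch below counts the global support of itemsets by summing local counts across sites. It is deliberately unsecured and brute-force: it is neither the proposed protocol nor FDM, and it omits Apriori's level-wise pruning. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def local_support(transactions, itemset):
    """Number of transactions at one site that contain the whole itemset."""
    return sum(1 for t in transactions if itemset <= t)


def global_frequent_itemsets(partitions, size, min_support):
    """Itemsets of the given size whose support over all sites meets the threshold.

    Unsecured sketch: every site's local count is revealed to whoever sums them.
    """
    total = sum(len(p) for p in partitions)
    items = sorted(set().union(*(t for p in partitions for t in p)))
    frequent = {}
    for candidate in combinations(items, size):
        itemset = frozenset(candidate)
        # Global support is the sum of the local supports.
        count = sum(local_support(p, itemset) for p in partitions)
        if count / total >= min_support:
            frequent[itemset] = count
    return frequent


# Two hypothetical sites holding different transactions over the same items.
site_a = [frozenset({"bread", "milk"}), frozenset({"bread", "eggs"})]
site_b = [frozenset({"bread", "milk", "eggs"}), frozenset({"milk"})]

result = global_frequent_itemsets([site_a, site_b], size=2, min_support=0.5)
```

In this sketch the local counts are exchanged in the clear. Protocols of the kind the abstract describes aim to reach the same global result without revealing that per-site information.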
A Survey on Software-Defined VANETs: Benefits, Challenges, and Future Directions
Fifth Generation (5G) networks are becoming more readily available and are a
major driver of the growth of new applications and business models. Vehicular
Ad hoc Networks (VANETs) and Software Defined Networking (SDN) represent key
enablers of 5G technology with the development of next-generation intelligent
vehicular networks and applications. In recent years, researchers have focused
on the integration of SDN and VANET, examining topics related to the
architecture, the benefits of software-defined VANET services, and the new
functionalities needed to adapt them. However, the security and robustness of
the complete architecture are still questionable and have been largely
neglected. Moreover, the deployment and integration of novel entities and
several architectural components introduce new security threats and
vulnerabilities. In this paper, we first survey the state-of-the-art SDN-based
Vehicular Ad hoc Network (SDVN) architectures for their networking
infrastructure design, functionalities, benefits, and challenges. Then we
discuss these SDVN architectures against major security threats that violate
key security services such as availability, confidentiality, authentication,
and data integrity. We also propose different countermeasures to these threats.
Finally, we discuss the lessons learned, with directions for future research
work towards provisioning stringent security and privacy solutions in future
SDVN architectures. To the best of our knowledge, this is the first
comprehensive work that presents such a survey and analysis on SDVNs in the era
of future generation networks (e.g., 5G and information-centric networking) and
applications (e.g., intelligent transportation systems and IoT-enabled
advertising in VANETs). Comment: 17 pages, 2 figures
Holistic Collaborative Privacy Framework for Users' Privacy in Social Recommender Service
The current business model for existing recommender services is centered
around the availability of users' personal data at their side whereas consumers
have to trust that the recommender service providers will not use their data in
a malicious way. With the increasing number of cases for privacy breaches,
different countries and corporations have issued privacy laws and regulations
to define the best practices for the protection of personal information. The
data protection directive 95/46/EC and the privacy principles established by
the Organization for Economic Cooperation and Development (OECD) are examples
of such regulation frameworks. In this paper, we assert that utilizing
third-party recommender services to generate accurate referrals is feasible
while preserving the privacy of users' sensitive information, which will
reside in clear form only on each user's own device. As a result, each user who
benefits from the third-party recommender service will have absolute control
over what to release from his/her own preferences. We propose a collaborative
privacy middleware that executes a two-stage concealment process within a
distributed data collection protocol in order to achieve this.
Additionally, the proposed solution complies, in a natural and functional way,
with one of the common privacy regulation frameworks for fair information
practice: the OECD privacy principles. The approach presented in this paper is
easily integrated into the current business model, as it is implemented using a
middleware that runs at the end user's side and utilizes the social nature of
content distribution services to implement a topological data collection
protocol.
Parallel and Distributed Collaborative Filtering: A Survey
Collaborative filtering is amongst the most preferred techniques when
implementing recommender systems. Recently, great interest has turned towards
parallel and distributed implementations of collaborative filtering algorithms.
This work is a survey of the parallel and distributed collaborative filtering
implementations, aiming not only to provide a comprehensive presentation of the
field's development, but also to offer future research orientation by
highlighting the issues that need to be further developed. Comment: 46 pages
Security and Privacy Issues in Deep Learning
With the development of machine learning (ML), expectations for artificial
intelligence (AI) technology have been increasing daily. In particular, deep
neural networks have shown outstanding performance results in many fields. Many
applications are deeply involved in our daily life, such as making significant
decisions in application areas based on predictions or classifications, in
which a DL model could be relevant. Hence, if a DL model causes mispredictions
or misclassifications due to malicious external influences, then it can cause
very large difficulties in real life. Moreover, training DL models involves an
enormous amount of data and the training data often include sensitive
information. Therefore, DL models should not expose the privacy of such data.
In this paper, we review the vulnerabilities and the developed defense methods
on the security of the models and data privacy under the notion of secure and
private AI (SPAI). We also discuss current challenges and open issues
Scalable attribute-aware network embedding with locality
Adding attributes for nodes to network embedding helps to improve the ability
of the learned joint representation to depict features from topology and
attributes simultaneously. Recent research on the joint embedding has exhibited
a promising performance on a variety of tasks by jointly embedding the two
spaces. However, because they require global
information, present approaches do not scale well. Here we
propose \emph{SANE}, a scalable attribute-aware network embedding algorithm
with locality, to learn the joint representation from topology and attributes.
By enforcing the alignment of a local linear relationship between each node and
its K-nearest neighbors in topology and attribute space, the joint embedding
representations are more informative compared with a single representation
from topology or attributes alone. And we argue that the locality in
\emph{SANE} is the key to learning the joint representation at scale. By using
several real-world networks from diverse domains, we demonstrate the efficacy
of \emph{SANE} in terms of performance and scalability. For performance
on label classification, \emph{SANE} achieves the highest F1-score
on most datasets and comes even closer to the baseline method that needs label
information as extra input than other state-of-the-art joint
representation algorithms do. Moreover, \emph{SANE} has an up to 71.4\%
performance gain compared with the single topology-based algorithm. For
scalability, we demonstrate the linear time complexity of \emph{SANE}.
In addition, we observe that when the network size scales to
100,000 nodes, the "learning joint embedding" step of \emph{SANE} takes only
seconds.