17 research outputs found

    All liaisons are dangerous when all your friends are known to us

    Online Social Networks (OSNs) are used by millions of users worldwide. Academically speaking, there is little doubt about the usefulness of demographic studies conducted on OSNs and, hence, methods to label unknown users from small labeled samples are very useful. However, from the general public's point of view, this can be a serious privacy concern. Thus, both topics are tackled in this paper. First, a new algorithm to perform user profiling in social networks is described, and its performance is reported and discussed. Second, the experiments, conducted on information usually considered sensitive, reveal that merely publicizing one's contacts puts privacy at risk; measures to minimize privacy leaks due to social graph data mining are therefore outlined. Comment: 10 pages, 5 tables
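
    As a rough illustration of why a public contact list alone can leak an attribute, the sketch below guesses an unlabeled user's label by majority vote over the labels of their known contacts. This is not the algorithm described in the paper; the function name `infer_label`, the data layout, and the toy labels are assumptions made for the example.

```python
from collections import Counter


def infer_label(user, friends, known_labels):
    """Guess a user's label from the labels of their publicly listed contacts.

    Hypothetical helper for illustration only: a plain majority vote,
    not the profiling algorithm from the paper.
    """
    votes = Counter(
        known_labels[f] for f in friends.get(user, ()) if f in known_labels
    )
    if not votes:
        return None  # no labeled contacts, so nothing can be inferred
    return votes.most_common(1)[0][0]


# Toy data (assumed): "alice" publishes her contact list but not her own label.
friends = {"alice": ["bob", "carol", "dave"], "erin": []}
known_labels = {"bob": "group_a", "carol": "group_a", "dave": "group_b"}

print(infer_label("alice", friends, known_labels))
print(infer_label("erin", friends, known_labels))
```

    Even this naive vote needs nothing from the target user except the contact list, which is the kind of leak the abstract warns about.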

    How to Hide One's Relationships from Link Prediction Algorithms

    Our private connections can be exposed by link prediction algorithms. To date, this threat has only been addressed from the perspective of a central authority, completely neglecting the possibility that members of the social network can themselves mitigate such threats. We fill this gap by studying how an individual can rewire her own network neighborhood to hide her sensitive relationships. We prove that the optimization problem faced by such an individual is NP-complete, meaning that no efficient algorithm for finding an optimal way to hide one’s relationships is known. Based on this, we shift our attention towards developing effective, albeit not optimal, heuristics that are readily applicable by users of existing social media platforms to conceal any connections they deem sensitive. Our empirical evaluation reveals that it is more beneficial to focus on “unfriending” carefully chosen individuals rather than befriending new ones. In fact, by avoiding communication with just 5 individuals, it is possible for one to hide some of her relationships in a massive, real-life telecommunication network, consisting of 829,725 phone calls between 248,763 individuals. Our analysis also shows that link prediction algorithms are more susceptible to manipulation in smaller and denser networks. Evaluating the error vs. attack tolerance of link prediction algorithms reveals that rewiring connections randomly may end up exposing one’s sensitive relationships, highlighting the importance of the strategic aspect. In an age where personal relationships continue to leave digital traces, our results empower the general public to proactively protect their private relationships.
    M.W. was supported by the Polish National Science Centre grant 2015/17/N/ST6/03686. T.P.M. was supported by the Polish National Science Centre grants 2016/23/B/ST6/03599 and 2014/13/B/ST6/01807 (for this and the previous versions of this article, respectively). Y.V. and K.Z. were supported by ARO MURI (grant #W911NF1810208). Y.V. was also supported by the U.S. National Science Foundation (CAREER award IIS-1905558 and grant IIS-1526860). E.M. acknowledges funding by Ministerio de Economía y Competitividad (Spain) through grant FIS2016-78904-C3-3-P.
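
    To make the link-prediction threat in this abstract concrete, the sketch below scores a hidden pair with the common-neighbors measure, one standard similarity index, and shows how "unfriending" a single shared contact lowers that score. It is only a toy illustration under assumed names (`common_neighbors`, `without_edge`, and the five-node graph), not the heuristics developed in the paper.

```python
def common_neighbors(adj, u, v):
    """Common-neighbors similarity: the number of contacts u and v share."""
    return len(adj[u] & adj[v])


def without_edge(adj, a, b):
    """Return a copy of the undirected graph with the edge a-b removed."""
    new_adj = {node: set(nbrs) for node, nbrs in adj.items()}
    new_adj[a].discard(b)
    new_adj[b].discard(a)
    return new_adj


# Toy graph (assumed): the relationship u-v is sensitive and not listed,
# but u and v share the contacts x and y, so a predictor ranks the pair highly.
adj = {
    "u": {"x", "y", "z"},
    "v": {"x", "y"},
    "x": {"u", "v"},
    "y": {"u", "v"},
    "z": {"u"},
}

score_before = common_neighbors(adj, "u", "v")
score_after = common_neighbors(without_edge(adj, "u", "x"), "u", "v")
print(score_before, score_after)
```

    Which edge to drop is the hard part: the abstract notes that the optimal choice is NP-complete to find and that random rewiring can backfire, so a real heuristic would have to choose the edge strategically rather than arbitrarily as done here.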

    Trust assessment of account information services providers in Portugal: Banks, Bigtechs, and Fintechs

    Account Information Services (AIS) enable users to consolidate all of their payment account information in a single platform. Banks, Bigtechs, and Fintechs are the main candidates to compete in the AIS market. Banks argue that consumers’ trust puts them in a favourable position to dominate this market. However, since the recent global financial crisis, the level of trust in banks has been debatable. The main purpose of this research is to determine whether banks are right to regard consumers’ trust as a positive differentiator from other providers. After a careful assessment of the determinants of trust in financial services, a primary data source allowed this research to compare scores between institutions, not only for overall trust but also for each determinant. Furthermore, this study measured the strength of the correlations between the determinants and the overall institutional scores, as well as between individuals’ trust in the financial system and in each institution. The results showed no apparent sign that banks are about to be overthrown as AIS market leaders in Portugal because of the levels of trust in financial services providers. However, individuals’ strong association of banks with the system, along with the highest scores of Bigtechs in several determinants and the opportunity Fintechs gain from their image being disconnected from the system, could change the outlook in the near future.

    Towards Data Privacy and Utility in the Applications of Graph Neural Networks

    Graph Neural Networks (GNNs) are essential for handling graph-structured data, which often contains sensitive information. It’s vital to maintain a balance between data privacy and usability. To address this, this dissertation introduces three studies aimed at enhancing privacy and utility in GNN applications, particularly in node classification, link prediction, and graph classification. The first work tackles celebrity privacy in social networks. We develop a novel framework using adversarial learning for link-privacy-preserving graph embedding, which effectively safeguards sensitive links without compromising the graph’s structure and node attributes. This approach is validated using real social network data. In the second work, we confront challenges in federated graph learning with non-independent and identically distributed (non-IID) data. We introduce PPFL-GNN, a privacy-preserving federated graph neural network framework that mitigates overfitting on the client side and inefficient aggregation on the server side. It leverages local graph data for embeddings and employs embedding alignment techniques for enhanced privacy, addressing the hurdles in federated learning on non-IID graph data. The third work explores few-shot graph classification, which aims to classify novel graph types with limited labeled data. We propose a unique framework combining meta-learning and contrastive learning to better utilize graph structures in molecular and social network datasets. Additionally, we offer benchmark graph datasets with extensive node-attribute dimensions for future research. These studies collectively advance the field of graph-based machine learning by addressing critical issues of data privacy and utility in GNN applications.

    Social Media Analytics and Information Privacy Decisions: Impact of User Intimate Knowledge and Co-ownership Perceptions

    Social media analytics has been recognized as a distinct research field within the analytics subdomain; it processes social media content to generate important business knowledge. Understanding the factors that influence privacy decisions around its use is important because it is often perceived to be opaque and mismanaged. Social media users have been reported to have low intimate knowledge and co-ownership perception of social media analytics and its information privacy decisions. This deficiency leads them to perceive privacy violations if firms make privacy decisions that conflict with their expectations. Such perceived privacy violations often lead to business disruptions caused by user rebellions, regulatory interventions, firm reputation damage, and other business continuity threats. Existing research has developed theoretical frameworks for multilevel information privacy management and called for empirical testing of which constructs would increase user self-efficacy in negotiating with firms for joint social media analytics decision making. This study responds to that call by measuring the constructs in the literature that lead to normative social media analytics and its information privacy decisions. The study model was developed by combining the relevant constructs from the theory of psychological ownership in organizations and the theory of multilevel information privacy. From psychological ownership theory, the impact of intimate knowledge on co-ownership perception of social media analytics was added. From the theory of multilevel information privacy, the impact of co-ownership perception on two antecedents of information privacy decisions, the social identity assumed and the information privacy norms used, was examined. In addition, the moderating role of the cost and benefit components of the privacy calculus on the relationship between information privacy norms and expected information privacy decisions was measured. 
A quantitative research approach was used to measure these factors. A web-based survey was developed using survey items obtained from prior studies that measured these constructs, with only minor wording changes. A pilot study of 34 participants was conducted to test and finalize the instrument. The survey was distributed to adult social media users in the United States of America on a crowdsourcing marketplace using a commercial online survey service. A total of 372 responses were accepted and analyzed. The partial least squares structural equation modeling method was used to assess the model and analyze the data, using the SmartPLS 3 statistical software package. An increase in intimate knowledge of social media analytics led to higher co-ownership perception among social media users. Higher levels of co-ownership perception led to a higher expectation of adoption of a salient social identity and higher expected information privacy norms. In addition, higher levels of expectation of social information privacy norm use led to normative privacy decisions. Higher levels of benefit estimation in the privacy calculus negatively moderated the relationship between social norms and privacy decision making. Co-ownership perception did not have a significant effect on the cost estimation in the social media analytics privacy calculus. Similarly, the cost estimation in the privacy calculus did not have a significant effect on the relationship between information privacy norm adoption and the expectation of a normative information privacy decision. The findings of the study contribute to the information systems literature in both theory and practice. The study is one of the few to further develop multilevel information privacy theory by adding the intimate knowledge construct. 
The study model contributes to the literature because it is one of the first to combine elements of the theory of psychological ownership in organizations with the theory of multilevel information privacy, and to validate them, in order to understand what social media users expect when social media analytics information privacy decisions are made. The study also contributes by suggesting approaches practitioners can use to collaboratively manage their social media analytics information privacy decisions, which were previously perceived to be opaque and under-examined. Practical suggestions were offered that social media firms could use to decrease negative user affect and engender deeper information privacy collaboration with users as they seek to benefit from social media analytics.

    Privacy Preserving User Data Publication In Social Networks

    Recent trends show that the popularity of Social Networks (SNs) has been increasing rapidly. From daily communication sites to online communities, an average person's daily life has become dependent on these online networks. Additionally, the number of people using at least one social network has increased drastically over the years. It is estimated that by the end of the year 2020, one-third of the world's population will have social accounts. Hence, user privacy protection has gained wide attention in the research community. It has also become evident that these networks should be protected from unwanted intruders. In this dissertation, we consider data privacy on online social networks at the network level and the user level. Network-level privacy helps prevent information leakage to third parties such as advertisers. To achieve such privacy, we propose various schemes that combine the privacy of all the elements of a social network (node, edge, and attribute privacy) by clustering the users based on their attribute similarity. We combine the concepts of k-anonymity and l-diversity to achieve user privacy. To provide user-level privacy, we consider the scenario of mobile social networks, where user location privacy is frequently compromised. We provide a distributed solution in which users in an area come together to achieve their desired privacy constraints. We also consider the mobility of the user and the network to provide much better results.
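
    The combination of k-anonymity and l-diversity mentioned in this abstract can be sketched as a simple check on a published table: every group of records sharing the same quasi-identifier values must hold at least k records and at least l distinct sensitive values. The code below is a minimal sketch of that check using the simplest ("distinct") form of l-diversity; the function name, field names, and toy records are assumptions, and the dissertation's clustering schemes are not reproduced here.

```python
from collections import defaultdict


def satisfies_k_anonymity_l_diversity(records, quasi_ids, sensitive, k, l):
    """Check k-anonymity and distinct l-diversity on a list of dict records.

    Records are grouped by their quasi-identifier values. Each group must
    contain at least k records (k-anonymity) and at least l distinct
    sensitive values (distinct l-diversity).
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_ids)
        groups[key].append(record[sensitive])
    return all(
        len(values) >= k and len(set(values)) >= l
        for values in groups.values()
    )


# Toy published table (assumed): ages generalized to ranges, cities to regions.
records = [
    {"age": "20-29", "region": "A", "interest": "sports"},
    {"age": "20-29", "region": "A", "interest": "music"},
    {"age": "30-39", "region": "B", "interest": "politics"},
    {"age": "30-39", "region": "B", "interest": "politics"},
]

print(satisfies_k_anonymity_l_diversity(records, ["age", "region"], "interest", k=2, l=1))
print(satisfies_k_anonymity_l_diversity(records, ["age", "region"], "interest", k=2, l=2))
```

    The second call fails because the second group is 2-anonymous yet every member shares the same sensitive value, which is the gap l-diversity is meant to close.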

    RANDOMIZATION BASED PRIVACY PRESERVING CATEGORICAL DATA ANALYSIS

    The success of data mining relies on the availability of high quality data. To ensure quality data mining, effective information sharing between organizations becomes a vital requirement in today’s society. Since data mining often involves sensitive information of individuals, the public has expressed a deep concern about their privacy. Privacy-preserving data mining is the study of eliminating privacy threats while, at the same time, preserving useful information in the released data for data mining. This dissertation investigates data utility and privacy of randomization-based models in privacy preserving data mining for categorical data. For the analysis of data utility in the randomization model, we first investigate the accuracy analysis for association rule mining in market basket data. Then we propose a general framework to conduct theoretical analysis on how the randomization process affects the accuracy of various measures adopted in categorical data analysis. We also examine data utility when randomization mechanisms are not provided to data miners to achieve better privacy. We investigate how various objective association measures between two variables may be affected by randomization. We then extend it to multiple variables by examining the feasibility of hierarchical loglinear modeling. Our results provide a reference to data miners about what they can and cannot do with certainty upon randomized data directly, without knowledge of the original distribution of the data and the distortion information. Data privacy and data utility are commonly considered as a pair of conflicting requirements in privacy preserving data mining applications. In this dissertation, we investigate privacy issues in randomization models. In particular, we focus on the attribute disclosure under linking attack in data publishing. 
We propose efficient solutions to determine optimal distortion parameters such that we can maximize utility preservation while still satisfying privacy requirements. We compare our randomization approach with l-diversity and anatomy in terms of utility preservation (under the same privacy requirements) from three aspects (reconstructed distributions, accuracy of answering queries, and preservation of correlations). Our empirical results show that randomization incurs significantly smaller utility loss.
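
    A small example may help show what "randomization" and "reconstructed distributions" mean for categorical data. In the sketch below, each value is kept with probability p and otherwise replaced by a uniformly random category; with m categories, the observed frequency of category i is then p·π_i + (1 − p)/m, which can be inverted to estimate the original frequency π_i. This uniform perturbation scheme, the parameter p = 0.7, and all function names are assumptions chosen for illustration, not the distortion parameters derived in the dissertation.

```python
import random


def randomize(value, categories, p, rng):
    """Keep the true value with probability p, else report a uniform random category."""
    if rng.random() < p:
        return value
    return rng.choice(categories)


def observed_distribution(true_dist, p):
    """Expected category frequencies after randomization: p * pi_i + (1 - p) / m."""
    m = len(true_dist)
    return [p * pi + (1 - p) / m for pi in true_dist]


def reconstruct_distribution(observed, p):
    """Invert the perturbation to estimate the original category frequencies."""
    m = len(observed)
    return [(lam - (1 - p) / m) / p for lam in observed]


categories = ["a", "b", "c"]
true_dist = [0.6, 0.3, 0.1]  # assumed original distribution
p = 0.7                      # assumed retention probability

rng = random.Random(0)
true_values = rng.choices(categories, weights=true_dist, k=20000)
reported = [randomize(v, categories, p, rng) for v in true_values]
observed = [reported.count(c) / len(reported) for c in categories]
estimate = reconstruct_distribution(observed, p)
print([round(x, 3) for x in estimate])
```

    A smaller p gives individuals more deniability but amplifies sampling noise in the reconstruction by a factor of 1/p, which is the privacy and utility trade-off the abstract refers to.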