490 research outputs found

Exploiting Innocuous Activity for Correlating Users Across Sites

Author: Friedland Gerald
Goga Oana
Lei Howard
Parthasarathi Sree Hari Krishnan
Sommer Robin
Teixeira Renata
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

We study how potential attackers can identify accounts on different social network sites that all belong to the same user, exploiting only innocuous activity that inherently comes with posted content. We examine three specific features on Yelp, Flickr, and Twitter: the geo-location attached to a user's posts, the timestamp of posts, and the user's writing style as captured by language models. We show that, among these three features, the location of posts is the most powerful for identifying accounts that belong to the same user on different sites. When we combine all three features, the accuracy of identifying Twitter accounts that belong to a set of Flickr users is comparable to that of existing attacks that exploit usernames. When we instead correlate Yelp and Twitter, our attack can identify 37% more accounts than using usernames. Our results have significant privacy implications, as they present a novel class of attacks that exploit users' tendency to assume that, if they maintain different personas with different names, the accounts cannot be linked together; we show instead that the posts themselves can provide enough information to correlate the accounts.
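The location feature described in this abstract can be illustrated with a minimal sketch. The abstract does not say how locations are compared, so everything here is an assumption made for illustration: posts are bucketed into coarse grid cells by rounding coordinates, accounts are compared by cosine similarity of their cell histograms, and the names `location_profile`, `cosine_similarity`, and `best_match` are hypothetical.

```python
from collections import Counter
from math import sqrt


def location_profile(posts, precision=2):
    """Histogram of coarse grid cells built from (lat, lon) pairs.

    Rounding to `precision` decimals is an assumed bucketing scheme,
    not the method used in the paper.
    """
    return Counter((round(lat, precision), round(lon, precision)) for lat, lon in posts)


def cosine_similarity(a, b):
    """Cosine similarity between two cell histograms (0.0 if either is empty)."""
    dot = sum(a[cell] * b[cell] for cell in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(target_posts, candidates):
    """Return the candidate account whose location histogram is most similar."""
    target = location_profile(target_posts)
    scores = {
        name: cosine_similarity(target, location_profile(posts))
        for name, posts in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical data: one account on site A, two candidate accounts on site B.
site_a_posts = [(37.87, -122.27), (37.871, -122.269), (40.71, -74.0)]
site_b_candidates = {
    "alice": [(37.872, -122.268), (40.712, -74.001)],
    "bob": [(48.85, 2.35)],
}
name, scores = best_match(site_a_posts, site_b_candidates)
```

A real attack would also need the timestamp and writing-style features mentioned above, plus a threshold for deciding when no candidate matches; this sketch only ranks candidates by location overlap.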

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Web Data Extraction, Applications and Techniques: A Survey

Author: Ferrara Emilio
De Meo Pasquale
Fiumara Giacomo
Baumgartner Robert
Publication venue: 'Elsevier BV'
Publication date: 09/06/2014

Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed for a given domain in other domains. Comment: Knowledge-based System

arXiv.org e-Print Archive

Crossref

Re-Identification Attacks – A Systematic Literature Review

Author: Henriksen-Bulmer Jane
Jeary Sherry
Publication venue: 'Elsevier BV'
Publication date: 01/12/2016

The publication of increasing amounts of anonymised open source data has resulted in a worryingly rising number of successful re-identification attacks. This has a number of privacy and security implications at both an individual and a corporate level. This paper uses a Systematic Literature Review to investigate the depth and extent of this problem as reported in peer-reviewed literature. Using a detailed protocol, seven research portals were explored and 10,873 database entries were searched, from which a subset of 220 papers was selected for further review. Of these, 55 papers were judged to be within scope and included in the final review. The main review findings are that 72.7% of all successful re-identification attacks have taken place since 2009. Most attacks use multiple datasets. The majority of them have taken place on global datasets such as social networking data, and have been conducted by US-based researchers. Furthermore, the number of datasets can be used as an attribute. Because privacy breaches have security, policy and legal implications (e.g. data protection, Safe Harbor etc.), the work highlights the need for new and improved anonymisation techniques or, indeed, a fresh approach to open source publishing.

Crossref

Bournemouth University Research Online

An Empirical Study on Android for Saving Non-shared Data on Public Storage

Author: Diao Wenrui
Li Zhou
Liu Xiangyu
Zhang Kehuan
Zhou Zhe
Publication date: 21/07/2014

With millions of apps that can be downloaded from official or third-party markets, Android has become one of the most popular mobile platforms today. These apps help people in all kinds of ways and thus have access to large amounts of user data, which in general fall into three categories: sensitive data, data to be shared with other apps, and non-sensitive data not to be shared with others. For the first and second types of data, Android provides very good storage models: an app's private sensitive data are saved to its private folder, which can only be accessed by the app itself, and the data to be shared are saved to public storage (either the external SD card or the emulated SD card area on internal flash memory). But for the last type, i.e., an app's non-sensitive and non-shared data, there is a big problem in Android's current storage model, which essentially encourages an app to save its non-sensitive data to shared public storage that can be accessed by other apps. At first glance this seems harmless, as those data are non-sensitive after all, but it implicitly assumes that app developers can correctly identify all sensitive data and prevent all possible information leakage from private-but-non-sensitive data. In this paper, we demonstrate that this assumption is invalid, through a thorough survey of information leaks in apps that followed Android's recommended storage model for non-sensitive data. Our studies show that highly sensitive information from billions of users can be easily obtained by exploiting this problematic storage model. Although our empirical studies are based on a limited set of apps, the identified problems are not isolated or accidental bugs of the apps investigated. On the contrary, the problem is rooted in the vulnerable storage model recommended by Android. To mitigate the threat, we also propose a defense framework.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Potential mass surveillance and privacy violations in proximity-based social applications

Author: Forné Jordi
Puglisi Silvia
Rebollo-Monedero David
Publication date: 01/01/2016

Proximity-based social applications let users interact with people who are currently close to them, by revealing some information about their preferences and whereabouts. This information is acquired through passive geo-localisation and used to build a sense of serendipitous discovery of people, places and interests. Unfortunately, while this class of applications opens up different interaction possibilities for people in urban settings, obtaining access to certain identity information could allow a privacy attacker to identify a user and follow their movements over a specific period of time. The same information shared through the platform could also help an attacker link the victim's online profiles to physical identities. We analyse a set of popular dating applications that share users' relative distances within a certain radius and show how, by using the information shared on these platforms, it is possible to formalise a multilateration attack able to identify the user's actual position. The same attack can also be used to follow a user in all their movements within a certain period of time, thereby identifying their habits and Points of Interest across the city. Furthermore, we introduce a social attack which uses common Facebook likes to profile a person and finally identify their real identity.
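The multilateration idea mentioned in this abstract can be sketched as follows. This is a generic textbook formulation, not the authors' implementation: it assumes the attacker queries the application from several known positions, that positions are expressed in planar coordinates (for example metres in a local projection), and that reported distances are exact. The function name `multilaterate` is hypothetical.

```python
from math import hypot


def multilaterate(anchors, distances):
    """Estimate a 2D position from known anchor points and distances to them.

    Subtracting the first circle equation from the others gives linear
    equations in (x, y), solved here by least squares via the 2x2 normal
    equations. Needs at least three non-collinear anchors.
    """
    if len(anchors) < 3 or len(anchors) != len(distances):
        raise ValueError("need at least three anchors with matching distances")
    (x0, y0), d0 = anchors[0], distances[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2 * (xi - x0), 2 * (yi - y0)))
        rhs.append(d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2)
    # Normal equations (A^T A) p = A^T b for the unknown position p = (x, y).
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not determined")
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det


# Hypothetical scenario: the victim is at (3, 4); the attacker reads the
# reported distance from four known vantage points.
victim = (3.0, 4.0)
vantage_points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
reported = [hypot(victim[0] - x, victim[1] - y) for x, y in vantage_points]
estimate = multilaterate(vantage_points, reported)
```

Applications that round or truncate reported distances would make the system inconsistent; the least-squares form still returns an estimate in that case, but its error then depends on the rounding granularity.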

arXiv.org e-Print Archive

UPCommons. Portal del coneixement obert de la UPC

Corrélation des profils d'utilisateurs dans les réseaux sociaux : méthodes et applications

Author: Goga Oana
Publication venue: HAL CCSD
Publication date: 21/05/2014

The proliferation of social networks and all the personal data that people share brings many opportunities for developing exciting new applications. At the same time, however, the availability of vast amounts of personal data raises privacy and security concerns. In this thesis, we develop methods to identify the social network accounts of a given user. We first study how we can exploit the public profiles users maintain in different social networks to match their accounts. We identify four important properties, Availability, Consistency, non-Impersonability, and Discriminability (ACID), to evaluate the quality of different profile attributes for matching accounts. Exploiting public profiles has good potential for matching accounts because a large number of users have the same names and other personal information across different social networks. Yet, it remains challenging to achieve practically useful matching accuracy due to the scale of real social networks. To demonstrate that matching accounts in real social networks is feasible and reliable enough to be used in practice, we focus on designing matching schemes that achieve low error rates even when applied in large-scale networks with hundreds of millions of users. Then, we show that we can still match accounts across social networks even if we only exploit what users post, i.e., their activity on a social network. This demonstrates that, even if users are privacy conscious and maintain distinct profiles on different social networks, we can still potentially match their accounts. Finally, we show that, by identifying accounts that correspond to the same person inside a social network, we can detect impersonators.

Thèses en Ligne

Theses.fr

On the anonymity risk of time-varying user profiles.

Author: Forné Muñoz Jorge
Puglisi Silvia
Rebollo Monedero David
Publication venue: 'MDPI AG'
Publication date: 01/01/2017

Websites and applications use personalisation services to profile their users, collect their patterns and activities and eventually use this data to provide tailored suggestions. User preferences and social interactions are therefore aggregated and analysed. Every time a user publishes a new post or creates a link with another entity, either another user or some online resource, new information is added to the user profile. Exposing private data not only reveals information about single users' preferences, increasing their privacy risk, but can also expose more about their network than single actors intended. This mechanism is self-evident in social networks, where users receive suggestions based on their friends' activities. We propose an information-theoretic approach to measure the differential update of the anonymity risk of time-varying user profiles. This expresses how privacy is affected when new content is posted and how much third-party services get to know about the users when a new activity is shared. We use actual Facebook data to show how our model can be applied to a real-world scenario.
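One way to picture a "differential update" of anonymity risk is sketched below. The abstract does not state which information-theoretic measure is used, so this sketch assumes, purely for illustration, that a profile is a histogram of post categories and that risk is the Kullback-Leibler divergence (in bits) between the user's profile and a population-wide reference distribution. The names `normalise`, `kl_divergence`, and `risk_update` are hypothetical.

```python
from math import log2


def normalise(counts):
    """Turn category counts into a probability distribution."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; q must cover every category in p."""
    return sum(pi * log2(pi / q[category]) for category, pi in p.items() if pi > 0)


def risk_update(counts, new_category, population):
    """Change in the assumed risk measure after one new post in `new_category`."""
    before = kl_divergence(normalise(counts), population)
    updated = dict(counts)
    updated[new_category] = updated.get(new_category, 0) + 1
    after = kl_divergence(normalise(updated), population)
    return after - before


# Hypothetical population profile and two hypothetical users.
population = {"sports": 1 / 3, "music": 1 / 3, "news": 1 / 3}
typical_user = {"sports": 1, "music": 1, "news": 1}
skewed_user = {"sports": 2, "music": 1, "news": 1}

# Posting about an already over-represented interest moves the profile away
# from the population, so the assumed risk rises.
delta_up = risk_update(typical_user, "sports", population)
# Posting about an under-represented interest moves it back, so it falls.
delta_down = risk_update(skewed_user, "music", population)
```

Under this assumed measure, a profile identical to the population average carries zero risk, and each new post shifts the risk up or down depending on whether it makes the user more or less distinguishable from that average.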

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Directory of Open Access Journals