10 research outputs found

    All liaisons are dangerous when all your friends are known to us

    Online Social Networks (OSNs) are used by millions of users worldwide. Academically speaking, there is little doubt about the usefulness of demographic studies conducted on OSNs and, hence, methods to label unknown users from small labeled samples are very useful. However, from the general public's point of view, this can be a serious privacy concern. Both topics are tackled in this paper. First, a new algorithm to perform user profiling in social networks is described, and its performance is reported and discussed. Second, the experiments, conducted on information usually considered sensitive, reveal that merely publicizing one's contacts puts privacy at risk; measures to minimize privacy leaks due to social graph data mining are therefore outlined. Comment: 10 pages, 5 tables
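
The abstract above does not specify its profiling algorithm, so the following is only a minimal sketch of the general idea of labeling unknown users from a small labeled sample of their contacts. It uses a plain majority vote over labeled neighbors, which is an assumed baseline and not the paper's method; the function name, the toy graph and the labels "A" and "B" are all hypothetical.

```python
from collections import Counter


def label_by_neighbors(graph, known_labels):
    """Give each unlabeled user the most common label among their labeled contacts.

    graph: dict mapping a user to the list of that user's contacts.
    known_labels: dict mapping some users to a label (the small labeled sample).
    Users with no labeled contacts are left without a prediction.
    """
    predicted = {}
    for user, contacts in graph.items():
        if user in known_labels:
            continue  # already labeled, nothing to infer
        votes = Counter(known_labels[c] for c in contacts if c in known_labels)
        if votes:
            predicted[user] = votes.most_common(1)[0][0]
    return predicted


# Hypothetical toy network: "ana" has three labeled contacts, "eve" has none.
graph = {
    "ana": ["bob", "cy", "dee"],
    "bob": ["ana"],
    "cy": ["ana"],
    "dee": ["ana"],
    "eve": [],
}
known = {"bob": "A", "cy": "A", "dee": "B"}
predictions = label_by_neighbors(graph, known)
```

Even this naive vote illustrates the privacy point: "ana" receives a label purely from what her contacts disclose, while "eve", who exposes no contacts, cannot be labeled.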

    Re-Identification Attacks – A Systematic Literature Review

    The publication of increasing amounts of anonymised open source data has resulted in a worryingly rising number of successful re-identification attacks. This has a number of privacy and security implications at both the individual and the corporate level. This paper uses a Systematic Literature Review to investigate the depth and extent of this problem as reported in peer-reviewed literature. Using a detailed protocol, seven research portals were explored and 10,873 database entries were searched, from which a subset of 220 papers was selected for further review. From this total, 55 papers were selected as being within scope and were included in the final review. The main review findings are that 72.7% of all successful re-identification attacks have taken place since 2009. Most attacks use multiple datasets. The majority of them have taken place on global datasets such as social networking data, and have been conducted by US-based researchers. Furthermore, the number of datasets can be used as an attribute. Because privacy breaches have security, policy and legal implications (e.g. data protection, Safe Harbor etc.), the work highlights the need for new and improved anonymisation techniques or, indeed, a fresh approach to open source publishing.

    Online Privacy as a Collective Phenomenon

    The problem of online privacy is often reduced to individual decisions to hide or reveal personal information in online social networks (OSNs). However, with the increasing use of OSNs, it becomes more important to understand the role of the social network in disclosing personal information that a user has not revealed voluntarily: How much of our private information do our friends disclose about us, and how much of our privacy is lost simply because of online social interaction? Without strong technical effort, an OSN may be able to exploit the assortativity of human private features, thereby constructing shadow profiles with information that users chose not to share. Furthermore, because many users share their phone and email contact lists, an OSN can create full shadow profiles for people who do not even have an account on it. We empirically test the feasibility of constructing shadow profiles of sexual orientation for users and non-users, using data from more than 3 million accounts of a single OSN. We quantify a lower bound for the predictive power derived from the social network of a user, to demonstrate how the predictability of sexual orientation increases with the size of this network and the tendency to share personal information. This allows us to define a privacy leak factor that links individual privacy loss with the decision of other individuals to disclose information. Our statistical analysis reveals that some individuals are at a higher risk of privacy loss, as prediction accuracy increases for users with a larger and more homogeneous first- and second-order neighborhood of their social network. While we do not provide evidence that shadow profiles exist at all, our results show that the disclosure of private information is not restricted to an individual choice, but becomes a collective decision that has implications for policy and privacy regulation.

    Elecciones Europeas 2014: viralidad de los mensajes en Twitter (European Elections 2014: virality of messages on Twitter)

    Since the Catalan elections of 2010, when a critical mass of Spanish users was reached on Twitter, this social network has played an important role in disseminating messages in every electoral campaign held in Spain. The objective of this research is to shed light on citizens' participation in, and receptivity to, the messages posted on Twitter during the campaign for the 2014 European elections. We studied the exogenous conversation, held outside the organizations of the parties and candidates to the European Parliament, together with the endogenous conversation of the competing political forces. In both cases we analyzed the publication patterns, the topics discussed, the spread of the messages and the profiles of the users who participated. The endogenous context followed a communication plan, with active participation focused on the campaign, the electoral system and corruption; its opinion leaders were the candidates, the parties and, to a lesser extent, politicians. In contrast, the exogenous context was spontaneous, with little participation except around debates or controversies involving the candidates, more sensitive to controversy and more active in disparaging remarks; it was led by ordinary citizens. The media and journalists did not lead either of the environments studied.

    Event-Based User Classification in Weibo Media


    Exploiting Innocuous Activity for Correlating Users Across Sites

    We study how potential attackers can identify accounts on different social network sites that all belong to the same user, exploiting only innocuous activity that inherently comes with posted content. We examine three specific features on Yelp, Flickr, and Twitter: the geo-location attached to a user's posts, the timestamp of posts, and the user's writing style as captured by language models. We show that among these three features the location of posts is the most powerful feature to identify accounts that belong to the same user in different sites. When we combine all three features, the accuracy of identifying Twitter accounts that belong to a set of Flickr users is comparable to that of existing attacks that exploit usernames. Our attack can identify 37% more accounts than using usernames when we instead correlate Yelp and Twitter. Our results have significant privacy implications as they present a novel class of attacks that exploit users' tendency to assume that, if they maintain different personas with different names, the accounts cannot be linked together; whereas we show that the posts themselves can provide enough information to correlate the accounts.
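
To make the location feature concrete, here is a minimal sketch of how accounts on two sites might be correlated from post coordinates alone. It is not the authors' attack: the grid size, the cosine-similarity scoring, the account names and the coordinates are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cell(lat, lon, size=0.1):
    """Map a coordinate to a coarse grid cell (size in degrees, an assumed value)."""
    return (math.floor(lat / size), math.floor(lon / size))


def profile(posts, size=0.1):
    """Count how many posts fall in each grid cell."""
    return Counter(cell(lat, lon, size) for lat, lon in posts)


def cosine(p, q):
    """Cosine similarity between two cell-count profiles (0.0 if either is empty)."""
    dot = sum(p[k] * q[k] for k in p if k in q)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def best_match(target_posts, candidates):
    """Return the candidate account whose location profile is closest to the target's."""
    target = profile(target_posts)
    return max(candidates, key=lambda name: cosine(target, profile(candidates[name])))


# Hypothetical data: one account on site A, two candidate accounts on site B.
site_a_posts = [(40.41, -3.73), (40.43, -3.74), (40.48, -3.68)]
site_b_accounts = {
    "tw_a": [(40.42, -3.72), (40.44, -3.75)],
    "tw_b": [(48.85, 2.35), (48.86, 2.34)],
}
match = best_match(site_a_posts, site_b_accounts)
```

No usernames are compared at any point; the match comes entirely from where the posts were made, which is the kind of innocuous signal the abstract describes.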

    Caracterización de usuarios y propagación de mensajes en Twitter en el entorno de temas sociales (Characterization of users and propagation of messages on Twitter in the context of social issues)

    The Web, which was born in a spirit of collaboration and freedom of information, in contrast to the prevailing model of competition and property rights, has evolved in a way nobody could have imagined. Today society is strongly connected and shares a great deal of information, yet it does so not in a distributed way, as was expected, but in a centralized way through platforms known as social networks. The positive side of this concentration is the ease with which social interaction data can be gathered. Of all the social networks, Twitter stands out for its openness, both in its content and in access to its data through APIs, and although the complete flow of its data is not freely available, it is today the most important source of social data available to researchers on the Internet. This thesis analyzes the propagation of messages on Twitter about social issues and the role that people play in their spread. The approach is an empirical analysis of eighteen case studies of different dimensions, durations and contexts. To carry out this research I designed the T-hoarder platform, which captures the messages posted by Twitter users, analyzes them and visualizes the results, making it possible to detect the most viral moments and the most prominent users. The platform processes data in parts and integrates the results afterwards, thanks to which it has run continuously for more than four years without scalability problems. With it I have been able to observe more than forty cases related to events of social impact, social movements, elections in Spain, trends on Twitter and the relationship between Twitter and television.
    Based on observations in successive experiments and through a process of refinement, I established the classification of users presented in this thesis. The classification is validated with different metrics under which the grouping of user types is coherent. In addition, I defined the attributes Reach, Diffusion, Participation, Incorporation and Automation and measured them every hour for each case. The correlations between these attributes, except Automation, and the number of tweets posted in each time interval are very high in most cases. Macroscopically, I found a bubble of activity in all cases, in which 80% of the messages spread were posted by a minority and the users responsible for 80% of the propagation formed small groups. Analyzing year by year the case studies lasting more than two years, I found that the percentage of retweets increases each year while the size of the groups producing them decreases. One meritocratic trait uncovered is that a user's capacity to propagate messages does not depend on the structure of his or her network.
    Programa Oficial de Doctorado en Ingeniería Telemática (Official Doctoral Program in Telematics Engineering). President: Carlos García Rubio. President: Rafael Rubio Núñez. Member: Daniel Gayo Avel
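
The concentration finding in the thesis abstract above (80% of messages coming from a minority of users) can be checked on any per-user message count with a few lines of code. The sketch below is an assumed illustration, not part of T-hoarder; the function name and the toy counts are hypothetical.

```python
def share_of_users_for(counts, fraction=0.8):
    """Smallest share of users whose messages add up to `fraction` of the total.

    counts: dict mapping a user to the number of messages that user posted.
    Users are taken from most to least active until the target is reached.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    running = 0
    for rank, n in enumerate(sorted(counts.values(), reverse=True), start=1):
        running += n
        if running >= fraction * total:
            return rank / len(counts)
    return 1.0


# Hypothetical counts for ten users, 100 messages in total.
tweets_per_user = {
    "u1": 50, "u2": 32, "u3": 5, "u4": 4, "u5": 3,
    "u6": 2, "u7": 1, "u8": 1, "u9": 1, "u10": 1,
}
share = share_of_users_for(tweets_per_user)
```

In this toy data the two most active users post 82 of the 100 messages, so a fifth of the users accounts for at least 80% of the activity.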

    Exploiting Human Factors in User Authentication

    An overarching issue in security is the human factor, and dealing with it is perhaps one of the biggest challenges we face today. The human factor is often described as the weakest part of a security system, and users are often described as the weakest link in the security chain. In this thesis, we focus on two problems caused by human factors in user authentication and propose respective solutions: a) the secrecy information inference attack, in which publicly available information is used to infer secrecy information about the user; and b) the coercion attack, in which an attacker forces a user to hand over his/her secret information, such as account details and password. In the secrecy information inference attack, an attacker can use publicly available data to infer secrecy information about a victim. We should therefore be prudent in choosing any information as secrecy information in user authentication. In this work, we exploit public data extracted from Facebook to infer users' interests. Such interests can also be found on their profile pages, but such pages are often private. Our experiments, conducted on more than 34,000 public pages collected from Facebook, show that our inference technique can infer interests that are often hidden by users with moderate accuracy. Using the inferred interests, we also demonstrate a secrecy information inference attack that breaks BlueMoon™, a preference-based backup authentication system. To mitigate the effect of the secrecy information inference attack, we propose a new authentication mechanism based on the user's cellphone usage data, which is often private. The system generates memorable and dynamic fingerprints which can be used to create authentication challenges. In particular, we explore whether the generated behavioral fingerprints are memorable enough for end users to use them as authentication credentials.
We demonstrate the application of memorable fingerprints by designing an authentication application on top of them. We conducted an extensive user study that involved collecting about one month of continuous usage data from 58 Symbian and Android smartphone users. Results show that the generated fingerprints are remembered by the user to some extent and that they were moderately secure against attacks, even by family members and close friends. The second problem on which we focus in this thesis is human vulnerability to coercion attacks. In such attacks, the user is forced by an attacker to reveal the secret/key that gives access to the system. Most authentication mechanisms today are vulnerable to coercion attacks. We present a novel approach to generating cryptographic keys to fight against coercion attacks. Our technique incorporates a measure of the user's emotional status, based on skin conductance (which changes when the user is under coercion), into the key generation process. A preliminary user study with 39 subjects shows that our approach has moderate false acceptance and false rejection rates. Furthermore, to meet the demand for scalability and usability, many real-world authentication systems have adopted the idea of responsibility shifting, where a user's responsibility for authentication is shifted to another entity, usually in case of failure of the primary authentication method. In a responsibility shifting authentication scenario, a human helper who is involved in regaining access is vulnerable to coercion attacks. In this work, we report a user study with 29 participants which investigates the helper's emotional status when being coerced to assist in an attack. Results show that coercion causes involuntary skin conductance fluctuations in the helper, which indicate that he/she is nervous and stressed.
The results from the two studies show that using skin conductance is a viable approach to fighting coercion attacks in user authentication.