222 research outputs found

    Protecting attributes and contents in online social networks

    With the extreme popularity of online social networks, security and privacy issues have become critical. In particular, it is important to protect users' privacy without preventing them from socializing normally. User privacy in the context of data publishing and structural re-identification attacks has been well studied. However, the protection of attributes and data content has been mostly neglected by the research community. While social network data is rarely published, billions of messages are shared in various social networks on a daily basis. Therefore, it is more important to protect attributes and textual content in social networks. We first study the vulnerabilities of user attributes and content, in particular the identifiability of users when the adversary learns a small piece of information about the target. We present two attribute re-identification attacks that exploit information retrieval and web search techniques. We show that a large portion of users with an online presence are highly identifiable, even from a small piece of seed information, and even when that seed information is inaccurate. To protect user attributes and content, we adopt the social circle model derived from the concepts of "privacy as user perception" and "information boundary". Users have different social circles and share different information in different circles. We introduce a social circle discovery approach using multi-view clustering. We present our observations on the key features of social circles, including friendship links, content similarity and social interactions. We treat each feature as one view and propose a one-side co-trained spectral clustering technique, which is tailored to the sparse nature of our data. We also propose two evaluation measures. One is based on the quantitative measure of similarity ratio, while the other employs human evaluators to examine pairs of users, who are selected by the max-risk active evaluation approach. We evaluate our approach on ego networks of Twitter users and present our clustering results. We also compare our proposed clustering technique with single-view clustering and the original co-trained spectral clustering technique. Our results show that multi-view clustering is more accurate for social circle detection, and that our proposed approach achieves a significantly higher similarity ratio than the original multi-view clustering approach. In addition, we build a proof-of-concept implementation of automatic circle detection and recommendation methods. For a given user, the system returns the circle detection result from our proposed multi-view clustering technique, together with the keywords for each circle. Users can also enter a message they want to post, and the system will suggest which circle to disseminate the message to.

    The structure and dynamics of multilayer networks

    In recent years, network theory has successfully characterized the interaction among the constituents of a variety of complex systems, ranging from biological to technological and social systems. However, until recently, attention was given almost exclusively to networks in which all components were treated on an equal footing, neglecting all the extra information about the temporal or context-related properties of the interactions under study. Only in the last few years, taking advantage of the enhanced resolution of real data sets, have network scientists directed their interest to the multiplex character of real-world systems and explicitly considered the time-varying and multilayer nature of networks. We offer here a comprehensive review of both the structural and dynamical organization of graphs made of diverse relationships (layers) between their constituents, and cover several relevant issues, from a full redefinition of the basic structural measures to understanding how the multilayer nature of the network affects processes and dynamics. Comment: In Press, Accepted Manuscript, Physics Reports 201

    Improving Search Engine Results by Query Extension and Categorization

    Since its emergence, the Internet has changed the way in which information is distributed, and it has strongly influenced how people communicate. Nowadays, Web search engines are widely used to locate information on the Web, and online social networks have become pervasive platforms of communication. Retrieving relevant Web pages in response to a query is not an easy task for Web search engines, due to the enormous corpus of data that the Web stores and the inherent ambiguity of search queries. We present two approaches to improve the effectiveness of Web search engines. The first approach allows us to retrieve more Web pages relevant to a user's query by extending the query to include synonyms and other variations. The second gives us the ability to retrieve Web pages that more precisely reflect the user's intentions by filtering out those pages which are not related to the user-specified interests. Discovering communities in online social networks (OSNs) has attracted much attention in recent years. We introduce the concept of subject-driven communities and propose to discover such communities by modeling a community using a posting/commenting interaction graph which is relevant to a given subject of interest, and then applying link analysis on the interaction graph to locate the core members of a community.

    Scalable Algorithms for Community Detection in Very Large Graphs


    Ego-centred models of social networks: the social atom

    International Mention in the doctoral degree. This thesis set out to contribute to the realm of social physics, with a particular focus on human social networks. Our approach, however, is somewhat different from what is typical in disciplines such as complex systems or statistical physics. Rather than simplifying the features of the constituents of our system (people), and stressing their rules of interaction, we focus on better understanding those very same constituents, modelling them as social atoms. Our rationale is that a better understanding of such an atom may shed light on how (and why) it interacts with other atoms to form social collectives. Given its robustness and the evolutionary roots of its premises, we use the Social Brain Hypothesis as our departure point. This theory states that the evolutionary drive behind the development of large brains in humans was the need to process social information, and that the limited capacity of our brains imposes a limit on the number of relationships we can manage: the so-called “Dunbar’s number”, roughly 150. Moreover, evidence keeps revealing that these relationships are further organised in a series of hierarchically inclusive layers with decreasing emotional intensity, whose sizes exhibit a more or less constant scaling. Notwithstanding the empirical evidence, neither the presence of scaling in the organisation of personal networks nor its connection with limited cognitive skills had been explained so far. In Chapter 2 we present a mathematical model that solves this puzzle. The assumptions of the model are quite simple and well founded on empirical evidence. Firstly, the number of relationships we maintain tends to be stable on average. Secondly, these relationships are costly, and our resources are limited. With these two premises, our results show that the hierarchical organisation emerges naturally from the principle of maximum entropy.
Not only that, but we also predict a hitherto unnoticed regime of organisation whose existence we prove using several datasets from communities of immigrants. The former model considers that relationships can only belong to a discrete set of categories (layers). In Chapter 3 we extend it so that relationships are classified in a continuum. This modification allows us to test the model with data from very different sources such as online communications, face-to-face contacts, and phone calls. Our results show that the two regimes of organisation found in the previous model persist in this variant, and reveal the underlying existence of a (universal) scaling parameter which does not depend on any particular number of layers. To incorporate these ideas into socio-centric models, we build on the so-called Structural Balance Theory. This theory, underpinned by psychological motivations, posits that the structures of social networks of positive and negative relationships are highly interdependent. However, the theory has received little empirical validation, and negative social relationships are poorly understood, both from an ego-centric and from a socio-centric perspective. For that reason, we turn to developing experimental software in order to gather data within a school. In Chapters 4 and 5 we present results from these experiments. In Chapter 4 we analyse the socio-centric networks using machine learning techniques and find that the structures of positive and negative networks are indeed very much connected. Besides, we study the two types of networks separately, showing that they exhibit quite distinct features and that gender effects in negative social networks are weak and asymmetrical for boys and girls. In Chapter 5, on the other hand, we focus on the structure of negative personal networks.
Remarkably, using data from two different experimental settings, we show that the structure of personal networks of negative relationships mirrors that of the positive ones and exhibits a similar scaling, although their size is significantly smaller. Chapter 6 summarises our results and presents future (and current) lines of investigation. Among them, we outline a model of a social fluid that uses the insights gained with this thesis to build a model of social collectives as ensembles of personal networks. This model is compatible, at the micro-level, with the observations of the social brain hypothesis, and, at the macro-level, with the premises of the structural balance theory. This thesis would not have been possible without the support of Fundación BBVA through its 2016 call project “Los números de Dunbar y la estructura de las sociedades digitales: modelización y simulación (DUNDIG)”, and we are very thankful for it. Support for early stages of this work through projects IBSEN (European Commission, H2020 FET Open RIA 662725) and VARIANCE (Ministerio de Economía y Competitividad/FEDER, project no. FIS2015-64349-P) is also acknowledged. Official Doctoral Programme in Mathematical Engineering, Universidad Carlos III de Madrid. President: Javier Martín Buldú. Secretary: José Luis Molina González. Member: Roberta Sinatr

    Statistical Analysis of Networks

    This book is a general introduction to the statistical analysis of networks, and can serve both as a research monograph and as a textbook. Numerous fundamental tools and concepts needed for the analysis of networks are presented, such as network modeling, community detection, graph-based semi-supervised learning and sampling in networks. The description of these concepts is self-contained, with both theoretical justifications and applications provided for the presented algorithms. Researchers, including postgraduate students, working in the area of network science, complex network analysis, or social network analysis, will find up-to-date statistical methods relevant to their research tasks. This book can also serve as textbook material for courses related to the statistical approach to the analysis of complex networks. In general, the chapters are fairly independent and self-supporting, and the book could be used for course composition “à la carte”. Nevertheless, Chapter 2 is needed to a certain degree for all parts of the book. It is also recommended, though not strictly necessary, to read Chapter 4 before Chapters 5 and 6. Reading Chapter 3 can also be helpful before reading Chapters 5 and 7. As prerequisites for reading this book, a basic knowledge of probability, linear algebra and elementary notions of graph theory is advised. Appendices describing the required notions from these disciplines have been added to help readers gain further understanding.

    Influence maximization under limited network information: Seeding high-degree neighbors

    The diffusion of information, norms, and practices across a social network can be initiated by compelling a small number of seed individuals to adopt first. Strategies proposed in previous work either assume full network information or a large degree of control over what information is collected. However, privacy settings on the Internet and high non-response in surveys often severely limit available connectivity information. Here we propose a seeding strategy for scenarios with limited network information: only the degrees and connections of some random nodes are known. This new strategy is a modification of ‘random neighbor sampling’ (or ‘one-hop’) and seeds the highest-degree neighbors of randomly selected nodes. Simulating a fractional threshold model, we find that this new strategy excels in networks with heavy-tailed degree distributions, such as scale-free networks and large online social networks. It outperforms the conventional one-hop strategy, even when the latter can seed 50% more nodes, as well as other seeding possibilities, including pure high-degree seeding and clustered seeding.
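
The seeding idea in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tie-breaking rule, the toy star graph, and the threshold value 0.5 are all hypothetical choices made for the example.

```python
import random


def seed_high_degree_neighbors(adj, num_seeds, rng):
    """Visit nodes in random order and seed each one's highest-degree neighbor."""
    nodes = [n for n in adj if adj[n]]  # skip isolated nodes
    rng.shuffle(nodes)
    seeds = set()
    for node in nodes:
        if len(seeds) >= num_seeds:
            break
        # Assumption: ties in degree are broken by the larger node id.
        best = max(adj[node], key=lambda v: (len(adj[v]), v))
        seeds.add(best)
    return seeds


def fractional_threshold_spread(adj, seeds, threshold):
    """A node adopts once at least `threshold` of its neighbors have adopted."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in adj.items():
            if node in adopted or not nbrs:
                continue
            frac = sum(1 for v in nbrs if v in adopted) / len(nbrs)
            if frac >= threshold:
                adopted.add(node)
                changed = True
    return adopted


# Toy star graph (hypothetical data): hub 0 connected to leaves 1..5.
adj = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
seeds = seed_high_degree_neighbors(adj, num_seeds=1, rng=random.Random(0))
adopted = fractional_threshold_spread(adj, seeds, threshold=0.5)
```

On a star graph, any randomly chosen leaf points to the hub as its highest-degree neighbor, which is the intuition behind why neighbor-based seeding can do well on heavy-tailed degree distributions.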

    Influence maximization under limited network information : seeding high-degree neighbors

    Published online: 28 October 2022. The diffusion of information, norms, and practices across a social network can be initiated by compelling a small number of seed individuals to adopt first. Strategies proposed in previous work either assume full network information or a large degree of control over what information is collected. However, privacy settings on the Internet and high non-response in surveys often severely limit available connectivity information. Here we propose a seeding strategy for scenarios with limited network information: only the degrees and connections of some random nodes are known. This new strategy is a modification of 'random neighbor sampling' (or 'one-hop') and seeds the highest-degree neighbors of randomly selected nodes. Simulating a fractional threshold model, we find that this new strategy excels in networks with heavy-tailed degree distributions, such as scale-free networks and large online social networks. It outperforms the conventional one-hop strategy, even when the latter can seed 50% more nodes, as well as other seeding possibilities, including pure high-degree seeding and clustered seeding.

    Dynamic Credibility Threshold Assignment in Trust and Reputation Mechanisms Using PID Controller

    In online shopping, buyers do not have enough information about sellers and cannot inspect the products before purchasing them. To help buyers find reliable sellers, online marketplaces deploy Trust and Reputation Management (TRM) systems. These systems aggregate buyers’ feedback about the sellers they have interacted with and about the products they have purchased, to inform users within the marketplace about the sellers and products before they make purchases. Positive customer feedback has thus become a valuable asset that helps each seller attract more business. This naturally creates incentives for cheating by introducing fake positive feedback. Therefore, an important responsibility of TRM systems is to help buyers find genuine feedback (reviews) about different sellers. Recent TRM systems achieve this goal by selecting and assigning credible advisers to any new customer/buyer. These advisers are selected from among the buyers who have had experience with a number of sellers and have provided feedback on their services and goods. As people differ in their tastes, the buyer feedback that would be most useful should come from advisers with similar tastes and values. In addition, the advisers should be honest, i.e. provide truthful reviews and ratings, and not malicious, i.e. not collude with sellers to favour them or with other buyers to badmouth some sellers. Defining the boundary between dishonest and honest advisers is very important. However, there is currently no systematic approach for setting the honesty threshold which divides benevolent advisers from malicious ones. The thesis addresses this problem and proposes a market-adaptive honesty threshold management mechanism. In this mechanism, the TRM system forms a feedback system which monitors the current status of the e-marketplace. According to the status of the e-marketplace, the feedback system improves performance using a PID controller from the field of control systems. The responsibility of this controller is to set a suitable value of the honesty threshold. The results of experiments using simulation and a real-world dataset show that the market-adaptive honesty threshold makes it possible to optimize the performance of the marketplace with respect to throughput and buyer satisfaction.
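
As a rough illustration of how a PID controller could drive such a threshold, the sketch below uses the textbook discrete PID update. Everything specific here is an assumption made for the example and is not taken from the thesis: the error signal (target minus measured buyer satisfaction), the gain values, the additive update rule, and the clamping of the threshold to [0, 1].

```python
class PIDController:
    """Textbook discrete PID controller (proportional, integral, derivative)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt=1.0):
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def adjust_threshold(threshold, target, measured, pid):
    """Hypothetical rule: raise the threshold when the market underperforms."""
    new_threshold = threshold + pid.update(target - measured)
    return min(1.0, max(0.0, new_threshold))  # keep the threshold in [0, 1]


# Hypothetical gains and market readings, for illustration only.
pid = PIDController(kp=0.5, ki=0.1, kd=0.0)
threshold = adjust_threshold(0.5, target=0.9, measured=0.7, pid=pid)
```

In a real deployment the gains would need tuning against the marketplace's observed dynamics, and the measured signal could just as well be throughput or another market indicator.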