Protecting attributes and contents in online social networks
With the extreme popularity of online social networks, security and privacy issues have become critical. In particular, it is important to protect users' privacy without hindering normal socialization. User privacy in the context of data publishing and structural re-identification attacks has been well studied. However, protection of attributes and data content has been mostly neglected in the research community. While social network data is rarely published, billions of messages are shared in various social networks on a daily basis. Therefore, it is more important to protect attributes and textual content in social networks. We first study the vulnerabilities of user attributes and contents, in particular the identifiability of users when the adversary learns a small piece of information about the target. We present two attribute re-identification attacks that exploit information retrieval and web search techniques. We show that a large portion of users with an online presence are highly identifiable, even from a small piece of seed information that may be inaccurate. To protect user attributes and content, we adopt the social circle model derived from the concepts of "privacy as user perception" and "information boundary". Users have different social circles and share different information in different circles. We introduce a social circle discovery approach using multi-view clustering. We present our observations on the key features of social circles, including friendship links, content similarity and social interactions. We treat each feature as one view, and propose a one-side co-trained spectral clustering technique, which is tailored for the sparse nature of our data. We also propose two evaluation measurements. One is based on the quantitative measure of similarity ratio, while the other employs human evaluators to examine pairs of users, who are selected by the max-risk active evaluation approach.
We evaluate our approach on ego networks of Twitter users, and present our clustering results. We also compare our proposed clustering technique with single-view clustering and the original co-trained spectral clustering technique. Our results show that multi-view clustering is more accurate for social circle detection, and that our proposed approach achieves a significantly higher similarity ratio than the original multi-view clustering approach. In addition, we build a proof-of-concept implementation of automatic circle detection and recommendation methods. For a given user, the system returns the circle detection result from our proposed multi-view clustering technique, together with the keywords for each circle. Users can also enter a message they want to post, and the system will suggest which circle to disseminate the message
The structure and dynamics of multilayer networks
In the past years, network theory has successfully characterized the
interaction among the constituents of a variety of complex systems, ranging
from biological to technological, and social systems. However, up until
recently, attention was almost exclusively given to networks in which all
components were treated on equivalent footing, while neglecting all the extra
information about the temporal- or context-related properties of the
interactions under study. Only in the last years, taking advantage of the
enhanced resolution in real data sets, network scientists have directed their
interest to the multiplex character of real-world systems, and explicitly
considered the time-varying and multilayer nature of networks. We offer here a
comprehensive review on both structural and dynamical organization of graphs
made of diverse relationships (layers) between its constituents, and cover
several relevant issues, from a full redefinition of the basic structural
measures, to understanding how the multilayer nature of the network affects
processes and dynamics.
Comment: In Press, Accepted Manuscript, Physics Reports 201
Improving Search Engine Results by Query Extension and Categorization
Since its emergence, the Internet has changed the way in which information is distributed and it has strongly influenced how people communicate. Nowadays, Web search engines are widely used to locate information on the Web, and online social networks have become pervasive platforms of communication.
Retrieving relevant Web pages in response to a query is not an easy task for Web search engines due to the enormous corpus of data that the Web stores and the inherent ambiguity of search queries. We present two approaches to improve the effectiveness of Web search engines. The first approach allows us to retrieve more Web pages relevant to a user's query by extending the query to include synonyms and other variations. The second gives us the ability to retrieve Web pages that more precisely reflect the user's intentions by filtering out those pages which are not related to the user-specified interests.
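The two approaches described above can be illustrated with a minimal sketch. The abstract does not say how synonyms are obtained or how pages are categorized, so the synonym table, the `category` field, and the function names `extend_query` and `filter_by_interest` are all hypothetical and chosen only for illustration.

```python
# Hypothetical synonym table; a real system would draw on a thesaurus or similar resource.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "cheap": ["inexpensive", "affordable"],
}


def extend_query(query, synonyms=SYNONYMS):
    """Return the original query terms followed by any known variants, without repeats."""
    terms = []
    for word in query.lower().split():
        for term in [word, *synonyms.get(word, [])]:
            if term not in terms:
                terms.append(term)
    return terms


def filter_by_interest(pages, interests):
    """Keep only pages whose (assumed) category label is among the user's interests."""
    return [page for page in pages if page["category"] in interests]


# Hypothetical result pages with pre-assigned category labels.
pages = [
    {"url": "a.example", "category": "autos"},
    {"url": "b.example", "category": "cooking"},
]
extended = extend_query("cheap car")
relevant = filter_by_interest(pages, {"autos"})
```

Extension is meant to widen recall, while the interest filter narrows the result set again, which is one way to read the complementary roles the abstract assigns to the two approaches.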
Discovering communities in online social networks (OSNs) has attracted much attention in recent years. We introduce the concept of subject-driven communities and propose to discover such communities by modeling a community using a posting/commenting interaction graph which is relevant to a given subject of interest, and then applying link analysis on the interaction graph to locate the core members of a community
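As a rough illustration of applying link analysis to an interaction graph, the sketch below ranks members with a plain power-iteration PageRank. The abstract does not name the link-analysis algorithm, so PageRank is an assumed stand-in, and the edge list and member names are invented for the example.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping + damping * dangling) / len(nodes) for node in nodes
        }
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical interaction graph for one subject: (a, b) means a commented on a post by b.
comments = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "alice"),
    ("alice", "bob"),
    ("dave", "carol"),
]
scores = pagerank(comments)
core_members = sorted(scores, key=scores.get, reverse=True)[:2]
```

Members who attract comments from many others, or from other highly ranked members, end up with the highest scores, which is one plausible notion of a community's core.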
Ego-centred models of social networks: the social atom
International Mention in the doctoral degree (Mención Internacional en el título de doctor).
This thesis set out to contribute to the realm of social physics, with a particular
focus on human social networks. Our approach, however, is somewhat
different from what is typical in disciplines such as complex systems or statistical
physics. Rather than simplifying the features of the constituents of
our system (people), and stressing their rules of interaction, we focus on
better understanding those very same constituents, modelling them as social
atoms. Our rationale is that a better understanding of such an atom
may shed light on how (and why) it interacts with other atoms to form
social collectives.
Given its robustness and the evolutionary roots of its premises, we use
the Social Brain Hypothesis as our departure point. This theory states that
the evolutionary drive behind the development of large brains in humans
was the need to process social information and that the limited capacity of
our brains imposes a limit to the number of relationships we can manage,
the so-called "Dunbar's number", roughly 150. Moreover, evidence keeps
revealing that these relationships are further organised in a series of hierarchically
inclusive layers with decreasing emotional intensity, whose sizes
exhibit a more or less constant scaling. Notwithstanding the empirical evidence,
neither the presence of scaling in the organisation of personal networks
nor its connection with limited cognitive skills had been explained
so far.
In Chapter 2 we present a mathematical model that solves this puzzle.
The assumptions of the model are quite simple, and well founded on empirical
evidence. Firstly, the number of relationships we maintain tends
to be stable on average. Secondly, these relationships are costly, and our resources are limited. With these two premises, our results show that the
hierarchical organisation emerges naturally from the principle of maximum
entropy. Not only that, but we also predict a hitherto unnoticed regime of
organisation whose existence we prove using several datasets from communities
of immigrants.
The former model considers that relationships can only belong to a
discrete set of categories (layers). In Chapter 3 we extend it so that relationships
are classified in a continuum. This modification allows us to test
the model with data from very different sources such as online communications,
face-to-face contacts, and phone calls. Our results show that the two
regimes of organisation found in the previous model persist in this variant,
and reveal the underlying existence of a (universal) scaling parameter
which does not depend on any particular number of layers.
To incorporate these ideas into socio-centric models, we build on the
so-called Structural Balance Theory. This theory, underpinned by psychological
motivations, posits that the structures of social networks of positive
and negative relationships are highly interdependent. However, the theory
has received little empirical validation, and negative social relationships
are poorly understood, from both an ego-centric and a socio-centric perspective.
For that reason, we turn to developing experimental software
in order to gather data within a school.
In Chapters 4 and 5 we present results from these experiments. In
Chapter 4 we analyse the socio-centric networks using machine learning
techniques and find that the structure of positive and negative networks
is indeed very much connected. Besides, we study the two types of networks
separately, showing that they exhibit quite distinct features and that
gender effects in negative social networks are weak and asymmetrical for
boys and girls. In Chapter 5, on the other hand, we focus on the structure
of negative personal networks. Remarkably, using data from two different
experimental settings, we show that the structure of personal networks
of negative relationships mirrors that of the positive ones and exhibits a
similar scaling, although their size is significantly smaller.
Chapter 6 summarises our results and presents future (and current) lines
of investigation. Among them, we outline a model of a social fluid that
uses the insights gained with this thesis to build a model of social collectives
as ensembles of personal networks. This model is compatible, at the micro-level, with the observations of the social brain hypothesis, and, at
the macro-level, with the premises of the structural balance theory.
This thesis would not have been possible without the support of Fundación
BBVA through its 2016 call project "Los números de Dunbar y la estructura
de las sociedades digitales: modelización y simulación (DUNDIG)",
and we are very thankful for it. Support for early stages of this work
through projects IBSEN (European Commission, H2020 FET Open RIA
662725) and VARIANCE (Ministerio de Economía y Competitividad/FEDER,
project no. FIS2015-64349-P) is also acknowledged.
Official Doctoral Programme in Mathematical Engineering, Universidad Carlos III de Madrid.
Chair: Javier Martín Buldú. Secretary: José Luis Molina González. Member: Roberta Sinatr
Statistical Analysis of Networks
This book is a general introduction to the statistical analysis of networks, and can serve both as a research monograph and as a textbook. Numerous fundamental tools and concepts needed for the analysis of networks are presented, such as network modeling, community detection, graph-based semi-supervised learning and sampling in networks. The description of these concepts is self-contained, with both theoretical justifications and applications provided for the presented algorithms.
Researchers, including postgraduate students, working in the area of network science, complex network analysis, or social network analysis, will find up-to-date statistical methods relevant to their research tasks. This book can also serve as textbook material for courses related to the
statistical approach to the analysis of complex networks.
In general, the chapters are fairly independent and self-supporting, and the book could be used for course composition "à la carte". Nevertheless, Chapter 2 is needed to a certain degree for all parts of the book. It is also recommended to read Chapter 4 before reading Chapters 5 and 6, but this is not absolutely necessary. Reading Chapter 3 can also be helpful before reading Chapters 5 and 7.
As prerequisites for reading this book, a basic knowledge in probability, linear algebra and elementary notions of graph theory is advised. Appendices describing required notions from the above mentioned disciplines have been added to help readers gain further understanding
Influence maximization under limited network information: Seeding high-degree neighbors
The diffusion of information, norms, and practices across a social network can be initiated by compelling a small number of seed individuals to adopt first. Strategies proposed in previous work either assume full network information or a large degree of control over what information is collected. However, privacy settings on the Internet and high non-response in surveys often severely limit available connectivity information. Here we propose a seeding strategy for scenarios with limited network information: Only the degrees and connections of some random nodes are known. This new strategy is a modification of "random neighbor sampling" (or "one-hop") and seeds the highest-degree neighbors of randomly selected nodes. Simulating a fractional threshold model, we find that this new strategy excels in networks with heavy-tailed degree distributions such as scale-free networks and large online social networks. It outperforms the conventional one-hop strategy even though the latter can seed 50% more nodes, and other seeding possibilities including pure high-degree seeding and clustered seeding
Published online: 28 October 2022
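The seeding strategy and the fractional threshold model in this abstract can be sketched as follows. This is a simplified reading of the abstract, not the authors' implementation: the adjacency-dict representation, the tie-breaking rule, the function names, and the toy graph are assumptions made for illustration.

```python
import random


def seed_high_degree_neighbors(graph, num_seeds, rng):
    """Sample random nodes and seed the highest-degree neighbor of each one."""
    seeds = set()
    candidates = [node for node in graph if graph[node]]  # skip isolated nodes
    rng.shuffle(candidates)
    for node in candidates:
        if len(seeds) >= num_seeds:
            break
        # Only the sampled node's neighbors and their degrees are assumed to be known.
        # Ties are broken by node label, an arbitrary choice for this sketch.
        best = max(graph[node], key=lambda nb: (len(graph[nb]), nb))
        seeds.add(best)
    return seeds


def fractional_threshold_spread(graph, seeds, threshold):
    """A node adopts once at least `threshold` of its neighbors have adopted."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbors in graph.items():
            if node in adopted or not neighbors:
                continue
            share = sum(nb in adopted for nb in neighbors) / len(neighbors)
            if share >= threshold:
                adopted.add(node)
                changed = True
    return adopted


# Hypothetical toy graph: hub 0 with five neighbors, plus a pendant node 6 attached to 5.
graph = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0, 6], 6: [5]}
seeds = seed_high_degree_neighbors(graph, num_seeds=1, rng=random.Random(0))
reached = fractional_threshold_spread(graph, seeds, threshold=0.5)
```

In this toy graph most randomly sampled nodes point to the hub as their highest-degree neighbor, which hints at why the strategy would favor well-connected seeds in heavy-tailed networks without needing the full degree sequence.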
Dynamic Credibility Threshold Assignment in Trust and Reputation Mechanisms Using PID Controller
In online shopping, buyers do not have enough information about sellers and cannot inspect the products before purchasing them. To help buyers find reliable sellers, online marketplaces deploy Trust and Reputation Management (TRM) systems. These systems aggregate buyers' feedback about the sellers they have interacted with and about the products they have purchased, to inform users within the marketplace about the sellers and products before making purchases. Thus positive customer feedback has become a valuable asset for each seller in order to attract more business. This naturally creates incentives for cheating, in terms of introducing fake positive feedback. Therefore, an important responsibility of TRM systems is to help buyers find genuine feedback (reviews) about different sellers.
Recent TRM systems achieve this goal by selecting and assigning credible advisers to any new customer/buyer. These advisers are selected among the buyers who have had experience with a number of sellers and have provided feedback for their services and goods. As people differ in their tastes, the buyer feedback that would be most useful should come from advisers with similar tastes and values. In addition, the advisers should be honest, i.e. provide truthful reviews and ratings, and not malicious, i.e. not collude with sellers to favour them or with other buyers to badmouth some sellers.
Defining the boundary between dishonest and honest advisers is very important. However, there is currently no systematic approach for setting the honesty threshold which divides benevolent advisers from malicious ones. The thesis addresses this problem and proposes a market-adaptive honesty threshold management mechanism. In this mechanism the TRM system forms a feedback system which monitors the current status of the e-marketplace. According to the status of the e-marketplace, the feedback system improves performance using a PID controller from the field of control systems. The responsibility of this controller is to set a suitable value for the honesty threshold. The results of experiments, using simulation and a real-world dataset, show that the market-adaptive honesty threshold makes it possible to optimize the performance of the marketplace with respect to throughput and buyer satisfaction
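A PID-driven threshold update of the kind described here might look like the sketch below. The abstract does not specify the monitored market signal, the gains, or the update rule, so the error definition (target minus measured market quality), the clamping to [0, 1], and all names and numbers are assumptions for illustration only.

```python
class PIDController:
    """Textbook discrete PID controller: output = kp*e + ki*sum(e*dt) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt=1.0):
        self.integral += error * dt
        # No derivative term on the first step, since there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def adjust_threshold(threshold, target, measured, controller):
    """Shift the honesty threshold by the PID output and clamp it to [0, 1].

    Assumption: when the measured market quality falls below the target,
    the threshold is raised so that fewer advisers count as honest.
    """
    correction = controller.update(target - measured)
    return min(1.0, max(0.0, threshold + correction))


# Hypothetical gains and market readings.
controller = PIDController(kp=0.5, ki=0.1, kd=0.0)
new_threshold = adjust_threshold(0.5, target=0.9, measured=0.7, controller=controller)
```

Calling `adjust_threshold` once per monitoring period lets the threshold track the state of the marketplace instead of staying fixed, which matches the market-adaptive behavior the abstract describes.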