66 research outputs found

    XRay: Enhancing the Web's Transparency with Differential Correlation

    Today's Web services - such as Google, Amazon, and Facebook - leverage user data for varied purposes, including personalizing recommendations, targeting advertisements, and adjusting prices. At present, users have little insight into how their data is being used. Hence, they cannot make informed choices about the services they choose. To increase transparency, we developed XRay, the first fine-grained, robust, and scalable personal data tracking system for the Web. XRay predicts which data in an arbitrary Web account (such as emails, searches, or viewed products) is being used to target which outputs (such as ads, recommended products, or prices). XRay's core functions are service agnostic and easy to instantiate for new services, and they can track data within and across services. To make predictions independent of the audited service, XRay relies on the following insight: by comparing outputs from different accounts with similar, but not identical, subsets of data, one can pinpoint targeting through correlation. We show both theoretically, and through experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision and recall by correlating data from a surprisingly small number of extra accounts. Comment: Extended version of a paper presented at the 23rd USENIX Security Symposium (USENIX Security 14)
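The correlation insight in the abstract above can be illustrated with a toy sketch. This is not XRay's actual algorithm, which the listing does not describe; the function name `predict_targeting`, the `min_gap` threshold, and the sample accounts are all hypothetical. The sketch simply compares how often an output appears in accounts that hold a given input against accounts that do not.

```python
from typing import Dict, Optional, Set


def predict_targeting(
    accounts: Dict[str, Set[str]],
    seen: Dict[str, Set[str]],
    output: str,
    min_gap: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> Optional[str]:
    """Return the input whose presence best separates accounts seeing `output`.

    accounts maps an account name to the set of inputs (e.g. emails) it holds;
    seen maps an account name to the set of outputs (e.g. ads) it received.
    """
    inputs = set().union(*accounts.values())
    best, best_gap = None, min_gap
    for item in sorted(inputs):
        with_item = [a for a, data in accounts.items() if item in data]
        without = [a for a, data in accounts.items() if item not in data]
        if not with_item or not without:
            continue  # no differential signal without both kinds of account
        rate_with = sum(output in seen[a] for a in with_item) / len(with_item)
        rate_without = sum(output in seen[a] for a in without) / len(without)
        gap = rate_with - rate_without
        if gap > best_gap:
            best, best_gap = item, gap
    return best


# Hypothetical shadow accounts with overlapping, non-identical inputs.
accounts = {
    "a1": {"flight", "diet"},
    "a2": {"flight", "loan"},
    "a3": {"diet", "loan"},
    "a4": {"loan"},
}
seen = {
    "a1": {"airline_ad"},
    "a2": {"airline_ad", "bank_ad"},
    "a3": {"bank_ad"},
    "a4": {"bank_ad"},
}
print(predict_targeting(accounts, seen, "airline_ad"))
```

The design choice here is a plain difference of frequencies; a real system would need a statistical model that tolerates noise and untargeted outputs.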

    How long does it take for all users in a social network to choose their communities?

    We consider a community formation problem in social networks, where the users are either friends or enemies. The users are partitioned into conflict-free groups (i.e., independent sets in the conflict graph G^- = (V,E) that represents the enmities between users). The dynamics goes on as long as there exists any set of at most k users, k being any fixed parameter, that can change their current groups in the partition simultaneously, in such a way that they all strictly increase their utilities (number of friends, i.e., the cardinality of their respective groups minus one). Previously, the best-known upper bounds on the maximum time of convergence were O(|V| alpha(G^-)) for k [...]; for k >= 4, the time of convergence was conjectured to be polynomial [Escoffier et al., 2012][Kleinberg and Ligett, 2013]. In this paper we disprove this. Specifically, we prove that for any k >= 4, the maximum time of convergence is Omega(|V|^{Theta(log{|V|})})
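The dynamics described above can be sketched for the simplest case k = 1, where a single user moves at a time. This is a minimal illustration under stated assumptions, not the paper's construction: the function name `run_dynamics`, the starting partition (everyone alone), and the scan order are all hypothetical choices.

```python
from typing import FrozenSet, List, Set, Tuple


def run_dynamics(n: int, enemies: Set[FrozenSet[int]]) -> Tuple[List[List[int]], int]:
    """Run single-user (k = 1) moves until no user can improve.

    A user's utility is the size of its group minus one. A user moves to
    another group only if that group contains none of its enemies and the
    move strictly increases its utility.
    """
    group_of = list(range(n))             # every user starts alone
    groups = {i: {i} for i in range(n)}   # group id -> members
    steps = 0
    moved = True
    while moved:
        moved = False
        for u in range(n):
            cur = group_of[u]
            for g, members in groups.items():
                if g == cur or not members:
                    continue
                # Joining gives utility len(members); staying gives len(current) - 1.
                improves = len(members) >= len(groups[cur])
                conflict_free = all(frozenset((u, v)) not in enemies for v in members)
                if improves and conflict_free:
                    groups[cur].remove(u)
                    members.add(u)
                    group_of[u] = g
                    steps += 1
                    moved = True
                    break
    return [sorted(m) for m in groups.values() if m], steps


enemies = {frozenset((0, 1))}
partition, steps = run_dynamics(4, enemies)
print(partition, steps)
```

Each move strictly increases the sum of squared group sizes, which is why this k = 1 loop terminates; the abstract's lower bound concerns coalitions of size k >= 4, which this sketch does not model.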

    A Closed Form Formula for Long-lived TCP Connections Throughput

    In this paper, we study the variation of the throughput achieved by TCP resulting from both the individual behavior of a connection and the interaction with all other connections sharing the same link. In particular, we calculate the Tail Distribution Function (TDF) of the instantaneous throughput seen by one TCP connection in the Additive Increase Multiplicative Decrease (AIMD) framework. For the particular case that each TCP connection experiences the same Round Trip Time (RTT) and under the many-user approximation we prove that this TDF is given by a closed-form formula that solely depends on the network parameters (number of sources, capacity and buffer size of the bottleneck link). This formula can then be used as a dimensioning tool, where throughput is guaranteed to each user to be «larger than a given value for at least a certain percentage of the time». In the context defined here, this formula plays the same role for the dimensioning of an IP router as the Erlang B formula does for the dimensioning of a PSTN switch
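The abstract does not reproduce its closed-form TDF, so it cannot be sketched here. What can be shown is the Erlang B analogy it draws: the classical formula gives the blocking probability for a given offered load and number of circuits, and dimensioning means finding the smallest number of circuits that meets a blocking target. The function names below are illustrative.

```python
def erlang_b(offered_load: float, circuits: int) -> float:
    """Blocking probability from the standard Erlang B recursion.

    B(E, 0) = 1 and B(E, j) = E * B(E, j-1) / (j + E * B(E, j-1)).
    """
    b = 1.0
    for j in range(1, circuits + 1):
        b = offered_load * b / (j + offered_load * b)
    return b


def circuits_needed(offered_load: float, max_blocking: float) -> int:
    """Smallest number of circuits whose blocking probability meets the target."""
    if not 0.0 < max_blocking <= 1.0:
        raise ValueError("max_blocking must be in (0, 1]")
    circuits = 0
    b = 1.0
    while b > max_blocking:
        circuits += 1
        b = offered_load * b / (circuits + offered_load * b)
    return circuits


print(erlang_b(2.0, 2))          # 2 erlangs offered to 2 circuits
print(circuits_needed(2.0, 0.5)) # circuits needed for at most 50% blocking
```

The recursion avoids the factorials in the textbook form of the formula, which keeps it numerically stable for large circuit counts.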

    Non-existence of stable social groups in information-driven networks

    We study a group-formation game on an undirected complete graph G with all edge-weights in a set W ⊆ R ∪ {−∞}. This work is motivated by a recent information-sharing model for social networks (Kleinberg and Ligett, GEB, 2013). Specifically, we consider partitions of the vertex-set of G into groups. The individual utility of any vertex v is the sum of the weights on the edges uv between v and the other vertices u in her group. Informally, u and v represent social users, and the weight of uv quantifies the extent to which u and v (dis)agree on some fixed topic. For a fixed integer k ≥ 1, a k-stable partition is a partition in which no coalition of at most k vertices would increase their respective utilities by leaving their groups to join or create another common group. Before our work, it was known that such a partition always exists if k = 1 (Burani and Zwicker, Math. Soc. Sci., 2003). We focus on the regime k ≥ 2. Our first result is that when all the social users are either friends, enemies or indifferent to each other (i.e., W = {−∞, 0, 1}), a partition as above always exists if k ≤ 2, but it may not exist if k ≥ 3. This is in sharp contrast with (Kleinberg and Ligett, GEB, 2013), who proved that k-stable partitions always exist, for any k, if W = {−∞, 1}. We further study the intriguing relationship between the existence of k-stable partitions and the allowed set of edge-weights W. Specifically, we give sufficient conditions for the existence or the non-existence of such partitions based on tools from Graph Theory. Doing so, we obtain for most sets W the largest k(W) such that all graphs with edge-weights in W admit a k(W)-stable partition. From the computational point of view, we prove that for any W containing −∞, the problem of deciding whether a k-stable partition exists is NP-complete for any k > k(W). Our work hints that the emergence of stable communities in a social network requires a trade-off between the level of collusion between social users, and the diversity of their opinions
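The utility and stability definitions above can be made concrete for the case k = 1. This is a minimal sketch under stated assumptions: the function names, the dictionary encoding of edge weights, and the three-vertex example are hypothetical, and larger coalitions (k ≥ 2, the regime the paper studies) are not checked.

```python
import math
from typing import Dict, FrozenSet, List, Set

Weights = Dict[FrozenSet[int], float]


def utility(v: int, group: Set[int], w: Weights) -> float:
    """Sum of edge weights between v and the other vertices of `group`."""
    return sum(w[frozenset((u, v))] for u in group if u != v)


def is_1_stable(partition: List[Set[int]], w: Weights) -> bool:
    """True if no single vertex strictly gains by changing group.

    A vertex may join any other existing group or create a new singleton
    group, which yields utility 0.
    """
    for i, group in enumerate(partition):
        for v in group:
            current = utility(v, group, w)
            if current < 0:
                return False  # leaving to a singleton would be better
            for j, other in enumerate(partition):
                if j != i and utility(v, other, w) > current:
                    return False
    return True


# Hypothetical instance with W = {-inf, 0, 1}:
# 0 and 1 are friends, 0 and 2 are enemies, 1 and 2 are indifferent.
w = {
    frozenset((0, 1)): 1,
    frozenset((0, 2)): -math.inf,
    frozenset((1, 2)): 0,
}
print(is_1_stable([{0, 1}, {2}], w))
print(is_1_stable([{0}, {1}, {2}], w))
```

Checking k-stability for k ≥ 2 would require enumerating coalitions of up to k vertices and their joint target groups, which this sketch leaves out.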

    Impact of Network Delay Variation on Multicast Session Performance With TCP-like Congestion Control

    We study the impact of random noise (queueing delay) on the performance of a multicast session. With a simple analytical model, we analyze the throughput degradation within a multicast (one-to-many) tree under TCP-like congestion and flow control. We use the (max,plus) formalism together with methods based on stochastic comparison (association and convex ordering) and on the theory of extremes (Lai and Robbins' notion of maximal characteristics) to prove various properties of the throughput. We first prove that the throughput obtained from Golestani's deterministic model [1] is systematically optimistic. In the presence of light-tailed random noise, we show that the throughput decreases like the inverse of the logarithm of the number of receivers. We find an analytical upper and a lower bound for the throughput degradation. Within these bounds, we characterize the degradation which is obtained for various tree topologies. In particular, we observe that a class of trees commonly found in IP multicast sessions [9] (which we call umbrella trees) is significantly more sensitive to network noise than other topologies

    Chasm in Hegemony: Explaining and Reproducing Disparities in Homophilous Networks

    In networks with a minority and a majority community, it is well established that minorities are under-represented at the top of the social hierarchy. However, researchers are less clear about the representation of minorities at the lower levels of the hierarchy, where other disadvantages or vulnerabilities may exist. We offer a more complete picture of social disparities at each social level with empirical evidence that the minority representation exhibits two opposite phases: at the higher rungs of the social ladder, the representation of the minority community decreases; but, lower in the ladder, which is more populous, as you ascend, the representation of the minority community improves. We refer to this opposition between the upper and lower levels as the chasm effect. Previous models of network growth with homophily fail to detect and explain the presence of this chasm effect. We analyze the interactions among a few well-observed network-growing mechanisms with a simple model to reveal the sufficient and necessary conditions for both phases in the chasm effect to occur. By generalizing the simple model naturally, we present a complete bi-affiliation bipartite network-growth model that could successfully capture disparities at all social levels and reproduce real social networks. Finally, we illustrate that addressing the chasm effect can create fairer systems with two applications in advertisement and fact-checks, thereby demonstrating the potential impact of the chasm effect on the future research of minority-majority disparities and fair algorithms

    Social Clicks: What and Who Gets Read on Twitter?

    Online news domains increasingly rely on social media to drive traffic to their websites. Yet we know surprisingly little about how a social media conversation mentioning an online article actually generates clicks. Sharing behaviors, in contrast, have been fully or partially available and scrutinized over the years. While this has led to multiple assumptions on the diffusion of information, each assumption was designed or validated while ignoring actual clicks. We present a large scale, unbiased study of social clicks - that is also the first data of its kind - gathering a month of web visits to online resources that are located in 5 leading news domains and that are mentioned in the third largest social media by web referral (Twitter). Our dataset amounts to 2.8 million shares, together responsible for 75 billion potential views on this social media, and 9.6 million actual clicks to 59,088 unique resources. We design a reproducible methodology and carefully correct its biases. As we prove, properties of clicks impact multiple aspects of information diffusion, all previously unknown. (i) Secondary resources, that are not promoted through headlines and are responsible for the long tail of content popularity, generate more clicks both in absolute and relative terms. (ii) Social media attention is actually long-lived, in contrast with temporal evolution estimated from shares or receptions. (iii) The actual influence of an intermediary or a resource is poorly predicted by their share count, but we show how that prediction can be made more precise