Spying the World from your Laptop -- Identifying and Profiling Content Providers and Big Downloaders in BitTorrent
This paper presents a set of exploits an adversary can use to continuously
spy on most BitTorrent users of the Internet from a single machine and for a
long period of time. Using these exploits for a period of 103 days, we
collected 148 million IPs downloading 2 billion copies of contents. We identify
the IP address of the content providers for 70% of the BitTorrent contents we
spied on. We show that a few content providers inject most contents into
BitTorrent and that those content providers are located in foreign data
centers. We also show that an adversary can compromise the privacy of any peer
in BitTorrent and identify the big downloaders that we define as the peers who
subscribe to a large number of contents. This infringement on users' privacy
poses a significant impediment to the legal adoption of BitTorrent.
Pushing BitTorrent Locality to the Limit
Peer-to-peer (P2P) locality has recently raised a lot of interest in the
community. Indeed, whereas P2P content distribution enables financial savings
for the content providers, it dramatically increases the traffic on inter-ISP
links. To solve this issue, the idea of keeping a fraction of the P2P traffic
local to each ISP was introduced a few years ago. Since then, P2P solutions
exploiting locality have been introduced. However, several fundamental issues
on locality still need to be explored. In particular, how far can we push
locality, and what is, at the scale of the Internet, the reduction of traffic
that can be achieved with locality? In this paper, we perform extensive
experiments in a controlled environment with up to 10 000 BitTorrent clients to
evaluate the impact of high locality on inter-ISP link traffic and peers'
download completion time. We introduce two simple mechanisms that make high
locality possible in challenging scenarios and we show that we save up to
several orders of magnitude of inter-ISP traffic compared to traditional locality
without adversely impacting peers' download completion time. In addition, we
crawled 214 443 torrents representing 6 113 224 unique peers spread among 9 605
ASes. We show that whereas the torrents we crawled generated 11.6 petabytes of
inter-ISP traffic, our locality policy implemented for all torrents would have
reduced the global inter-ISP traffic by 40%.
I Know Where You are and What You are Sharing: Exploiting P2P Communications to Invade Users' Privacy
In this paper, we show how to exploit real-time communication applications to
determine the IP address of a targeted user. We focus our study on Skype,
although other real-time communication applications may have similar privacy
issues. We first design a scheme that calls an identified targeted user
inconspicuously to find his IP address, which can be done even if he is behind
a NAT. By calling the user periodically, we can then observe the mobility of
the user. We show how to scale the scheme to observe the mobility patterns of
tens of thousands of users. We also consider the linkability threat, in which
the identified user is linked to his Internet usage. We illustrate this threat
by combining Skype and BitTorrent to show that it is possible to determine the
file-sharing usage of identified users. We devise a scheme based on the
identification field of the IP datagrams to verify with high accuracy whether
the identified user is participating in specific torrents. We conclude that any
Internet user can leverage Skype, and potentially other real-time communication
systems, to observe the mobility and file-sharing usage of tens of millions of
identified users.
Comment: This is the authors' version of the ACM/USENIX Internet Measurement
Conference (IMC) 2011 paper.
Clustering in P2P exchanges and consequences on performances.
We propose here an analysis of a rich dataset which gives an exhaustive and dynamic view of the exchanges processed in a running eDonkey system. We focus on correlation in terms of data exchanged by peers having provided or queried at least one piece of data in common. We introduce a method to capture these correlations (namely the data clustering), and study it in detail. We then use it to propose a very simple and efficient way to group data into clusters and show the impact of this underlying structure on search in typical P2P systems. Finally, we use these results to evaluate the relevance and limitations of a model proposed in a previous publication. We indicate some realistic values for the parameters of this model, and discuss some possible improvements.
Statistical analysis of a P2P query graph based on degrees and their time-evolution
Despite their crucial impact on the performance of peer-to-peer systems, very little is known about peers' behaviors in such networks. We propose here a study of these behaviors in a running environment using a semi-centralised P2P system (eDonkey). To achieve this, we use a trace of the queries made to a large server managing up to fifty thousand peers simultaneously and a few thousand queries per second. We analyse these data using complex network methods, and focus in particular on the degrees, their correlations, and their time-evolution. Results show a large variety of observed phenomena, including the variety of peers' behaviors and the heterogeneity of data queries, which should be taken into account when designing P2P systems.
De-anonymizing BitTorrent Users on Tor
Some BitTorrent users are running BitTorrent on top of Tor to preserve their
privacy. In this extended abstract, we discuss three different attacks to
reveal the IP address of BitTorrent users on top of Tor. In addition, we
exploit the multiplexing of streams from different applications into the same
circuit to link non-BitTorrent applications to revealed IP addresses.
Comment: Poster accepted at the 7th USENIX Symposium on Network Design and
Implementation (NSDI '10), San Jose, CA : United States (2010)
Compromising Tor Anonymity Exploiting P2P Information Leakage
Privacy of users in P2P networks goes far beyond their current usage and is a
fundamental requirement to the adoption of P2P protocols for legal usage. In a
climate of cold war between these users and anti-piracy groups, more and more
users are moving to anonymizing networks in an attempt to hide their identity.
However, when not designed to protect users' information, a P2P protocol would
leak information that may compromise the identity of its users. In this paper,
we first present three attacks targeting BitTorrent users on top of Tor that
reveal their real IP addresses. In a second step, we analyze the Tor usage by
BitTorrent users and compare it to its usage outside of Tor. Finally, we depict
the risks induced by this de-anonymization and show that users' privacy
violation goes beyond BitTorrent traffic and contaminates other protocols such
as HTTP.
Finding Good Partners in Availability-aware P2P Networks
In this paper, we study the problem of finding peers matching a given availability pattern in a peer-to-peer (P2P) system. We first prove the existence of such patterns in a new trace of the eDonkey network, containing the sessions of 14M peers over 27 days. We also show that, using only 7 days of history, a simple predictor can select predictable peers and successfully predict their online periods for the next week. Then, motivated by practical examples, we specify two formal problems of availability matching that arise in real applications: disconnection matching, where peers look for partners expected to disconnect at the same time, and presence matching, where peers look for partners expected to be online simultaneously in the future. As a scalable and inexpensive solution, we propose to use epidemic protocols for topology management, such as T-Man; we provide corresponding metrics for both matching problems. Finally, we evaluated this solution by simulating two P2P applications over our real trace: task scheduling and file storage. Simulations showed that our simple solution provided good partners fast enough to match the needs of both applications, and that consequently, these applications performed as efficiently at a much lower cost. We believe that this work will be useful for many P2P applications for which it has been shown that choosing good partners, based on their availability, drastically improves their efficiency.
One Bad Apple Spoils the Bunch: Exploiting P2P Applications to Trace and Profile Tor Users
Tor is a popular low-latency anonymity network. However, Tor does not protect
against the exploitation of an insecure application to reveal the IP address
of, or trace, a TCP stream. In addition, because of the linkability of Tor
streams sent together over a single circuit, tracing one stream sent over a
circuit traces them all. Surprisingly, it is unknown whether this linkability
allows, in practice, tracing a significant number of streams originating from
secure (i.e., proxied) applications. In this paper, we show that linkability
allows us to trace 193% of additional streams, including 27% of HTTP streams
possibly originating from "secure" browsers. In particular, we traced 9% of Tor
streams carried by our instrumented exit nodes. Using BitTorrent as the
insecure application, we design two attacks tracing BitTorrent users on Tor. We
run these attacks in the wild for 23 days and reveal 10,000 IP addresses of Tor
users. Using these IP addresses, we then profile not only the BitTorrent
downloads but also the websites visited per country of origin of Tor users. We
show that BitTorrent users on Tor are over-represented in some countries as
compared to BitTorrent users outside of Tor. By analyzing the type of content
downloaded, we then explain the observed behaviors by the higher concentration
of pornographic content downloaded at the scale of a country. Finally, we
present results suggesting the existence of an underground BitTorrent ecosystem
on Tor.
Tiresias: Predicting Security Events Through Deep Learning
With the increased complexity of modern computer attacks, there is a need for
defenders not only to detect malicious activity as it happens, but also to
predict the specific steps that will be taken by an adversary when performing
an attack. However, this is still an open research problem, and previous
research in predicting malicious events only looked at binary outcomes (e.g.,
whether an attack would happen or not), but not at the specific steps that an
attacker would undertake. To fill this gap we present Tiresias, a system that
leverages Recurrent Neural Networks (RNNs) to predict future events on a
machine, based on previous observations. We test Tiresias on a dataset of 3.4
billion security events collected from a commercial intrusion prevention
system, and show that our approach is effective in predicting the next event
that will occur on a machine with a precision of up to 0.93. We also show that
the models learned by Tiresias are reasonably stable over time, and provide a
mechanism that can identify sudden drops in precision and trigger a retraining
of the system. Finally, we show that the long-term memory typical of RNNs is
key in performing event prediction, rendering simpler methods not up to the
task.