11 research outputs found
Spying the World from your Laptop -- Identifying and Profiling Content Providers and Big Downloaders in BitTorrent
This paper presents a set of exploits an adversary can use to continuously
spy on most BitTorrent users of the Internet from a single machine and for a
long period of time. Using these exploits for a period of 103 days, we
collected 148 million IPs downloading 2 billion copies of contents. We identify
the IP address of the content providers for 70% of the BitTorrent contents we
spied on. We show that a few content providers inject most contents into
BitTorrent and that those content providers are located in foreign data
centers. We also show that an adversary can compromise the privacy of any peer
in BitTorrent and identify the big downloaders that we define as the peers who
subscribe to a large number of contents. This infringement on users' privacy
poses a significant impediment to the legal adoption of BitTorrent.
BitTorrent Sync: Network Investigation Methodology
The volume of personal information and data most Internet users find
themselves amassing is ever increasing and the fast pace of the modern world
results in most requiring instant access to their files. Millions of these
users turn to cloud based file synchronisation services, such as Dropbox,
Microsoft Skydrive, Apple iCloud and Google Drive, to enable "always-on" access
to their most up-to-date data from any computer or mobile device with an
Internet connection. The prevalence of recent articles covering various
invasion of privacy issues and data protection breaches in the media has caused
many to review their online security practices with their personal information.
To provide an alternative to cloud based file backup and synchronisation,
BitTorrent Inc. released a cloudless file backup and
synchronisation service, named BitTorrent Sync, to alpha testers in April 2013.
BitTorrent Sync's popularity rose dramatically throughout 2013, reaching over
two million active users by the end of the year. This paper outlines a number
of scenarios where the network investigation of the service may prove
invaluable as part of a digital forensic investigation. An investigation
methodology is proposed outlining the required steps involved in retrieving
digital evidence from the network and the results from a proof of concept
investigation are presented. Comment: 9th International Conference on Availability, Reliability and
Security (ARES 2014)
Compromising Tor Anonymity Exploiting P2P Information Leakage
Privacy of users in P2P networks goes far beyond their current usage and is a
fundamental requirement to the adoption of P2P protocols for legal usage. In a
climate of cold war between these users and anti-piracy groups, more and more
users are moving to anonymizing networks in an attempt to hide their identity.
However, when not designed to protect users' information, a P2P protocol can
leak information that may compromise the identity of its users. In this paper,
we first present three attacks targeting BitTorrent users on top of Tor that
reveal their real IP addresses. In a second step, we analyze the Tor usage by
BitTorrent users and compare it to its usage outside of Tor. Finally, we depict
the risks induced by this de-anonymization and show that users' privacy
violation goes beyond BitTorrent traffic and contaminates other protocols such
as HTTP.
Why are they hiding? Study of an Anonymous File Sharing System
This paper characterizes a recently proposed anonymous file sharing system, OneSwarm. The characterization is based on measurements of several aspects of the OneSwarm system, such as the nature of the shared and searched content and the geolocation and number of users. Our findings indicate that, contrary to common belief, there is no significant difference in downloaded content between this system and the classical BitTorrent ecosystem. We also found that a majority of users appear to be located in countries where anti-piracy laws have recently been adopted and enforced (France, Sweden, and the U.S.). Finally, we evaluate the level of privacy provided by OneSwarm and show that, although the system has strong overall privacy, a collusion attack could potentially identify content providers.
I Know Where You are and What You are Sharing: Exploiting P2P Communications to Invade Users' Privacy
In this paper, we show how to exploit real-time communication applications to
determine the IP address of a targeted user. We focus our study on Skype,
although other real-time communication applications may have similar privacy
issues. We first design a scheme that calls an identified targeted user
inconspicuously to find his IP address, which can be done even if he is behind
a NAT. By calling the user periodically, we can then observe the mobility of
the user. We show how to scale the scheme to observe the mobility patterns of
tens of thousands of users. We also consider the linkability threat, in which
the identified user is linked to his Internet usage. We illustrate this threat
by combining Skype and BitTorrent to show that it is possible to determine the
file-sharing usage of identified users. We devise a scheme based on the
identification field of the IP datagrams to verify with high accuracy whether
the identified user is participating in specific torrents. We conclude that any
Internet user can leverage Skype, and potentially other real-time communication
systems, to observe the mobility and file-sharing usage of tens of millions of
identified users. Comment: This is the authors' version of the ACM/USENIX Internet Measurement
Conference (IMC) 2011 paper
Vulnérabilités de la DHT de BitTorrent & Identification des comportements malveillants dans KAD (Vulnerabilities of the BitTorrent DHT and Identification of Malicious Behaviors in KAD)
This deliverable presents the results of the work carried out during the first six months (T0+6) of the GIS 3SGS ACDAP2P project, whose objective is to propose a collaborative architecture for detecting attacks in peer-to-peer networks. In this report we detail our work on identifying malicious behaviors affecting the KAD network (task T2) and on identifying vulnerabilities affecting the DHT of the BitTorrent network (task T3), both of which are at the core of the ACDAP2P project. To introduce our work, we first present its context along with a taxonomy of the different attacks that can affect DHTs. Our first contribution shows, through several experiments, that security flaws enable effective attacks that can disrupt the proper functioning of the BitTorrent DHT. Taking the KAD P2P network as a case study, we then identify suspicious peers using two detection approaches and thereby show that thousands of contents in the network were under attack during our measurements. Finally, we observe that some attackers are only briefly present in the network.
Content Bundling and Publishing in BitTorrent Systems (BitTorrent 시스템에서 컨텐트 번들링 및 배포)
Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2013. 최양희 (Yanghee Choi).
BitTorrent is one of the most popular applications for sharing contents over the Internet. The huge success of BitTorrent has attracted the research community to investigate BitTorrent's behavior in terms of throughput, fairness, and incentive issues, revealing valuable insights into the performance aspects of BitTorrent. However, most of these studies paid little attention to understanding content bundling and publishing strategies in BitTorrent from the following perspectives: (1) how, and for what purposes, are constituent files bundled by BitTorrent publishers? and (2) what strategies are adopted by BitTorrent publishers to achieve their goals?
To answer these questions with data from a large-scale BitTorrent system, we conduct comprehensive measurements on one of the largest BitTorrent portals: the Pirate Bay (TPB). From the datasets of the 120 K torrents and 16 M peers, we classify BitTorrent publishers into three types: (i) fake publishers, (ii) profit-driven publishers, and (iii) altruistic publishers. Throughout this dissertation, we investigate the current practice of bundling and publishing across different content categories: Movie, TV, Porn, Music, Application, Game, and E-book.
We first investigate the current practice of content bundling to understand the structural patterns of torrents and the participant behaviors of swarms. In particular, we focus on: (1) how prevalent content bundling is, (2) how and what files are bundled into torrents, (3) what motivates publishers to bundle files, and (4) how peers access the bundled files. We find that over 72% of BitTorrent torrents contain multiple files, which indicates that bundling is widely used for file sharing. We reveal that profit-driven BitTorrent publishers who promote their own web sites for financial gains like advertising tend to prefer to use the bundling. We also observe that most files (94%) in a bundle torrent are selected by users and the bundle torrents are more popular than the single (or non-bundle) ones on average. Overall, there are notable differences in the structural patterns of torrents and swarm characteristics (i) across different content categories and (ii) between single and bundle torrents.
We next investigate the current practice of content publishing in BitTorrent from a socio-economic point of view, by unraveling (1) how files are published by publishers, (2) what strategies are adopted by publishers, and (3) how effective those strategies are. We show that a significant amount of BitTorrent traffic (61%) has been generated (i.e., unnecessarily wasted) to download fake torrents. Therefore, we suggest a method to filter out fake publishers on TPB by considering their distinct publishing patterns learned from our measurement study, and show that the proposed method can reduce the total download traffic by around 45%. We also reveal that profit-driven publishers adopt different publishing strategies according to their revenue models (e.g., advertising private tracker sites to attract potential new members, or exposing image URLs to make people click the URL links).
Abstract
I. Introduction
II. Related Work
2.1 Multi-torrent Systems
2.2 Bundling in BitTorrent
2.3 Bundling in Economics
2.4 Content publishing in BitTorrent
III. Methodology
3.1 Measurement Methodology
3.2 Publisher Classification
IV. Bundling Practice in BitTorrent: What, How, and Why
4.1 Introduction
4.2 Datasets
4.2.1 Torrent Datasets
4.2.2 Swarm Datasets
4.3 Single vs. Bundle
4.3.1 Bundling is widespread
4.3.2 How files are bundled
4.4 Main File Analysis in Bundling
4.4.1 Identifying Main Files
4.4.2 Constituents of Bundle-k
4.5 Publisher Analysis
4.5.1 Contribution of Top-20 Publishers
4.5.2 Cross-category Publishing of Top-20 Publishers
4.6 User Access Pattern Analysis
4.6.1 Popularity Analysis
4.6.2 Availability Analysis
4.6.3 The Number of Files Requested by Users in a Bundle Torrent
4.6.4 Swarm Behaviors versus Bundle-k
4.7 Discussions
V. Content Publishing Practice in BitTorrent
5.1 Introduction
5.2 The Number of Published Torrents
5.3 Publishers Strategies
5.3.1 Lifetime of Publishers and their Publishing Rates
5.3.2 Content Categories
5.3.3 Advertising Strategies of Profit-driven Publishers
5.4 Downloaders Behavior
5.5 Implications on Publishers Strategies
5.5.1 Fake Publishers
5.5.2 Profit-driven Publishers
VI. Summary & Future Work
Bibliography
Korean Abstract
Improving Security and Performance in Low Latency Anonymous Networks
Conventional wisdom dictates that the level of anonymity offered by low latency anonymity networks increases as the user base grows. However, the most significant obstacle to increased adoption of such systems is that their security and performance properties are perceived to be weak. In an effort to help foster adoption, this dissertation aims to better understand and improve security, anonymity, and performance in low latency anonymous communication systems.
To better understand the security and performance properties of a popular low latency anonymity network, we characterize Tor, focusing on its application protocol distribution, geopolitical client and router distributions, and performance. For instance, we observe that peer-to-peer file sharing protocols use an unfair portion of the network's scarce bandwidth. To reduce the congestion produced by bulk downloaders in networks such as Tor, we design, implement, and analyze an anonymizing network tailored specifically for the BitTorrent peer-to-peer file sharing protocol. We next analyze Tor's security and anonymity properties and empirically show that Tor is vulnerable to practical end-to-end traffic correlation attacks launched by relatively weak adversaries that inflate their bandwidth claims to attract traffic and thereby compromise key positions on clients' paths. We also explore the security and performance trade-offs that revolve around path length design decisions and we show that shorter paths offer performance benefits and provide increased resilience to certain attacks. Finally, we discover a source of performance degradation in Tor that results from poor congestion and flow control. To improve Tor's performance and grow its user base, we offer a fresh approach to congestion and flow control inspired by techniques from IP and ATM networks.