FilteredWeb: A Framework for the Automated Search-Based Discovery of Blocked URLs
Various methods have been proposed for creating and maintaining lists of
potentially filtered URLs to allow for measurement of ongoing internet
censorship around the world. Whilst testing a known resource for evidence of
filtering can be relatively simple, given appropriate vantage points,
discovering previously unknown filtered web resources remains an open
challenge.
We present a new framework for automating the process of discovering filtered
resources through the use of adaptive queries to well-known search engines. Our
system applies information retrieval algorithms to isolate characteristic
linguistic patterns in known filtered web pages; these are then used as the
basis for web search queries. The results of these queries are then checked for
evidence of filtering, and newly discovered filtered resources are fed back
into the system to detect further filtered content.
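
The feedback loop described above can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not name the information retrieval algorithm, so TF-IDF is assumed here as one common choice, and `search`, `fetch` and `is_filtered` are hypothetical callables standing in for a search engine client, a page downloader and a vantage-point filtering test.

```python
import math
import re
from collections import Counter, deque


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(page, background, k=3):
    """Rank the terms of `page` by TF-IDF against `background` documents.

    TF-IDF is an assumption made for this sketch; the abstract only says
    "information retrieval algorithms".
    """
    docs = [set(tokenize(d)) for d in background]
    tf = Counter(tokenize(page))
    n = len(docs) + 1  # the page itself counts as one document

    def idf(term):
        df = 1 + sum(term in d for d in docs)
        return math.log(n / df)

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(tf, key=lambda t: (-tf[t] * idf(t), t))[:k]


def discover(seed_pages, background, search, fetch, is_filtered, max_rounds=10):
    """Expand a set of known filtered pages via search queries.

    seed_pages maps URL -> page text. `search`, `fetch` and `is_filtered`
    are hypothetical callables supplied by the caller.
    """
    known = dict(seed_pages)
    queue = deque(seed_pages)
    rounds = 0
    while queue and rounds < max_rounds:
        url = queue.popleft()
        rounds += 1
        query = " ".join(top_terms(known[url], background))
        for hit in search(query):
            if hit not in known and is_filtered(hit):
                known[hit] = fetch(hit)  # newly found page feeds back in
                queue.append(hit)
    return set(known) - set(seed_pages)
```

Passing the search engine and the filtering test in as arguments keeps the loop independent of any particular search API or measurement vantage point.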
Our implementation of this framework, applied to China as a case study, shows
that this approach is effective at detecting large numbers of previously
unknown filtered web pages, making a significant contribution to the ongoing
detection of internet filtering as it develops.
Our tool is currently deployed and has been used to discover 1355 domains
that are poisoned within China as of Feb 2017 - 30 times more than are
contained in the most widely-used public filter list. Of these, 759 are outside
of the Alexa Top 1000 domains list, demonstrating the capability of this
framework to find more obscure filtered content. Our initial analysis of
filtered URLs, and of the search terms that were used to discover them, gives
further insight into the nature of the content currently being blocked in
China.
Comment: To appear in "Network Traffic Measurement and Analysis Conference
2017" (TMA2017)
How India Censors the Web
One of the primary ways in which India engages in online censorship is by
ordering Internet Service Providers (ISPs) operating in its jurisdiction to
block access to certain websites for its users. This paper reports the
different techniques Indian ISPs are using to censor websites, and investigates
whether website blocklists are consistent across ISPs. We propose a suite of
tests that prove more robust than previous work in detecting DNS- and
HTTP-based censorship. Our tests also discern the use of SNI inspection for
blocking websites, which was previously undocumented in the Indian context. Using
information from court orders, user reports, and public and leaked government
orders, we compile the largest known list of potentially blocked websites in
India. We pass this list to our tests and run them from connections of six
different ISPs, which together serve more than 98% of Internet users in India.
Our findings not only confirm that ISPs are using different techniques to block
websites, but also demonstrate that different ISPs are not blocking the same
websites.
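
One way to quantify how consistent blocklists are across ISPs is a pairwise set-overlap score. The sketch below uses Jaccard similarity, which is an assumption made for illustration and not necessarily the measure used in the paper; the ISP labels and domains in the usage note are hypothetical placeholders, not measured data.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty blocklists are treated as identical
    return len(a & b) / len(a | b)


def pairwise_consistency(blocked_by_isp):
    """Map each unordered ISP pair to the similarity of their blocklists.

    blocked_by_isp maps an ISP label to the set of websites observed
    as blocked on that ISP.
    """
    return {
        (x, y): jaccard(blocked_by_isp[x], blocked_by_isp[y])
        for x, y in combinations(sorted(blocked_by_isp), 2)
    }
```

For example, with hypothetical observations `{"isp_a": {"one.example", "two.example"}, "isp_b": {"two.example", "three.example"}}`, the pair `("isp_a", "isp_b")` shares one of three distinct sites, so its score is 1/3. A score of 1.0 would mean the two ISPs block exactly the same sites.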
Internet measurements and policy
Ideally, telecommunications policy decisions would be based on easily understandable data collected by several federated, independent, open-source network measurement tools based on documented methodologies. In reality, most measurement tools are fragmented to such an extent that their data are limited in comparability and not immediately accessible to inform policy. This paper tries to understand how we could improve the current situation. We describe how successful internet measurement tools managed to foster their adoption by users. We argue that cooperation is the key to reducing the burden on individual project maintainers, and we model the factors that reduce incentives to cooperate. We set forth an agenda for increasing cooperation and provide examples of cooperation between projects.