FilteredWeb: A Framework for the Automated Search-Based Discovery of Blocked URLs
Various methods have been proposed for creating and maintaining lists of
potentially filtered URLs to allow for measurement of ongoing internet
censorship around the world. Whilst testing a known resource for evidence of
filtering can be relatively simple, given appropriate vantage points,
discovering previously unknown filtered web resources remains an open
challenge.
We present a new framework for automating the process of discovering filtered
resources through the use of adaptive queries to well-known search engines. Our
system applies information retrieval algorithms to isolate characteristic
linguistic patterns in known filtered web pages; these are then used as the
basis for web search queries. The results of these queries are then checked for
evidence of filtering, and newly discovered filtered resources are fed back
into the system to detect further filtered content.
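The feedback loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `top_terms` uses TF-IDF as an assumed choice of information-retrieval algorithm, and `search` and `check_filtered` are hypothetical stand-ins for the search-engine API and the filtering test.

```python
import math
from collections import Counter

def top_terms(pages, k=3):
    """Score terms by TF-IDF across known filtered pages and return the
    k highest-scoring terms of the first page as a search query."""
    n = len(pages)
    df = Counter()                      # document frequency per term
    for page in pages:
        df.update(set(page.split()))
    tf = Counter(pages[0].split())      # term frequency in one filtered page
    scores = {t: tf[t] * math.log(n / df[t]) for t in tf}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def discover(seed_pages, search, check_filtered, rounds=2):
    """Feedback loop: derive queries from filtered pages, search, test the
    results for filtering, and feed newly found pages back into the pool."""
    known = list(seed_pages)
    found = set()
    for _ in range(rounds):
        query = " ".join(top_terms(known))
        for url, text in search(query):
            if url not in found and check_filtered(url):
                found.add(url)
                known.append(text)      # newly found page seeds later queries
    return found
```

The key design point is the feedback edge: each confirmed-filtered page enlarges the corpus from which the next round's query terms are drawn.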
Our implementation of this framework, applied to China as a case study, shows
that this approach is demonstrably effective at detecting significant numbers
of previously unknown filtered web pages, making a significant contribution to
the ongoing detection of internet filtering as it develops.
Our tool is currently deployed and has been used to discover 1355 domains
that are poisoned within China as of Feb 2017, 30 times more than are
contained in the most widely-used public filter list. Of these, 759 are outside
of the Alexa Top 1000 domains list, demonstrating the capability of this
framework to find more obscure filtered content. Further, our initial analysis
of filtered URLs, and the search terms that were used to discover them, gives
further insight into the nature of the content currently being blocked in
China.
Comment: To appear in "Network Traffic Measurement and Analysis Conference
2017" (TMA2017).
Automated Discovery of Internet Censorship by Web Crawling
Censorship of the Internet is widespread around the world. As access to the
web becomes increasingly ubiquitous, filtering of this resource becomes more
pervasive. Transparency about specific content that citizens are denied access
to is atypical. To counter this, numerous techniques for maintaining URL filter
lists have been proposed by various individuals and organisations that aim to
provide empirical data on censorship for the benefit of the public and the
wider censorship research community.
We present a new approach for discovering filtered domains in different
countries. This method is fully automated and requires no human interaction.
The system uses web crawling techniques to traverse between filtered sites and
implements a robust method for determining if a domain is filtered. We
demonstrate the effectiveness of the approach by running experiments to search
for filtered content in four different censorship regimes. Our results show
that we perform better than the current state of the art and have built domain
filter lists an order of magnitude larger than the most widely available public
lists as of Jan 2018. Further, we build a dataset mapping the interlinking
nature of blocked content between domains and exhibit the tightly networked
nature of censored web resources.
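The crawl-and-test traversal described above can be sketched as a breadth-first search over the link graph. This is an illustrative sketch only: `fetch_links` and `is_filtered` are hypothetical stand-ins for the crawler's link extraction and for the paper's (unspecified here) robust filtering test.

```python
from collections import deque

def crawl_filtered(seeds, fetch_links, is_filtered, max_domains=100):
    """Breadth-first traversal between sites, keeping the domains the
    filtering test flags plus the link graph connecting them."""
    queue = deque(seeds)
    seen = set(seeds)
    blocked, edges = set(), []
    while queue and len(blocked) < max_domains:
        domain = queue.popleft()
        if not is_filtered(domain):     # only expand confirmed-filtered sites
            continue
        blocked.add(domain)
        for link in fetch_links(domain):
            edges.append((domain, link))   # record interlinking for the dataset
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return blocked, edges
```

Returning the edge list alongside the blocked set mirrors the abstract's second output: a dataset mapping how blocked domains interlink.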
Encore: Lightweight Measurement of Web Censorship with Cross-Origin Requests
Despite the pervasiveness of Internet censorship, we have scant data on its
extent, mechanisms, and evolution. Measuring censorship is challenging: it
requires continual measurement of reachability to many target sites from
diverse vantage points. Amassing suitable vantage points for longitudinal
measurement is difficult; existing systems have achieved only small,
short-lived deployments. We observe, however, that most Internet users access
content via Web browsers, and the very nature of Web site design allows
browsers to make requests to domains with different origins than the main Web
page. We present Encore, a system that harnesses cross-origin requests to
measure Web filtering from a diverse set of vantage points without requiring
users to install custom software, enabling longitudinal measurements from many
vantage points. We explain how Encore induces Web clients to perform
cross-origin requests that measure Web filtering, design a distributed platform
for scheduling and collecting these measurements, show the feasibility of a
global-scale deployment with a pilot study and an analysis of potentially
censored Web content, identify several cases of filtering in six months of
measurements, and discuss ethical concerns that would arise with widespread
deployment.
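The core mechanism, inducing a browser to attempt a cross-origin fetch and report the outcome, can be sketched as a generator for the embedded fragment. This is a simplified illustration, not Encore's actual task format: `target_url` and `report_url` are hypothetical, and real reachability inference must account for caching, timeouts, and origins that legitimately fail.

```python
def encore_snippet(target_url, report_url):
    """Build an HTML/JS fragment a cooperating page could embed: the
    visitor's browser attempts a cross-origin image load of the target,
    and onload/onerror beacons the outcome back to a collector."""
    return (
        "<script>\n"
        "var img = new Image();\n"
        f'img.onload = function () {{ new Image().src = "{report_url}?r=ok"; }};\n'
        f'img.onerror = function () {{ new Image().src = "{report_url}?r=err"; }};\n'
        f'img.src = "{target_url}";\n'
        "</script>"
    )
```

Because image loads are exempt from the same-origin policy's read restrictions, the load/error distinction leaks coarse reachability without any software installed on the client.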
Ethics and Internet Measurements
Over the past decade the Internet has changed from a helpful tool to an important part of our daily lives for most of the world’s population. Where in the past the Internet mostly served to look up and exchange information, it is now used to stay in touch with friends, perform financial transactions or exchange other kinds of sensitive information. This development impacts researchers performing Internet measurements, as the data traffic they collect is now much more likely to have some impact on users. Traditional institutions such as Institutional Review Boards (IRBs) or Ethics Committees are not always equipped to perform a thorough review or gauge the impact of Internet measurement studies. This paper examines the impact of this development for Internet measurements and analyses previous cases where Internet measurements have touched upon ethical issues. The paper proposes an early framework to help researchers identify stakeholders and how a network study may impact them. In addition to this, the paper provides advice on creating measurement practices that incorporate ethics by design, and also considers the role of third-party data suppliers in ethical measurement practices
Understanding the Impact of Encrypted DNS on Internet Censorship
DNS traffic is transmitted in plaintext, resulting in privacy leakage. To combat this problem, secure protocols have been used to encrypt DNS messages. Existing studies have investigated the performance overhead and privacy benefits of encrypted DNS communications, yet little has been done from the perspective of censorship. In this paper, we study the impact of the encrypted DNS on Internet censorship in two aspects. On one hand, we explore the severity of DNS manipulation, which could be leveraged for Internet censorship, given the use of encrypted DNS resolvers. In particular, we perform 7.4 million DNS lookup measurements on 3,813 DoT and 75 DoH resolvers and identify that 1.66% of DoT responses and 1.42% of DoH responses undergo DNS manipulation. More importantly, we observe that more than two-thirds of the DoT and DoH resolvers manipulate DNS responses from at least one domain, indicating that the DNS manipulation is prevalent in encrypted DNS, which can be further exploited for enhancing Internet censorship. On the other hand, we evaluate the effectiveness of using encrypted DNS resolvers for censorship circumvention. Specifically, we first discover those vantage points that involve DNS manipulation through on-path devices, and then we apply encrypted DNS resolvers at these vantage points to access the censored domains. We reveal that 37% of the domains are accessible from the vantage points in China, but none of the domains is accessible from the vantage points in Iran, indicating that the censorship circumvention of using encrypted DNS resolvers varies from country to country. Moreover, for a vantage point, using a different encrypted DNS resolver does not lead to a noticeable difference in accessing the censored domains
Impact of Geo-distribution and Mining Pools on Blockchains: A Study of Ethereum
Given the large adoption and economical impact of permissionless blockchains,
the complexity of the underlying systems and the adversarial environment in
which they operate, it is fundamental to properly study and understand the
emergent behavior and properties of these systems. We describe our experience
on a detailed, one-month study of the Ethereum network from several
geographically dispersed observation points. We leverage multiple geographic
vantage points to assess the key pillars of Ethereum, namely geographical
dispersion, network efficiency, blockchain efficiency and security, and the
impact of mining pools. Among other new findings, we identify previously
undocumented forms of selfish behavior and show that the prevalence of powerful
mining pools exacerbates the geographical impact on block propagation delays.
Furthermore, we provide a set of open measurement and processing tools, as well
as the data set of the collected measurements, in order to promote further
research on understanding permissionless blockchains.
Comment: To appear in 50th IEEE/IFIP International Conference on Dependable
Systems and Networks (DSN), 202
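One measurement the abstract leans on, block propagation delay across geographic vantage points, can be sketched from first-seen timestamps. This is an illustrative computation under an assumed input shape, not the paper's processing tools.

```python
from statistics import median

def propagation_delays(observations):
    """observations: {block_hash: {vantage_point: first_seen_timestamp}}.
    A block's delay at a vantage point is its first-seen time minus the
    earliest first-seen time anywhere; return the median delay per block."""
    delays = {}
    for block, seen in observations.items():
        first = min(seen.values())                     # earliest observation
        delays[block] = median(t - first for t in seen.values())
    return delays
```

Aggregating such per-block medians by the miner's pool is one way the geographic impact of powerful pools on propagation could then be surfaced.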
Corporate Social Responsibility: an honest duplicity
Business activity - which is dominated by corporations - through the provision of investment, jobs and tax payments, is central to the provision and protection of Human Rights. Simultaneously, there is copious evidence that business activity is a direct source of Human Rights violations and undermines numerous States ability to protect and provide Human Rights. Hence governments face a tension between encouraging investment and asserting authority over business activity to limit corporate excess and ensure business works for rather than against humanity. The challenge in the globalised world is how governments can best assert that authority. This essay will contend that a voluntary approach through Corporate Social Responsibility is currently the dominant approach to limiting corporate excess but will argue this approach is fundamentally flawed and cannot be relied upon to protect and enhance the provision of Human Rights
When users control the algorithms: Values expressed in practices on the Twitter platform
Recent interest in ethical AI has brought a slew of values, including fairness, into conversations about technology design. Research in the area of algorithmic fairness tends to be rooted in questions of distribution that can be subject to precise formalism and technical implementation. We seek to expand this conversation to include the experiences of people subject to algorithmic classification and decision-making. By examining tweets about the “Twitter algorithm” we consider the wide range of concerns and desires Twitter users express. We find a concern with fairness (narrowly construed) is present, particularly in the ways users complain that the platform enacts a political bias against conservatives. However, we find another important category of concern, evident in attempts to exert control over the algorithm. Twitter users who seek control do so for a variety of reasons, many well justified. We argue for the need for better and clearer definitions of what constitutes legitimate and illegitimate control over algorithmic processes and to consider support for users who wish to enact their own collective choices