33,072 research outputs found
A Comparison of Blocking Methods for Record Linkage
Record linkage seeks to merge databases and to remove duplicates when unique
identifiers are not available. Most approaches use blocking techniques to
reduce the computational complexity associated with record linkage. We review
traditional blocking techniques, which typically partition the records
according to a set of field attributes, and consider two variants of a method
known as locality sensitive hashing, sometimes referred to as "private
blocking." We compare these approaches in terms of their recall, reduction
ratio, and computational complexity. We evaluate these methods using different
synthetic datafiles and conclude with a discussion of privacy-related issues.Comment: 22 pages, 2 tables, 7 figure
Web Tracking: Mechanisms, Implications, and Defenses
This articles surveys the existing literature on the methods currently used
by web services to track the user online as well as their purposes,
implications, and possible user's defenses. A significant majority of reviewed
articles and web resources are from years 2012-2014. Privacy seems to be the
Achilles' heel of today's web. Web services make continuous efforts to obtain
as much information as they can about the things we search, the sites we visit,
the people with who we contact, and the products we buy. Tracking is usually
performed for commercial purposes. We present 5 main groups of methods used for
user tracking, which are based on sessions, client storage, client cache,
fingerprinting, or yet other approaches. A special focus is placed on
mechanisms that use web caches, operational caches, and fingerprinting, as they
are usually very rich in terms of using various creative methodologies. We also
show how the users can be identified on the web and associated with their real
names, e-mail addresses, phone numbers, or even street addresses. We show why
tracking is being used and its possible implications for the users (price
discrimination, assessing financial credibility, determining insurance
coverage, government surveillance, and identity theft). For each of the
tracking methods, we present possible defenses. Apart from describing the
methods and tools used for keeping the personal data away from being tracked,
we also present several tools that were used for research purposes - their main
goal is to discover how and by which entity the users are being tracked on
their desktop computers or smartphones, provide this information to the users,
and visualize it in an accessible and easy to follow way. Finally, we present
the currently proposed future approaches to track the user and show that they
can potentially pose significant threats to the users' privacy.Comment: 29 pages, 212 reference
An Empirical Study of the I2P Anonymity Network and its Censorship Resistance
Tor and I2P are well-known anonymity networks used by many individuals to
protect their online privacy and anonymity. Tor's centralized directory
services facilitate the understanding of the Tor network, as well as the
measurement and visualization of its structure through the Tor Metrics project.
In contrast, I2P does not rely on centralized directory servers, and thus
obtaining a complete view of the network is challenging. In this work, we
conduct an empirical study of the I2P network, in which we measure properties
including population, churn rate, router type, and the geographic distribution
of I2P peers. We find that there are currently around 32K active I2P peers in
the network on a daily basis. Of these peers, 14K are located behind NAT or
firewalls.
Using the collected network data, we examine the blocking resistance of I2P
against a censor that wants to prevent access to I2P using address-based
blocking techniques. Despite the decentralized characteristics of I2P, we
discover that a censor can block more than 95% of peer IP addresses known by a
stable I2P client by operating only 10 routers in the network. This amounts to
severe network impairment: a blocking rate of more than 70% is enough to cause
significant latency in web browsing activities, while blocking more than 90% of
peer IP addresses can make the network unusable. Finally, we discuss the
security consequences of the network being blocked, and directions for
potential approaches to make I2P more resistant to blocking.Comment: 14 pages, To appear in the 2018 Internet Measurement Conference
(IMC'18
Internet Censorship: An Integrative Review of Technologies Employed to Limit Access to the Internet, Monitor User Actions, and their Effects on Culture
The following conducts an integrative review of the current state of Internet Censorship in China, Iran, and Russia, highlights common circumvention technologies (CTs), and analyzes the effects Internet Censorship has on cultures. The author spends a large majority of the paper delineating China’s Internet infrastructure and prevalent Internet Censorship Technologies/Techniques (ICTs), paying particular attention to how the ICTs function at a technical level. The author further analyzes the state of Internet Censorship in both Iran and Russia from a broader perspective to give a better understanding of Internet Censorship around the globe. The author also highlights specific CTs, explaining how they function at a technical level. Findings indicate that among all three nation-states, state control of Internet Service Providers is the backbone of Internet Censorship. Specifically, within China, it is discovered that the infrastructure functions as an Intranet, thereby creating a closed system. Further, BGP Hijacking, DNS Poisoning, and TCP RST attacks are analyzed to understand their use-case within China. It is found that Iran functions much like a weaker version of China in regards to ICTs, with the state seemingly using the ICT of Bandwidth Throttling rather consistently. Russia’s approach to Internet censorship, in stark contrast to Iran and China, is found to rely mostly on the legislative system and fear to implement censorship, though their technical level of ICT implementation grows daily. TOR, VPNs, and Proxy Servers are all analyzed and found to be robust CTs. Drawing primarily from the examples given throughout the paper, the author highlights the various effects of Internet Censorship on culture – noting that at its core, Internet Censorship destroys democracy
OnionBots: Subverting Privacy Infrastructure for Cyber Attacks
Over the last decade botnets survived by adopting a sequence of increasingly
sophisticated strategies to evade detection and take overs, and to monetize
their infrastructure. At the same time, the success of privacy infrastructures
such as Tor opened the door to illegal activities, including botnets,
ransomware, and a marketplace for drugs and contraband. We contend that the
next waves of botnets will extensively subvert privacy infrastructure and
cryptographic mechanisms. In this work we propose to preemptively investigate
the design and mitigation of such botnets. We first, introduce OnionBots, what
we believe will be the next generation of resilient, stealthy botnets.
OnionBots use privacy infrastructures for cyber attacks by completely
decoupling their operation from the infected host IP address and by carrying
traffic that does not leak information about its source, destination, and
nature. Such bots live symbiotically within the privacy infrastructures to
evade detection, measurement, scale estimation, observation, and in general all
IP-based current mitigation techniques. Furthermore, we show that with an
adequate self-healing network maintenance scheme, that is simple to implement,
OnionBots achieve a low diameter and a low degree and are robust to
partitioning under node deletions. We developed a mitigation technique, called
SOAP, that neutralizes the nodes of the basic OnionBots. We also outline and
discuss a set of techniques that can enable subsequent waves of Super
OnionBots. In light of the potential of such botnets, we believe that the
research community should proactively develop detection and mitigation methods
to thwart OnionBots, potentially making adjustments to privacy infrastructure.Comment: 12 pages, 8 figure
Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments
Decentralized systems are a subset of distributed systems where multiple
authorities control different components and no authority is fully trusted by
all. This implies that any component in a decentralized system is potentially
adversarial. We revise fifteen years of research on decentralization and
privacy, and provide an overview of key systems, as well as key insights for
designers of future systems. We show that decentralized designs can enhance
privacy, integrity, and availability but also require careful trade-offs in
terms of system complexity, properties provided, and degree of
decentralization. These trade-offs need to be understood and navigated by
designers. We argue that a combination of insights from cryptography,
distributed systems, and mechanism design, aligned with the development of
adequate incentives, are necessary to build scalable and successful
privacy-preserving decentralized systems
Recommended from our members
UPC++ v1.0 Specification, Revision 2020.3.0
UPC++ is a C++11 library providing classes and functions that support Partitioned Global Address Space (PGAS) programming. The key communication facilities in UPC++ are one-sided Remote Memory Access (RMA) and Remote Procedure Call (RPC). All communication operations are syntactically explicit and default to non-blocking; asynchrony is managed through the use of futures, promises and continuation callbacks, enabling the programmer to construct a graph of operations to execute asynchronously as high-latency dependencies are satisfied. A global pointer abstraction provides system-wide addressability of shared memory, including host and accelerator memories. The parallelism model is primarily process-based, but the interface is thread-safe and designed to allow efficient and expressive use in multi-threaded applications. The interface is designed for extreme scalability throughout, and deliberately avoids design features that could inhibit scalability
- …