Search CORE

33,072 research outputs found

A Comparison of Blocking Methods for Record Linkage

Author: A. Goldenberg
D. Vatsalan
H. Liang
L. Paulevé
M. Kuzu
P. Christen
P. Christen
P. Christen
R. Hall
S. Fortunato
T. Herzog
Publication venue
Publication date: 01/01/2014
Field of study

Record linkage seeks to merge databases and to remove duplicates when unique identifiers are not available. Most approaches use blocking techniques to reduce the computational complexity associated with record linkage. We review traditional blocking techniques, which typically partition the records according to a set of field attributes, and consider two variants of a method known as locality sensitive hashing, sometimes referred to as "private blocking." We compare these approaches in terms of their recall, reduction ratio, and computational complexity. We evaluate these methods using different synthetic datafiles and conclude with a discussion of privacy-related issues.Comment: 22 pages, 2 tables, 7 figure

arXiv.org e-Print Archive

Crossref

Web Tracking: Mechanisms, Implications, and Defenses

Author: Barlet-Ros Pere
Bujlow Tomasz
Carela-Español Valentín
Solé-Pareta Josep
Publication venue
Publication date: 01/01/2015
Field of study

This articles surveys the existing literature on the methods currently used by web services to track the user online as well as their purposes, implications, and possible user's defenses. A significant majority of reviewed articles and web resources are from years 2012-2014. Privacy seems to be the Achilles' heel of today's web. Web services make continuous efforts to obtain as much information as they can about the things we search, the sites we visit, the people with who we contact, and the products we buy. Tracking is usually performed for commercial purposes. We present 5 main groups of methods used for user tracking, which are based on sessions, client storage, client cache, fingerprinting, or yet other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they are usually very rich in terms of using various creative methodologies. We also show how the users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is being used and its possible implications for the users (price discrimination, assessing financial credibility, determining insurance coverage, government surveillance, and identity theft). For each of the tracking methods, we present possible defenses. Apart from describing the methods and tools used for keeping the personal data away from being tracked, we also present several tools that were used for research purposes - their main goal is to discover how and by which entity the users are being tracked on their desktop computers or smartphones, provide this information to the users, and visualize it in an accessible and easy to follow way. Finally, we present the currently proposed future approaches to track the user and show that they can potentially pose significant threats to the users' privacy.Comment: 29 pages, 212 reference

arXiv.org e-Print Archive

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

An Empirical Study of the I2P Anonymity Network and its Censorship Resistance

Author: Choffnes David
Conrad Bernd
Dingledine R.
Dingledine Roger
Dunna Arun
Fifield David
Grigg Jack
Lah Frederick
Nobori D
Project The Tor
Project The Tor
Schimmer Lars
Singh Rachee
Sun Yixin
Syverson P. F.
Tchabe Gildas Nya
Winter P
Zamani Mahdi
Zantout Bassam
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 25/09/2018
Field of study

Tor and I2P are well-known anonymity networks used by many individuals to protect their online privacy and anonymity. Tor's centralized directory services facilitate the understanding of the Tor network, as well as the measurement and visualization of its structure through the Tor Metrics project. In contrast, I2P does not rely on centralized directory servers, and thus obtaining a complete view of the network is challenging. In this work, we conduct an empirical study of the I2P network, in which we measure properties including population, churn rate, router type, and the geographic distribution of I2P peers. We find that there are currently around 32K active I2P peers in the network on a daily basis. Of these peers, 14K are located behind NAT or firewalls. Using the collected network data, we examine the blocking resistance of I2P against a censor that wants to prevent access to I2P using address-based blocking techniques. Despite the decentralized characteristics of I2P, we discover that a censor can block more than 95% of peer IP addresses known by a stable I2P client by operating only 10 routers in the network. This amounts to severe network impairment: a blocking rate of more than 70% is enough to cause significant latency in web browsing activities, while blocking more than 90% of peer IP addresses can make the network unusable. Finally, we discuss the security consequences of the network being blocked, and directions for potential approaches to make I2P more resistant to blocking.Comment: 14 pages, To appear in the 2018 Internet Measurement Conference (IMC'18

arXiv.org e-Print Archive

Crossref

Internet Censorship: An Integrative Review of Technologies Employed to Limit Access to the Internet, Monitor User Actions, and their Effects on Culture

Author: Hyland Joe
Publication venue: Scholars Crossing
Publication date: 24/04/2020
Field of study

The following conducts an integrative review of the current state of Internet Censorship in China, Iran, and Russia, highlights common circumvention technologies (CTs), and analyzes the effects Internet Censorship has on cultures. The author spends a large majority of the paper delineating China’s Internet infrastructure and prevalent Internet Censorship Technologies/Techniques (ICTs), paying particular attention to how the ICTs function at a technical level. The author further analyzes the state of Internet Censorship in both Iran and Russia from a broader perspective to give a better understanding of Internet Censorship around the globe. The author also highlights specific CTs, explaining how they function at a technical level. Findings indicate that among all three nation-states, state control of Internet Service Providers is the backbone of Internet Censorship. Specifically, within China, it is discovered that the infrastructure functions as an Intranet, thereby creating a closed system. Further, BGP Hijacking, DNS Poisoning, and TCP RST attacks are analyzed to understand their use-case within China. It is found that Iran functions much like a weaker version of China in regards to ICTs, with the state seemingly using the ICT of Bandwidth Throttling rather consistently. Russia’s approach to Internet censorship, in stark contrast to Iran and China, is found to rely mostly on the legislative system and fear to implement censorship, though their technical level of ICT implementation grows daily. TOR, VPNs, and Proxy Servers are all analyzed and found to be robust CTs. Drawing primarily from the examples given throughout the paper, the author highlights the various effects of Internet Censorship on culture – noting that at its core, Internet Censorship destroys democracy

Liberty University Digital Commons

OnionBots: Subverting Privacy Infrastructure for Cyber Attacks

Author: Noubir Guevara
Sanatinia Amirali
Publication venue
Publication date: 14/01/2015
Field of study

Over the last decade botnets survived by adopting a sequence of increasingly sophisticated strategies to evade detection and take overs, and to monetize their infrastructure. At the same time, the success of privacy infrastructures such as Tor opened the door to illegal activities, including botnets, ransomware, and a marketplace for drugs and contraband. We contend that the next waves of botnets will extensively subvert privacy infrastructure and cryptographic mechanisms. In this work we propose to preemptively investigate the design and mitigation of such botnets. We first, introduce OnionBots, what we believe will be the next generation of resilient, stealthy botnets. OnionBots use privacy infrastructures for cyber attacks by completely decoupling their operation from the infected host IP address and by carrying traffic that does not leak information about its source, destination, and nature. Such bots live symbiotically within the privacy infrastructures to evade detection, measurement, scale estimation, observation, and in general all IP-based current mitigation techniques. Furthermore, we show that with an adequate self-healing network maintenance scheme, that is simple to implement, OnionBots achieve a low diameter and a low degree and are robust to partitioning under node deletions. We developed a mitigation technique, called SOAP, that neutralizes the nodes of the basic OnionBots. We also outline and discuss a set of techniques that can enable subsequent waves of Super OnionBots. In light of the potential of such botnets, we believe that the research community should proactively develop detection and mitigation methods to thwart OnionBots, potentially making adjustments to privacy infrastructure.Comment: 12 pages, 8 figure

arXiv.org e-Print Archive

Crossref

Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments

Author: Danezis George
Halpin Harry
Isaakidis Marios
Troncoso Carmela
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 28/06/2017
Field of study

Decentralized systems are a subset of distributed systems where multiple authorities control different components and no authority is fully trusted by all. This implies that any component in a decentralized system is potentially adversarial. We revise fifteen years of research on decentralization and privacy, and provide an overview of key systems, as well as key insights for designers of future systems. We show that decentralized designs can enhance privacy, integrity, and availability but also require careful trade-offs in terms of system complexity, properties provided, and degree of decentralization. These trade-offs need to be understood and navigated by designers. We argue that a combination of insights from cryptography, distributed systems, and mechanism design, aligned with the development of adequate incentives, are necessary to build scalable and successful privacy-preserving decentralized systems

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

UCL Discovery

Recommended from our members

UPC++ v1.0 Specification, Revision 2020.3.0

Author: Bachan J
Bonachea Dan
Kamil A
Publication venue: eScholarship, University of California
Publication date: 12/03/2020
Field of study

UPC++ is a C++11 library providing classes and functions that support Partitioned Global Address Space (PGAS) programming. The key communication facilities in UPC++ are one-sided Remote Memory Access (RMA) and Remote Procedure Call (RPC). All communication operations are syntactically explicit and default to non-blocking; asynchrony is managed through the use of futures, promises and continuation callbacks, enabling the programmer to construct a graph of operations to execute asynchronously as high-latency dependencies are satisfied. A global pointer abstraction provides system-wide addressability of shared memory, including host and accelerator memories. The parallelism model is primarily process-based, but the interface is thread-safe and designed to allow efficient and expressive use in multi-threaded applications. The interface is designed for extreme scalability throughout, and deliberately avoids design features that could inhibit scalability

eScholarship - University of California