33,072 research outputs found

    A Comparison of Blocking Methods for Record Linkage

    Full text link
    Record linkage seeks to merge databases and to remove duplicates when unique identifiers are not available. Most approaches use blocking techniques to reduce the computational complexity associated with record linkage. We review traditional blocking techniques, which typically partition the records according to a set of field attributes, and consider two variants of a method known as locality sensitive hashing, sometimes referred to as "private blocking." We compare these approaches in terms of their recall, reduction ratio, and computational complexity. We evaluate these methods using different synthetic datafiles and conclude with a discussion of privacy-related issues.Comment: 22 pages, 2 tables, 7 figure

    Web Tracking: Mechanisms, Implications, and Defenses

    Get PDF
    This articles surveys the existing literature on the methods currently used by web services to track the user online as well as their purposes, implications, and possible user's defenses. A significant majority of reviewed articles and web resources are from years 2012-2014. Privacy seems to be the Achilles' heel of today's web. Web services make continuous efforts to obtain as much information as they can about the things we search, the sites we visit, the people with who we contact, and the products we buy. Tracking is usually performed for commercial purposes. We present 5 main groups of methods used for user tracking, which are based on sessions, client storage, client cache, fingerprinting, or yet other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they are usually very rich in terms of using various creative methodologies. We also show how the users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is being used and its possible implications for the users (price discrimination, assessing financial credibility, determining insurance coverage, government surveillance, and identity theft). For each of the tracking methods, we present possible defenses. Apart from describing the methods and tools used for keeping the personal data away from being tracked, we also present several tools that were used for research purposes - their main goal is to discover how and by which entity the users are being tracked on their desktop computers or smartphones, provide this information to the users, and visualize it in an accessible and easy to follow way. Finally, we present the currently proposed future approaches to track the user and show that they can potentially pose significant threats to the users' privacy.Comment: 29 pages, 212 reference

    An Empirical Study of the I2P Anonymity Network and its Censorship Resistance

    Full text link
    Tor and I2P are well-known anonymity networks used by many individuals to protect their online privacy and anonymity. Tor's centralized directory services facilitate the understanding of the Tor network, as well as the measurement and visualization of its structure through the Tor Metrics project. In contrast, I2P does not rely on centralized directory servers, and thus obtaining a complete view of the network is challenging. In this work, we conduct an empirical study of the I2P network, in which we measure properties including population, churn rate, router type, and the geographic distribution of I2P peers. We find that there are currently around 32K active I2P peers in the network on a daily basis. Of these peers, 14K are located behind NAT or firewalls. Using the collected network data, we examine the blocking resistance of I2P against a censor that wants to prevent access to I2P using address-based blocking techniques. Despite the decentralized characteristics of I2P, we discover that a censor can block more than 95% of peer IP addresses known by a stable I2P client by operating only 10 routers in the network. This amounts to severe network impairment: a blocking rate of more than 70% is enough to cause significant latency in web browsing activities, while blocking more than 90% of peer IP addresses can make the network unusable. Finally, we discuss the security consequences of the network being blocked, and directions for potential approaches to make I2P more resistant to blocking.Comment: 14 pages, To appear in the 2018 Internet Measurement Conference (IMC'18

    Internet Censorship: An Integrative Review of Technologies Employed to Limit Access to the Internet, Monitor User Actions, and their Effects on Culture

    Get PDF
    The following conducts an integrative review of the current state of Internet Censorship in China, Iran, and Russia, highlights common circumvention technologies (CTs), and analyzes the effects Internet Censorship has on cultures. The author spends a large majority of the paper delineating China’s Internet infrastructure and prevalent Internet Censorship Technologies/Techniques (ICTs), paying particular attention to how the ICTs function at a technical level. The author further analyzes the state of Internet Censorship in both Iran and Russia from a broader perspective to give a better understanding of Internet Censorship around the globe. The author also highlights specific CTs, explaining how they function at a technical level. Findings indicate that among all three nation-states, state control of Internet Service Providers is the backbone of Internet Censorship. Specifically, within China, it is discovered that the infrastructure functions as an Intranet, thereby creating a closed system. Further, BGP Hijacking, DNS Poisoning, and TCP RST attacks are analyzed to understand their use-case within China. It is found that Iran functions much like a weaker version of China in regards to ICTs, with the state seemingly using the ICT of Bandwidth Throttling rather consistently. Russia’s approach to Internet censorship, in stark contrast to Iran and China, is found to rely mostly on the legislative system and fear to implement censorship, though their technical level of ICT implementation grows daily. TOR, VPNs, and Proxy Servers are all analyzed and found to be robust CTs. Drawing primarily from the examples given throughout the paper, the author highlights the various effects of Internet Censorship on culture – noting that at its core, Internet Censorship destroys democracy

    OnionBots: Subverting Privacy Infrastructure for Cyber Attacks

    Full text link
    Over the last decade botnets survived by adopting a sequence of increasingly sophisticated strategies to evade detection and take overs, and to monetize their infrastructure. At the same time, the success of privacy infrastructures such as Tor opened the door to illegal activities, including botnets, ransomware, and a marketplace for drugs and contraband. We contend that the next waves of botnets will extensively subvert privacy infrastructure and cryptographic mechanisms. In this work we propose to preemptively investigate the design and mitigation of such botnets. We first, introduce OnionBots, what we believe will be the next generation of resilient, stealthy botnets. OnionBots use privacy infrastructures for cyber attacks by completely decoupling their operation from the infected host IP address and by carrying traffic that does not leak information about its source, destination, and nature. Such bots live symbiotically within the privacy infrastructures to evade detection, measurement, scale estimation, observation, and in general all IP-based current mitigation techniques. Furthermore, we show that with an adequate self-healing network maintenance scheme, that is simple to implement, OnionBots achieve a low diameter and a low degree and are robust to partitioning under node deletions. We developed a mitigation technique, called SOAP, that neutralizes the nodes of the basic OnionBots. We also outline and discuss a set of techniques that can enable subsequent waves of Super OnionBots. In light of the potential of such botnets, we believe that the research community should proactively develop detection and mitigation methods to thwart OnionBots, potentially making adjustments to privacy infrastructure.Comment: 12 pages, 8 figure

    Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments

    Get PDF
    Decentralized systems are a subset of distributed systems where multiple authorities control different components and no authority is fully trusted by all. This implies that any component in a decentralized system is potentially adversarial. We revise fifteen years of research on decentralization and privacy, and provide an overview of key systems, as well as key insights for designers of future systems. We show that decentralized designs can enhance privacy, integrity, and availability but also require careful trade-offs in terms of system complexity, properties provided, and degree of decentralization. These trade-offs need to be understood and navigated by designers. We argue that a combination of insights from cryptography, distributed systems, and mechanism design, aligned with the development of adequate incentives, are necessary to build scalable and successful privacy-preserving decentralized systems
    • …
    corecore