516 research outputs found
Information Leakage Attacks and Countermeasures
The scientific community has been consistently working on the pervasive problem of information leakage, uncovering numerous attack vectors, and proposing various countermeasures. Despite these efforts, leakage incidents remain prevalent, as the complexity of systems and protocols increases, and sophisticated modeling methods become more accessible to adversaries. This work studies how information leakages manifest in and impact interconnected systems and their users. We first focus on online communications and investigate leakages in the Transport Layer Security protocol (TLS). Using modern machine learning models, we show that an eavesdropping adversary can efficiently exploit meta-information (e.g., packet size) not protected by the TLS’ encryption to launch fingerprinting attacks at an unprecedented scale even under non-optimal conditions. We then turn our attention to ultrasonic communications, and discuss their security shortcomings and how adversaries could exploit them to compromise anonymity network users (even though they aim to offer a greater level of privacy compared to TLS). Following up on these, we delve into physical layer leakages that concern a wide array of (networked) systems such as servers, embedded nodes, Tor relays, and hardware cryptocurrency wallets. We revisit location-based side-channel attacks and develop an exploitation neural network. Our model demonstrates the capabilities of a modern adversary but also presents an inexpensive tool to be used by auditors for detecting such leakages early on during the development cycle. Subsequently, we investigate techniques that further minimize the impact of leakages found in production components. Our proposed system design distributes both the custody of secrets and the cryptographic operation execution across several components, thus making the exploitation of leaks difficult
Browser fingerprinting: how to protect machine learning models and data with differential privacy?
As modern communication networks grow more and more complex, manually maintaining an overview of deployed soft- and hardware is challenging. Mechanisms such as fingerprinting are utilized to automatically extract information from ongoing network traffic and map this to a specific device or application, e.g., a browser. Active approaches directly interfere with the traffic and impose security risks or are simply infeasible. Therefore, passive approaches are employed, which only monitor traffic but require a well-designed feature set since less information is available. However, even these passive approaches impose privacy risks. Browser identification from encrypted traffic may lead to data leakage, e.g., the browser history of users. We propose a passive browser fingerprinting method based on explainable features and evaluate two privacy protection mechanisms, namely differentially private classifiers and differentially private data generation. With a differentially private Random Decision Forest, we achieve an accuracy of 0.877. If we train a non-private Random Forest on differentially private synthetic data, we reach an accuracy up to 0.887, showing a reasonable trade-off between utility and privacy
Browser Fingerprinting: How to Protect Machine Learning Models and Data with Differential Privacy?
As modern communication networks grow more and more complex, manually maintaining an overview of deployed soft- and hardware is challenging. Mechanisms such as fingerprinting are utilized to automatically extract information from ongoing network traffic and map this to a specific device or application, e.g., a browser. Active approaches directly interfere with the traffic and impose security risks or are simply infeasible. Therefore, passive approaches are employed, which only monitor traffic but require a well-designed feature set since less information is available. However, even these passive approaches impose privacy risks. Browser identification from encrypted traffic may lead to data leakage, e.g., the browser history of users. We propose a passive browser fingerprinting method based on explainable features and evaluate two privacy protection mechanisms, namely differentially private classifiers and differentially private data generation. With a differentially private Random Decision Forest, we achieve an accuracy of 0.877. If we train a non-private Random Forest on differentially private synthetic data, we reach an accuracy up to 0.887, showing a reasonable trade-off between utility and privacy
Privacy Analysis of Online and Offline Systems
How to protect people's privacy when our life are banded together with smart devices online and offline? For offline systems like smartphones, we often have a passcode to prevent others accessing to our personal data. Shoulder-surfing attacks to predict the passcode by humans are shown to not be accurate. We thus propose an automated algorithm to accurately predict the passcode entered by a victim on her smartphone by recording the video. Our proposed algorithm is able to predict over 92% of numbers entered in fewer than 75 seconds with training performed once.For online systems like surfing on Internet, anonymous communications networks like Tor can help encrypting the traffic data to reduce the possibility of losing our privacy. Each Tor client telescopically builds a circuit by choosing three Tor relays and then uses that circuit to connect to a server. The Tor relay selection algorithm makes sure that no two relays with the same /16 IP address or Autonomous System (AS) are chosen. Our objective is to determine the popularity of Tor relays when building circuits. With over 44 vantage points and over 145,000 circuits built, we found that some Tor relays are chosen more often than others. Although a completely balanced selection algorithm is not possible, analysis of our dataset shows that some Tor relays are over 3 times more likely to be chosen than others. An adversary could potentially eavesdrop or correlate more Tor traffic.Further more, the effectiveness of website fingerprinting (WF) has been shown to have an accuracy of over 90% when using Tor as the anonymity network. The common assumption in previous work is that a victim is visiting one website at a time and has access to the complete network trace of that website. Our main concern about website fingerprinting is its practicality. Victims could visit another website in the middle of visiting one website (overlapping visits). Or an adversary may only get an incomplete network traffic trace. When two website visits are overlapping, the website fingerprinting accuracy falls dramatically. Using our proposed "sectioning" algorithm, the accuracy for predicting the website in overlapping visits improves from 22.80% to 70%. When part of the network trace is missing (either the beginning or the end), the accuracy when using our sectioning algorithm increases from 20% to over 60%
Recommended from our members
TOWARDS RELIABLE CIRCUMVENTION OF INTERNET CENSORSHIP
The Internet plays a crucial role in today\u27s social and political movements by facilitating the free circulation of speech, information, and ideas; democracy and human rights throughout the world critically depend on preserving and bolstering the Internet\u27s openness. Consequently, repressive regimes, totalitarian governments, and corrupt corporations regulate, monitor, and restrict the access to the Internet, which is broadly known as Internet \emph{censorship}. Most countries are improving the internet infrastructures, as a result they can implement more advanced censoring techniques. Also with the advancements in the application of machine learning techniques for network traffic analysis have enabled the more sophisticated Internet censorship. In this thesis, We take a close look at the main pillars of internet censorship, we will introduce new defense and attacks in the internet censorship literature.
Internet censorship techniques investigate users’ communications and they can decide to interrupt a connection to prevent a user from communicating with a specific entity. Traffic analysis is one of the main techniques used to infer information from internet communications. One of the major challenges to traffic analysis mechanisms is scaling the techniques to today\u27s exploding volumes of network traffic, i.e., they impose high storage, communications, and computation overheads. We aim at addressing this scalability issue by introducing a new direction for traffic analysis, which we call \emph{compressive traffic analysis}. Moreover, we show that, unfortunately, traffic analysis attacks can be conducted on Anonymity systems with drastically higher accuracies than before by leveraging emerging learning mechanisms. We particularly design a system, called \deepcorr, that outperforms the state-of-the-art by significant margins in correlating network connections. \deepcorr leverages an advanced deep learning architecture to \emph{learn} a flow correlation function tailored to complex networks. Also to be able to analyze the weakness of such approaches we show that an adversary can defeat deep neural network based traffic analysis techniques by applying statistically undetectable \emph{adversarial perturbations} on the patterns of live network traffic.
We also design techniques to circumvent internet censorship. Decoy routing is an emerging approach for censorship circumvention in which circumvention is implemented with help from a number of volunteer Internet autonomous systems, called decoy ASes. We propose a new architecture for decoy routing that, by design, is significantly stronger to rerouting attacks compared to \emph{all} previous designs. Unlike previous designs, our new architecture operates decoy routers only on the downstream traffic of the censored users; therefore we call it \emph{downstream-only} decoy routing. As we demonstrate through Internet-scale BGP simulations, downstream-only decoy routing offers significantly stronger resistance to rerouting attacks, which is intuitively because a (censoring) ISP has much less control on the downstream BGP routes of its traffic. Then, we propose to use game theoretic approaches to model the arms races between the censors and the censorship circumvention tools. This will allow us to analyze the effect of different parameters or censoring behaviors on the performance of censorship circumvention tools. We apply our methods on two fundamental problems in internet censorship.
Finally, to bring our ideas to practice, we designed a new censorship circumvention tool called \name. \name aims at increasing the collateral damage of censorship by employing a ``mass\u27\u27 of normal Internet users, from both censored and uncensored areas, to serve as circumvention proxies
Towards More Effective Traffic Analysis in the Tor Network.
University of Minnesota Ph.D. dissertation. February 2021. Major: Computer Science. Advisor: Nicholas Hopper. 1 computer file (PDF); xiii, 161 pages.Tor is perhaps the most well-known anonymous network, used by millions of daily users to hide their sensitive internet activities from servers, ISPs, and potentially, nation-state adversaries. Tor provides low-latency anonymity by routing traffic through a series of relays using layered encryption to prevent any single entity from learning the source and destination of a connection through the content alone. Nevertheless, in low-latency anonymity networks, the timing and volume of traffic sent between the network and end systems (clients and servers) can be used for traffic analysis. For example, recent work applying traffic analysis to Tor has focused on website fingerprinting, which can allow an attacker to identify which website a client has downloaded based on the traffic between the client and the entry relay. Along with website fingerprinting, end-to-end flow correlation attacks have been recognized as the core traffic analysis in Tor. This attack assumes that an adversary observes traffic flows entering the network (Tor flow) and leaving the network (exit flow) and attempts to correlate these flows by pairing each user with a likely destination. The research in this thesis explores the extent to which the traffic analysis technique can be applied to more sophisticated fingerprinting scenarios using state-of-the-art machine-learning algorithms and deep learning techniques. The thesis breaks down four research problems. First, the applicability of machine-learning-based website fingerprinting is examined to a search query keyword fingerprinting and improve the applicability by discovering new features. Second, a variety of fingerprinting applications are introduced using deep-learning-based website fingerprinting. Third, the work presents data-limited fingerprinting by leveraging a generative deep-learning technique called a generative adversarial network that can be optimized in scenarios with limited amounts of training data. Lastly, a novel deep-learning architecture and training strategy are proposed to extract features of highly correlated Tor and exit flow pairs, which will reduce the number of false positives between pairs of flows
FP-Redemption: Studying Browser Fingerprinting Adoption for the Sake of Web Security
International audienceBrowser fingerprinting has established itself as a stateless technique to identify users on the Web. In particular, it is a highly criticized technique to track users. However, we believe that this identification technique can serve more virtuous purposes, such as bot detection or multi-factor authentication. In this paper, we explore the adoption of browser fingerprinting for security-oriented purposes. More specifically, we study 4 types of web pages that require security mechanisms to process user data: sign-up, sign-in, basket and payment pages. We visited 1, 485 pages on 446 domains and we identified the acquisition of browser fingerprints from 405 pages. By using an existing classification technique, we identified 169 distinct browser fingerprinting scripts included in these pages. By investigating the origins of the browser fingerprinting scripts, we identified 12 security-oriented organizations who collect browser fingerprints on sign-up, sign-in, and payment pages. Finally, we assess the effectiveness of browser fingerprinting against two potential attacks, namely stolen credentials and cookie hijacking. We observe browser fingerprinting being successfully used to enhance web security
The Price of Privacy - An Evaluation of the Economic Value of Collecting Clickstream Data
The analysis of clickstream data facilitates the understanding and prediction of customer behavior in e-commerce. Companies can leverage such data to increase revenue. For customers and website users, on the other hand, the collection of behavioral data entails privacy invasion. The objective of the paper is to shed light on the trade-off between privacy and the business value of cus- tomer information. To that end, the authors review approaches to convert clickstream data into behavioral traits, which we call clickstream features, and propose a categorization of these features according to the potential threat they pose to user privacy. The authors then examine the extent to which different categories of clickstream features facilitate predictions of online user shopping pat- terns and approximate the marginal utility of using more privacy adverse information in behavioral prediction models. Thus, the paper links the literature on user privacy to that on e-commerce analytics and takes a step toward an economic analysis of privacy costs and benefits. In par- ticular, the results of empirical experimentation with large real-world e-commerce data suggest that the inclusion of short-term customer behavior based on session-related information leads to large gains in predictive accuracy and business performance, while storing and aggregating usage behavior over longer horizons has comparably less value
- …