Towards accurate detection of obfuscated web tracking
Web tracking is currently recognized as one of the most important privacy threats on the Internet. In recent years, many methodologies have been developed to uncover web trackers. Most of them are based on static code analysis and the use of predefined blacklists. However, our main hypothesis is that web tracking has started to use obfuscated programming, a transformation of code that renders previous detection methodologies ineffective and easy to evade. In this paper, we propose a new methodology based on dynamic code analysis that monitors the actual JavaScript calls made by the browser and compares them to the original source code of the website in order to detect obfuscated tracking. The main advantage of this approach is that detection cannot be evaded by code obfuscation. We applied this methodology to detect the use of canvas-font tracking and canvas fingerprinting on the top-10K most visited websites according to Alexa's ranking. Canvas-based tracking is a fingerprinting method based on JavaScript that uses the HTML5 canvas element to uniquely identify a user. Our results show that 10.44% of the top-10K websites use canvas-based tracking (canvas-font and canvas fingerprinting), while obfuscation was used in 2.25% of them. These results confirm our initial hypothesis that obfuscated programming in web tracking is already in use. Finally, we argue that canvas-based tracking may be more prevalent in secondary pages than on the home page of websites. Peer Reviewed. Postprint (author's final draft)
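The comparison step described in this abstract (calls observed at run time versus names visible in the page source) can be illustrated with a minimal sketch. It is not the authors' implementation: the watch list `CANVAS_APIS`, the function `classify_calls`, and the plain substring check are all assumptions made for illustration, and a real system would obtain the observed calls by instrumenting the browser.

```python
from typing import Dict, Iterable

# Hypothetical watch list of canvas-related API names (an assumption,
# not the list used in the paper).
CANVAS_APIS = ("toDataURL", "getImageData", "measureText", "fillText")


def classify_calls(observed_calls: Iterable[str], source_code: str) -> Dict[str, str]:
    """Label each observed canvas call as 'plain' or 'obfuscated'.

    A call is labelled 'obfuscated' when it was seen at run time but its
    name does not appear literally in the static source text.
    """
    report: Dict[str, str] = {}
    for name in observed_calls:
        if name not in CANVAS_APIS:
            continue  # ignore calls outside the watch list
        report[name] = "plain" if name in source_code else "obfuscated"
    return report


# Toy page source: toDataURL is assembled from string fragments,
# while fillText appears literally.
source = (
    'var c = document.createElement("canvas");'
    ' c["to" + "Data" + "URL"]();'
    ' ctx.fillText("abc", 2, 2);'
)
observed = ["createElement", "toDataURL", "fillText"]
report = classify_calls(observed, source)
```

In this toy input, `toDataURL` is executed but never appears as a literal string in the source, which is the kind of mismatch a static scan or blacklist would miss.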
Invisible Pixels Are Dead, Long Live Invisible Pixels!
Privacy has deteriorated in the world wide web ever since the 1990s. The
tracking of browsing habits by different third parties has been at the center
of this deterioration. Web cookies and so-called web beacons have been the
classical ways to implement third-party tracking. Due to the introduction of
more sophisticated technical tracking solutions and other fundamental
transformations, classical image-based web beacons might be expected
to have lost their appeal. According to a sample of over thirty thousand images
collected from popular websites, this paper shows that such an assumption is a
fallacy: classical 1 x 1 images are still commonly used for third-party
tracking in the contemporary world wide web. While it seems that ad-blockers
are unable to fully block these classical image-based tracking beacons, the
paper further demonstrates that even limited information can be used to
accurately distinguish third-party 1 x 1 images from other images. An average
classification accuracy of 0.956 is reached in the empirical experiment. With
these results the paper contributes to the ongoing attempts to better
understand the lack of privacy in the world wide web, and the means by which
the situation might eventually be improved.
Comment: Forthcoming in the 17th Workshop on Privacy in the Electronic Society
(WPES 2018), Toronto, AC
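As a rough illustration of what "third-party 1 x 1 image" means in practice, the sketch below applies a simple rule-based filter. It is a heuristic stand-in, not the classifier evaluated in the paper: the function name `is_third_party_pixel` and the host-suffix test for "same site" are assumptions, and the suffix test is deliberately naive (it does not consult a public suffix list).

```python
from urllib.parse import urlparse


def is_third_party_pixel(page_url: str, image_url: str, width: int, height: int) -> bool:
    """Return True for a 1 x 1 image served from a host other than the page's.

    Assumption: an image is same-site if its host equals the page host or
    is a subdomain of it. Relative image URLs are treated as same-site.
    """
    page_host = urlparse(page_url).hostname or ""
    image_host = urlparse(image_url).hostname or ""
    if not image_host:
        return False  # relative URL, served by the page's own host
    same_site = image_host == page_host or image_host.endswith("." + page_host)
    return width == 1 and height == 1 and not same_site


# Example calls with made-up URLs.
flagged = is_third_party_pixel("https://news.example.com/a", "https://tracker.test/p.gif", 1, 1)
first_party = is_third_party_pixel("https://example.com/", "https://cdn.example.com/p.gif", 1, 1)
```

A filter like this only uses image dimensions and the host name; the abstract's point is that even such limited information goes a long way, although the reported accuracy refers to the paper's own experiment, not to this sketch.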