22 research outputs found
The Hitchhiker's Guide to Facebook Web Tracking with Invisible Pixels and Click IDs
Over the past years, advertisement companies have used various tracking
methods to persistently track users across the web. Such tracking methods
usually include first and third-party cookies, cookie synchronization, as well
as a variety of fingerprinting mechanisms. Facebook (FB) recently introduced a
new tagging mechanism that attaches a one-time tag as a URL parameter (FBCLID)
on outgoing links to other websites. Although such a tag does not seem to have
enough information to persistently track users, we demonstrate that despite its
ephemeral nature, when combined with FB Pixel, it can aid in persistently
monitoring user browsing behavior across i) different websites, ii) different
actions on each website, iii) time, i.e., both in the past as well as in the
future. We refer to this online monitoring of users as FB web tracking. We find
that FB Pixel tracks a wide range of user activities on websites with alarming
detail, especially on websites classified as sensitive categories under GDPR.
Also, we show how the FBCLID tag can be used to match, and thus de-anonymize,
activities of online users performed in the distant past (even before those
users had a FB account) tracked by FB Pixel. In fact, by combining this tag
with cookies that have rolling expiration dates, FB can also keep track of
users' browsing activities in the future as well. Our experimental results
suggest that 23% of the 10k most popular websites have adopted this technology,
and can contribute to this activity tracking on the web. Furthermore, our
longitudinal study shows that this type of user activity tracking can go as far
back as 2015. Simply put, if a user creates a FB account for the first time
today, FB could, under some conditions, match their anonymously collected past
web browsing activity to their newly created FB profile, from as far back as
2015, and continue tracking their activity in the future.
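The matching the abstract describes can be sketched minimally (the log records and field names below are hypothetical illustrations, not FB's actual formats): the one-time fbclid arrives as a URL parameter on the outgoing link, and a later Pixel event carrying the same tag lets the two records be joined.

```python
from urllib.parse import urlparse, parse_qs

def extract_fbclid(url):
    """Pull the one-time FBCLID tag out of an incoming-link URL."""
    qs = parse_qs(urlparse(url).query)
    return qs.get("fbclid", [None])[0]

# Hypothetical logs: the publisher observes the fbclid on arrival;
# a Pixel event later reports the same fbclid alongside FB's cookie.
publisher_log = [
    {"fbclid": "AbC123", "page": "clinic.example/appointments"},
]
pixel_log = [
    {"fbclid": "AbC123", "fb_cookie": "c_user=987654"},
]

# Joining on the ephemeral tag links the browsing record to the FB identity.
joined = [
    {**p, **x}
    for p in publisher_log
    for x in pixel_log
    if p["fbclid"] == x["fbclid"]
]
```

Because the persistent cookie on FB's side has a rolling expiration, a single such join can anchor both past and future activity to one profile.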
Modern Socio-Technical Perspectives on Privacy
This open access book provides researchers and professionals with a foundational understanding of online privacy as well as insight into the socio-technical privacy issues that are most pertinent to modern information systems, covering several modern topics (e.g., privacy in social media, IoT) and underexplored areas (e.g., privacy accessibility, privacy for vulnerable populations, cross-cultural privacy). The book is structured in four parts, which follow after an introduction to privacy on both a technical and social level: Privacy Theory and Methods covers a range of theoretical lenses through which one can view the concept of privacy. The chapters in this part relate to modern privacy phenomena, thus emphasizing its relevance to our digital, networked lives. Next, Domains covers a number of areas in which privacy concerns and implications are particularly salient, including among others social media, healthcare, smart cities, wearable IT, and trackers. The Audiences section then highlights audiences that have traditionally been ignored when creating privacy-preserving experiences: people from other (non-Western) cultures, people with accessibility needs, adolescents, and people who are underrepresented in terms of their race, class, gender or sexual identity, religion or some combination. Finally, the chapters in Moving Forward outline approaches to privacy that move beyond one-size-fits-all solutions, explore ethical considerations, and describe the regulatory landscape that governs privacy through laws and policies. Perhaps even more so than the other chapters in this book, these chapters are forward-looking by using current personalized, ethical and legal approaches as a starting point for re-conceptualizations of privacy to serve the modern technological landscape. 
The book's primary goal is to inform IT students, researchers, and professionals about both the fundamentals of online privacy and the issues that are most pertinent to modern information systems. Lecturers or teachers can assign (parts of) the book for a "professional issues" course. IT professionals may select chapters covering domains and audiences relevant to their field of work, as well as the Moving Forward chapters that cover ethical and legal aspects. Academics who are interested in studying privacy or privacy-related topics will find a broad introduction to both technical and social aspects.
Efficient Data Protection by Noising, Masking, and Metering
Protecting data secrecy is an important design goal of computing systems. Conventional techniques like access control mechanisms and cryptography are widely deployed, and yet security breaches and data leakages still occur. There are several challenges. First, the sensitivity of system data is not always easy to determine. Second, trustworthiness is not a constant property of the system components and users. Third, a system's functional requirements can be at odds with its data protection requirements. In this dissertation, we show that efficient data protection can be achieved by noising, masking, or metering sensitive data. Specifically, three practical problems are addressed in the dissertation: storage side-channel attacks in Linux, server anonymity violations in web sessions, and data theft by malicious insiders. To mitigate storage side-channel attacks, we introduce a differentially private system, dpprocfs, which injects noise into side-channel vectors and also reestablishes invariants on the noised outputs. Our evaluations show that dpprocfs mitigates known storage side channels while preserving the utility of the proc filesystem for monitoring and diagnosis. To enforce server anonymity, we introduce a cloud service, PoPSiCl, which masks server identifiers, including DNS names and IP addresses, with personalized pseudonyms. PoPSiCl can defend against both passive and active network attackers with minimal impact on web-browsing performance. To prevent data theft by insiders, we introduce a system, Snowman, which restricts the user to accessing data only remotely and accurately meters the sensitive data output to the user by conducting taint analysis in a replica of the application execution, without slowing the interactive user session.
Doctor of Philosophy
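The noising-plus-invariants idea behind dpprocfs can be illustrated with a minimal sketch (an assumption-laden toy, not the actual dpprocfs code): Laplace noise is added to a monotone value such as a proc counter, and the monotonicity invariant is then reestablished on the noised output so consumers never see the counter go backwards.

```python
import random

class NoisedCounter:
    """Differentially private view of a monotone counter, in the spirit of
    noising proc-filesystem values: add Laplace noise, then reestablish
    the invariant that reported values never decrease."""

    def __init__(self, epsilon, sensitivity=1.0):
        self.scale = sensitivity / epsilon  # Laplace scale b = sensitivity / epsilon
        self.last_reported = 0.0

    def _laplace(self):
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        return random.expovariate(1 / self.scale) - random.expovariate(1 / self.scale)

    def report(self, true_value):
        noised = true_value + self._laplace()
        # Invariant re-establishment: a monotone counter must not go backwards.
        self.last_reported = max(self.last_reported, noised)
        return self.last_reported
```

Clamping to the previous report is only one possible invariant; the real system reestablishes whichever consistency properties monitoring tools expect of the noised vectors.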
Privacy by (re)design: a comparative study of the protection of personal information in the mobile applications ecosystem under United States, European Union and South African law.
Doctoral Degree. University of KwaZulu-Natal, Durban.
The dissertation presents a comparative desktop study of the application of a Privacy by Design
(PbD) approach to the protection of personal information in the mobile applications ecosystem
under the Children's Online Privacy Protection Act (COPPA) and the California Consumer
Privacy Act (CCPA) in the United States, the General Data Protection Regulation (GDPR)
in the European Union, and the Protection of Personal Information Act (POPIA) in South
Africa.
The main problem considered in the thesis is whether there is an "accountability
gap" within the legislation selected for comparative study. This is analysed by examining
whether the legislation can be enforced against parties other than the app developer in the
mobile app ecosystem, as it is theorised that only on this basis will the underlying technologies
and architecture of mobile apps be changed to support a privacy by (re)design approach. The
key research question is what legal approach should be adopted to enforce privacy by
(re)design within the mobile apps ecosystem.
It describes the complexity of the mobile apps ecosystem, identifying the key
role players and the processing operations that take place.
It sets out what is encompassed by the conceptual framework of PbD, and why
the concept of privacy by (re)design may be more appropriate in the context of mobile apps
integrating third party services and products. It identifies the core data protection principles of
data minimisation and accountability, and the nature of informed consent, as being essential to
an effective PbD approach.
It concludes that without strengthening the legal obligations pertaining to the
sharing of personal information with third parties, neither regulatory guidance, as is preferred
in the United States, nor a direct legal obligation, as created by article 25 of the GDPR, is
adequate to enforce a PbD approach within the mobile apps ecosystem. It concludes that
although a PbD approach is implied for compliance by a responsible party with POPIA,
legislative reforms are necessary. It proposes amendments to POPIA to address inadequacies
in the requirements for notice, and to impose obligations on a responsible party in relation to
the sharing of personal information with third parties who will process the personal information
for further, separate purposes.
Location Data: Perils, Profits, Promise
Most of the modern online economy is based on websites offering free services and content in exchange for advertising access and user data. Web companies collect vast troves of data about their users in order to better target their advertisements. An important subset of this harvested data is the locations visited by users. Location data is valuable as it is a "real world" signal compared to online behaviors: a visit to a store is a stronger signal than a visit to a website, and location data can reveal user attributes that are interesting to advertisers. The collection of this data, however, raises many concerns. Location data can reveal important attributes that users may not wish to disclose: ZIP codes can reveal income and race, visits to places of worship may allow discrimination, and insurers may want to know about trips to hospitals. The risks exist at both an individual level, with location tied to physical safety, and at a collective level, with inference about group membership a necessary step towards discrimination. In this thesis, I examine issues of privacy and fairness in the use of location data. In the first portion, I empirically demonstrate new attacks on the anonymity and privacy of users, including a theoretical basis for user identification. In the second portion, I propose and analyze new solutions for dealing with privacy, anonymity, and fairness in the collection and use of location data. In contrast to previous work, which presents privacy in abstract ways or ignores the power of data aggregators, the work presented here focuses on concretely informing users and incorporates the economic incentives driving privacy and fairness concerns.
Online Privacy in Mobile and Web Platforms: Risk Quantification and Obfuscation Techniques
The widespread use of the web and mobile platforms and their high engagement in human lives pose serious threats to the privacy and confidentiality of users. It has been demonstrated in a number of research works that devices such as desktops, mobiles, and web browsers contain subtle information and measurable variation, which allow them to be fingerprinted. Moreover, behavioural tracking is another form of privacy threat that is induced by the collection and monitoring of users' gestures such as touch, motion, GPS, search queries, writing pattern, and more. The success of these methods is a clear indication that obfuscation techniques to protect the privacy of individuals are, in reality, not successful if the collected data contains potentially unique combinations of attributes relating to specific individuals.
With this in view, this thesis focuses on understanding the privacy risks across the web and mobile platforms by identifying and quantifying the privacy leakages and then designing privacy-preserving frameworks against the identified threats. We first investigate the potential of using touch-based gestures to track mobile device users. For this purpose, we propose and develop an analytical framework that quantifies the amount of information carried by the user touch gestures. We then quantify users' privacy risk in web data using a probabilistic method that incorporates all key privacy aspects, which are uniqueness, uniformity, and linkability of the web data. We also perform a large-scale study of dependency chains in the web and find that a large proportion of websites under study load resources from suspicious third parties that are known to mishandle user data and risk privacy leaks.
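The kind of uniqueness quantification described above can be illustrated with self-information over attribute combinations (a toy population with hypothetical attributes, not the thesis's actual framework): the rarer a combination, the more identifying bits it carries.

```python
import math
from collections import Counter

def surprisal_bits(population, combo):
    """Self-information of an attribute combination: rarer combinations
    carry more identifying information (measured in bits)."""
    counts = Counter(population)
    p = counts[combo] / len(population)
    return -math.log2(p)

# Toy population of (browser, timezone, touch-pattern) observations.
population = (
    [("chrome", "UTC+1", "fast")] * 6
    + [("firefox", "UTC+1", "fast")] * 1
    + [("chrome", "UTC+10", "slow")] * 1
)

common = surprisal_bits(population, ("chrome", "UTC+1", "fast"))   # frequent combo
rare = surprisal_bits(population, ("chrome", "UTC+10", "slow"))    # unique combo
```

A combination shared by a single user in a population of eight contributes log2(8) = 3 bits; an obfuscation scheme fails exactly when such near-unique combinations survive in the released data.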
The second half of the thesis addresses the privacy risks identified above by designing and developing privacy-preserving frameworks for the web and mobile platforms. We propose an on-device privacy-preserving framework that minimizes privacy leakages by reducing the trackability and distinguishability of mobile users while preserving the functionality of existing apps/services. We finally propose a privacy-aware obfuscation framework for web data with high predicted risk. Using differentially-private noise addition, our proposed framework is resilient against an adversary who has knowledge of the obfuscation mechanism, the HMM probabilities, and the training dataset.
Understanding Flaws in the Deployment and Implementation of Web Encryption
In recent years, the web has switched from using the unencrypted HTTP protocol to using encrypted communications. Primarily, this resulted in increasing deployment of TLS to mitigate information leakage over the network. This development has led many web service operators to mistakenly think that migrating from HTTP to HTTPS will magically protect them from information leakage without any additional effort on their end to guarantee the desired security properties. In reality, despite the fact that there exists enough infrastructure in place and the protocols have been "tested" (by virtue of being in wide, but not ubiquitous, use for many years), deploying HTTPS is a highly challenging task due to the technical complexity of its underlying protocols (i.e., HTTP, TLS) as well as the complexity of the TLS certificate ecosystem and that of popular client applications such as web browsers. For example, we found that many websites still avoid ubiquitous encryption and force only critical functionality and sensitive data access over encrypted connections, while allowing more innocuous functionality to be accessed over HTTP. In practice, this approach is prone to flaws that can expose sensitive information or functionality to third parties. Thus, it is crucial for developers to verify the correctness of their deployments and implementations.
In this dissertation, in an effort to improve users' privacy, we highlight semantic flaws in the implementations of both web servers and clients, caused by the improper deployment of web encryption protocols. First, we conduct an in-depth assessment of major websites and explore what functionality and information is exposed to attackers that have hijacked a user's HTTP cookies. We identify a recurring pattern across websites with partially deployed HTTPS, namely, that service personalization inadvertently results in the exposure of private information. The separation of functionality across multiple cookies with different scopes and inter-dependencies further complicates matters, as imprecise access control renders restricted account functionality accessible to non-secure cookies. Our cookie hijacking study reveals a number of severe flaws; for example, attackers can obtain the user's saved address and visited websites from Google, while Bing and Yahoo allow attackers to extract the contact list and send emails from the user's account. To estimate the extent of the threat, we run measurements on a university public wireless network for a period of 30 days and detect over 282K accounts exposing the cookies required for our hijacking attacks.
Next, we explore and study security mechanisms proposed to eliminate this problem by enforcing encryption, such as HSTS and HTTPS Everywhere. We evaluate each mechanism in terms of its adoption and effectiveness. We find that all mechanisms suffer from implementation flaws or deployment issues and argue that, as long as servers continue to not support ubiquitous encryption across their entire domain, no mechanism can effectively protect users from cookie hijacking and information leakage.
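As an illustration of the first of these mechanisms, a minimal parser for the Strict-Transport-Security header that HSTS relies on (a sketch loosely following RFC 6797, not a complete implementation; the helper names are ours):

```python
def parse_hsts(header):
    """Parse an HSTS header value into its directives (simplified RFC 6797)."""
    policy = {"max_age": None, "include_subdomains": False}
    for directive in header.split(";"):
        directive = directive.strip().lower()
        if directive.startswith("max-age="):
            # Directive values may be quoted; strip surrounding quotes.
            policy["max_age"] = int(directive.split("=", 1)[1].strip('"'))
        elif directive == "includesubdomains":
            policy["include_subdomains"] = True
    return policy

def hsts_protects(header):
    """A deployment only helps if max-age is present and non-zero;
    max-age=0 actually clears any previously cached policy."""
    policy = parse_hsts(header)
    return policy["max_age"] is not None and policy["max_age"] > 0
```

Deployment flaws of the kind the abstract mentions often come down to exactly such details: a short or zero max-age, or a missing includeSubDomains, leaves part of the domain reachable over plain HTTP.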
Finally, as the security guarantees of TLS (and in turn HTTPS) are critically dependent on the correct validation of X.509 server certificates, we study hostname verification, a critical component in the certificate validation process. We develop HVLearn, a novel testing framework to verify the correctness of hostname verification implementations, and use HVLearn to analyze a number of popular TLS libraries and applications. In doing so, we found 8 unique violations of the RFC specifications. Several of these violations are critical and can render the affected implementations vulnerable to man-in-the-middle attacks.
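The wildcard rules that hostname-verification code must implement, and that HVLearn-style testing exercises, can be sketched as follows (greatly simplified relative to RFC 6125; real implementations also handle IP addresses, IDNA, and partial-label wildcards):

```python
def match_hostname(pattern, hostname):
    """Simplified RFC 6125 matching: '*' may stand in for the entire
    leftmost label only, and never matches across dots."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    # A wildcard covers exactly one label, so label counts must agree.
    if len(p_labels) != len(h_labels):
        return False
    head = p_labels[0]
    if head != "*" and head != h_labels[0]:
        return False
    # Non-leftmost labels must match exactly and may not contain wildcards.
    return all(p == h and "*" not in p
               for p, h in zip(p_labels[1:], h_labels[1:]))
```

Several of the real-world violations found by such testing are deviations from precisely these rules, e.g., accepting wildcards that match multiple labels or appear in non-leftmost positions.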
Using Machine Learning to improve Internet Privacy
Internet privacy lacks transparency, choice, quantifiability, and accountability, especially as the deployment of machine learning technologies becomes mainstream. However, these technologies can be both privacy-invasive as well as privacy-protective. This dissertation advances the thesis that machine learning can be used for purposes of improving Internet privacy. Starting with a case study that shows how the potential of a social network to learn ethnicity and gender of its users from geotags can be estimated, various strands of machine learning technologies to further privacy are explored. While the quantification of privacy is the subject of well-known privacy metrics, such as k-anonymity or differential privacy, I discuss how some of those metrics can be leveraged in tandem with machine learning algorithms for purposes of quantifying the privacy-invasiveness of data collection practices. Further, I demonstrate how the current notice-and-choice paradigm can be realized by automatic machine learning privacy policy analysis. The implemented system notifies users efficiently and accurately of applicable data practices. Further, by analyzing software data flows, users are enabled to compare actual to described data practices, and regulators can enforce those at scale. The emerging cross-device tracking practices of ad networks, analytics companies, and others can likewise be analyzed with machine learning technologies to notify users of privacy practices across devices and give them the choice they are entitled to by law. Ultimately, cross-device tracking is a harbinger of the emerging Internet of Things, for which I envision intelligent personal assistants that help users navigate the increasing complexity of privacy notices and choices.
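The k-anonymity metric mentioned above can be computed directly by grouping records on their quasi-identifiers (the records and attribute names below are toy examples of ours, not the dissertation's data):

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """k is the size of the smallest group of records sharing the same
    quasi-identifier values; higher k means a released dataset gives
    each individual more 'hiding room' within their group."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

records = [
    {"zip": "10001", "age_band": "20-29", "diagnosis": "A"},
    {"zip": "10001", "age_band": "20-29", "diagnosis": "B"},
    {"zip": "10002", "age_band": "30-39", "diagnosis": "A"},
]

k = k_anonymity(records, ["zip", "age_band"])  # smallest group has one record
```

A metric like this, computed over the attributes a service actually collects, is one concrete way such privacy scores can feed a machine-learned assessment of how invasive a data collection practice is.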
Regulatory Disruption and Arbitrage in Health-Care Data Protection
This article explains how the structure of U.S. health-care data protection (specifically its sectoral and downstream properties) has led to a chronically uneven policy environment for different types of health-care data. It examines claims for health-care data protection exceptionalism and competing demands such as data liquidity. In conclusion, the article takes the position that health-care data exceptionalism remains a valid imperative and that even current concerns about data liquidity can be accommodated in an exceptional protective model. However, re-calibrating our protection of health-care data residing outside of the traditional health-care domain is challenging, and currently even politically impossible.