An Automated Approach to Auditing Disclosure of Third-Party Data Collection in Website Privacy Policies
A dominant regulatory model for web privacy is "notice and choice". In this
model, users are notified of data collection and provided with options to
control it. To examine the efficacy of this approach, this study presents the
first large-scale audit of disclosure of third-party data collection in website
privacy policies. Data flows on one million websites are analyzed and over
200,000 websites' privacy policies are audited to determine if users are
notified of the names of the companies which collect their data. Policies from
25 prominent third-party data collectors are also examined to provide deeper
insights into the totality of the policy environment. Policies are additionally
audited to determine if the choice expressed by the "Do Not Track" browser
setting is respected.
Third-party data collection is widespread, but fewer than 15% of attributed
data flows are disclosed. The third parties most likely to be disclosed are
those with consumer services users may be aware of; those without consumer
services are less likely to be mentioned. Policies are difficult to understand,
and the average time required to read both a given site's policy and the
associated third-party policies exceeds 84 minutes. Only 7% of first-party site
policies mention the Do Not Track signal, and the majority of such mentions
specify that the signal is ignored. Among the third-party policies examined,
none offers unqualified support for the Do Not Track signal. Findings indicate
that current implementations of "notice and choice" fail to provide notice or
respect choice.
Automatic Detection of Vague Words and Sentences in Privacy Policies
Website privacy policies represent the single most important source of
information for users to gauge how their personal data are collected, used and
shared by companies. However, privacy policies are often vague, and people
struggle to understand their content. This opaqueness poses a significant
challenge to both users and policy regulators. In this paper, we seek to
identify vague content in privacy policies. We construct the first corpus of
human-annotated vague words and sentences and present empirical studies on
automatic vagueness detection. In particular, we investigate context-aware and
context-agnostic models for predicting vague words, and explore
auxiliary-classifier generative adversarial networks for characterizing
sentence vagueness. Our experimental results demonstrate the effectiveness of
the proposed approaches. Finally, we provide suggestions for resolving
vagueness and improving the usability of privacy policies.
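The distinction between context-agnostic and context-aware vague-word prediction described above can be sketched roughly as follows. The lexicon and the single disambiguation rule are illustrative assumptions, not the models actually studied in the paper:

```python
# Toy sketch of vague-word flagging in policy text. The lexicon and the
# "May as a month" rule below are illustrative stand-ins for the paper's
# learned context-agnostic and context-aware models.
VAGUE_LEXICON = {"may", "might", "reasonable", "appropriate", "periodically"}

def flag_agnostic(tokens):
    """Context-agnostic: flag any token found in a fixed vague-word lexicon."""
    return [t.lower() in VAGUE_LEXICON for t in tokens]

def flag_contextual(tokens):
    """Context-aware: same lexicon, but use a neighboring token to
    disambiguate -- here, capitalized 'May' after 'in' is treated as a
    month name, not a hedge (a toy stand-in for learned context features)."""
    low = [t.lower() for t in tokens]
    flags = []
    for i, t in enumerate(low):
        hit = t in VAGUE_LEXICON
        if t == "may" and i > 0 and low[i - 1] == "in" and tokens[i][0].isupper():
            hit = False  # "in May 2020" is a date, not hedging language
        flags.append(hit)
    return flags
```

A purely lexicon-based model flags "May" in "Updated in May 2020" as vague, while the context-aware variant suppresses that false positive, which is the kind of gain context models aim for.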
Statistical analysis of identity risk of exposure and cost using the ecosystem of identity attributes
Personally Identifiable Information (PII) is often called the "currency of the Internet," as identity assets are collected, shared, sold, and used for almost every transaction on the Internet. PII is used for all types of applications, from access control to credit score calculations to targeted advertising. Every market sector relies on PII to know and authenticate its customers and employees. With so many businesses and government agencies relying on PII to make important decisions, and so many people being asked to share personal data, it is critical to better understand the fundamentals of identity in order to protect it and use it responsibly. The previously developed UT CID Identity Ecosystem uses graphs to model PII assets and their relationships and is populated with empirical data from almost 6,000 real-world identity theft and fraud news reports. We analyze the UT CID Identity Ecosystem using graph theory and report numerous novel statistics on identity asset content, structure, value, accessibility, and impact. Our work sheds light on how identity is used and paves the way for improving identity protection.
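The graph-theoretic analysis described above can be sketched in miniature as follows. The attribute names and edges are invented for illustration and are not data from the UT CID Identity Ecosystem:

```python
# Toy identity-attribute graph: an edge A -> B means exposure of
# attribute A can lead to exposure of attribute B. Nodes and edges
# here are illustrative, not the real UT CID Identity Ecosystem data.
from collections import deque

EDGES = {
    "email": ["password", "phone"],
    "phone": ["ssn"],
    "password": [],
    "ssn": ["bank_account"],
    "bank_account": [],
}

def out_degree(graph):
    """How many attributes each attribute directly exposes."""
    return {node: len(nbrs) for node, nbrs in graph.items()}

def reachable(graph, start):
    """All attributes an adversary could eventually reach from one
    exposed attribute (a stand-in for accessibility statistics)."""
    seen, queue = {start}, deque([start])
    while queue:
        for nbr in graph.get(queue.popleft(), []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen - {start}
```

On this toy graph, a leaked email address transitively exposes every other attribute, which illustrates the kind of structural risk statistic the analysis reports at scale.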
After Over-Privileged Permissions: Using Technology and Design to Create Legal Compliance
Consumers in the mobile ecosystem can putatively protect their privacy through application permissions. However, this requires mobile device owners to understand permissions and their privacy implications. Yet few consumers appreciate the nature of permissions within the mobile ecosystem, often failing to notice when a permission is altered by an app update. Even more concerning is the lack of understanding of the wide use of third-party libraries, most of which are installed with automatic permissions, that is, permissions that must be granted for the application to function. Unsurprisingly, many of these third-party permissions violate consumers' privacy expectations and thereby become "over-privileged" from the user's perspective. Consequently, a gap emerges between the privacy practices of the private sector and the expectations deemed appropriate by the public sector. Despite the growing attention given to privacy in the mobile ecosystem, legal literature has largely ignored the implications of mobile permissions. This article seeks to address this omission by analyzing the impacts of mobile permissions and the privacy harms experienced by consumers of mobile applications. The authors call for a review of industry self-regulation and the overreliance on simple notice and consent. Instead, the authors set out a plan for greater attention to socio-technical solutions, focusing on better privacy protections and technology embedded within the automatic permission-based application ecosystem.
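The notion of an "over-privileged" app described above can be sketched as a set comparison between the permissions an app requests and those its declared features actually need. The permission names and the feature-to-need mapping are hypothetical, not Android's real permission model:

```python
# Hypothetical permission audit: an app is "over-privileged" when it
# requests permissions beyond those its declared features require.
# FEATURE_NEEDS is an illustrative mapping invented for this sketch.
FEATURE_NEEDS = {
    "photo_upload": {"CAMERA", "STORAGE"},
    "contact_sync": {"CONTACTS"},
}

def over_privileged(requested, features):
    """Return the requested permissions not justified by any feature."""
    needed = set()
    for feature in features:
        needed |= FEATURE_NEEDS.get(feature, set())
    return set(requested) - needed
```

In this sketch, an app declaring only photo upload but requesting location access would surface "LOCATION" as an unjustified, over-privileged permission.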