107,204 research outputs found
Disagreeable Privacy Policies: Mismatches between Meaning and Users' Understanding
Privacy policies are verbose, difficult to understand, take too long to read, and may be the least-read items on most websites even as users express growing concerns about information collection practices. For all their faults, though, privacy policies remain the single most important source of information for users to attempt to learn how companies collect, use, and share data. Likewise, these policies form the basis for the self-regulatory notice and choice framework that is designed and promoted as a replacement for regulation. The underlying value and legitimacy of notice and choice depends, however, on the ability of users to understand privacy policies.
This paper investigates the differences in interpretation among expert, knowledgeable, and typical users and explores whether those groups can understand the practices described in privacy policies at a level sufficient to support rational decision-making. The paper seeks to fill an important gap in the understanding of privacy policies through primary research on user interpretation and to inform the development of technologies combining natural language processing, machine learning and crowdsourcing for policy interpretation and summarization.
For this research, we recruited a group of law and public policy graduate students at Fordham University, Carnegie Mellon University, and the University of Pittsburgh ("knowledgeable users") and presented these law and policy researchers with a set of privacy policies from companies in the e-commerce and news & entertainment industries. We asked them nine basic questions about the policies' statements regarding data collection, data use, and retention. We then presented the same set of policies to a group of privacy experts and to a group of non-expert users.
The findings show areas of common understanding across all groups for certain data collection and deletion practices, but also demonstrate very important discrepancies in the interpretation of privacy policy language, particularly with respect to data sharing. The discordant interpretations arose both within groups and between the experts and the two other groups.
The presence of these significant discrepancies has critical implications. First, the common understandings of some attributes of described data practices mean that semi-automated extraction of meaning from website privacy policies may be able to assist typical users and improve the effectiveness of notice by conveying the true meaning to users. However, the disagreements among experts and disagreement between experts and the other groups reflect that ambiguous wording in typical privacy policies undermines the ability of privacy policies to effectively convey notice of data practices to the general public.
The results of this research will, consequently, have significant policy implications for the construction of the notice and choice framework and for the US reliance on this approach. The gap in interpretation indicates that privacy policies may be misleading the general public and that those policies could be considered legally unfair and deceptive. And, where websites are not effectively conveying privacy policies to consumers in a way that a "reasonable person" could, in fact, understand the policies, "notice and choice" fails as a framework. Such a failure has broad international implications since websites extend their reach beyond the United States
ATP and Presentation Service for Mizar Formalizations
This paper describes the Automated Reasoning for Mizar (MizAR) service, which
integrates several automated reasoning, artificial intelligence, and
presentation tools with Mizar and its authoring environment. The service
provides ATP assistance to Mizar authors in finding and explaining proofs, and
offers generation of Mizar problems as challenges to ATP systems. The service
is based on a sound translation from the Mizar language to that of first-order
ATP systems, and relies on the recent progress in application of ATP systems in
large theories containing tens of thousands of available facts. We present the
main features of MizAR services, followed by an account of initial experiments
in finding proofs with the ATP assistance. Our initial experience indicates
that the tool offers substantial help in exploring the Mizar library and in
preparing new Mizar articles
ServeNet: A Deep Neural Network for Web Services Classification
Automated service classification plays a crucial role in service discovery,
selection, and composition. Machine learning has been widely used for service
classification in recent years. However, the performance of conventional
machine learning methods highly depends on the quality of manual feature
engineering. In this paper, we present a novel deep neural network to
automatically abstract low-level representation of both service name and
service description to high-level merged features without feature engineering
and the length limitation, and then predict service classification on 50
service categories. To demonstrate the effectiveness of our approach, we
conduct a comprehensive experimental study by comparing 10 machine learning
methods on 10,000 real-world web services. The result shows that the proposed
deep neural network can achieve higher accuracy in classification and is more
robust than other machine learning methods. Comment: Accepted by ICWS'2
Mobile Phone Text Processing and Question-Answering
Mobile phone text messaging between mobile users and information services is a growing area of
Information Systems. Users may require the service to provide an answer to queries, or may, in wiki-style, want to contribute to the service by texting in some information within the service's domain of discourse. Given the volume of such messaging it is essential to do the processing through an automated service. Further, in the case of repeated use of the service, the quality of such a response has the potential to benefit from a dynamic user profile that the service can build up from previous texts of the same user.
This project will investigate the potential for creating such intelligent mobile phone services and aims to produce a computational model to enable their efficient implementation. To make the project feasible, the scope of the automated service is considered to lie within a limited domain of, for example, information about entertainment within a specific town centre. The project will assume the existence of a model of objects within the domain of discourse, hence allowing the analysis of texts within the context of a user model and a domain model. Hence, the project will involve the subject areas of natural language processing, language engineering, machine learning, knowledge extraction, and ontological engineering
HOL(y)Hammer: Online ATP Service for HOL Light
HOL(y)Hammer is an online AI/ATP service for formal (computer-understandable)
mathematics encoded in the HOL Light system. The service allows its users to
upload and automatically process an arbitrary formal development (project)
based on HOL Light, and to attack arbitrary conjectures that use the concepts
defined in some of the uploaded projects. For that, the service uses several
automated reasoning systems combined with several premise selection methods
trained on all the project proofs. The projects that are readily available on
the server for such query answering include the recent versions of the
Flyspeck, Multivariate Analysis and Complex Analysis libraries. The service
runs on a 48-CPU server, currently employing in parallel for each task 7 AI/ATP
combinations and 4 decision procedures that contribute to its overall
performance. The system is also available for local installation by interested
users, who can customize it for their own proof development. An Emacs interface
allowing parallel asynchronous queries to the service is also provided. The
overall structure of the service is outlined, problems that arise and their
solutions are discussed, and an initial account of using the system is given
Recognizing cited facts and principles in legal judgements
In common law jurisdictions, legal professionals cite facts and legal principles from precedent cases to support their arguments before the court for their intended outcome in a current case. This practice stems from the doctrine of stare decisis, where cases that have similar facts should receive similar decisions with respect to the principles. It is essential for legal professionals to identify such facts and principles in precedent cases, though this is a highly time intensive task. In this paper, we present studies that demonstrate that human annotators can achieve reasonable agreement on which sentences in legal judgements contain cited facts and principles (respectively, κ=0.65 and κ=0.95 for inter- and intra-annotator agreement). We further demonstrate that it is feasible to automatically annotate sentences containing such legal facts and principles in a supervised machine learning framework based on linguistic features, reporting per category precision and recall figures of between 0.79 and 0.89 for classifying sentences in legal judgements as cited facts, principles or neither using a Bayesian classifier, with an overall κ of 0.72 with the human-annotated gold standard
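The κ values quoted in the abstract above are chance-corrected agreement scores. As a minimal sketch only, assuming the statistic is Cohen's kappa for two annotators (the abstract does not say which kappa variant was used), the computation could look like the following. The function name `cohen_kappa` and the label strings are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels.

    Hypothetical helper for illustration; not the paper's code.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; treat as full agreement.
        return 1.0

    return (observed - expected) / (1 - expected)
```

Called on two lists of per-sentence labels such as "fact", "principle" and "neither", the function returns a value at or below 1, where 1 means perfect agreement and 0 means agreement no better than chance.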
"How May I Help You?": Modeling Twitter Customer Service Conversations Using Fine-Grained Dialogue Acts
Given the increasing popularity of customer service dialogue on Twitter,
analysis of conversation data is essential to understand trends in customer and
agent behavior for the purpose of automating customer service interactions. In
this work, we develop a novel taxonomy of fine-grained "dialogue acts"
frequently observed in customer service, showcasing acts that are more suited
to the domain than the more generic existing taxonomies. Using a sequential
SVM-HMM model, we model conversation flow, predicting the dialogue act of a
given turn in real-time. We characterize differences between customer and agent
behavior in Twitter customer service conversations, and investigate the effect
of testing our system on different customer service industries. Finally, we use
a data-driven approach to predict important conversation outcomes: customer
satisfaction, customer frustration, and overall problem resolution. We show
that the type and location of certain dialogue acts in a conversation have a
significant effect on the probability of desirable and undesirable outcomes,
and present actionable rules based on our findings. The patterns and rules we
derive can be used as guidelines for outcome-driven automated customer service
platforms. Comment: 13 pages, 6 figures, IUI 201
Automated assessment of non-native learner essays: Investigating the role of linguistic features
Automatic essay scoring (AES) refers to the process of scoring free text
responses to given prompts, considering human grader scores as the gold
standard. Writing such essays is an essential component of many language and
aptitude exams. Hence, AES has become an active and established area of research,
and there are many proprietary systems used in real life applications today.
However, not much is known about which specific linguistic features are useful
for prediction and how much of this is consistent across datasets. This article
addresses that by exploring the role of various linguistic features in
automatic essay scoring using two publicly available datasets of non-native
English essays written in test taking scenarios. The linguistic properties are
modeled by encoding lexical, syntactic, discourse and error types of learner
language in the feature set. Predictive models are then developed using these
features on both datasets and the most predictive features are compared. While
the results show that the feature set used results in good predictive models
with both datasets, the question "what are the most predictive features?" has a
different answer for each dataset. Comment: Article accepted for publication at: International Journal of
Artificial Intelligence in Education (IJAIED). To appear in early 2017
(journal url: http://www.springer.com/computer/ai/journal/40593