Analyzing Privacy Policies Using Contextual Integrity Annotations
In this paper, we demonstrate the effectiveness of using the theory of
contextual integrity (CI) to annotate and evaluate privacy policy statements.
We perform a case study using CI annotations to compare Facebook's privacy
policy before and after the Cambridge Analytica scandal. The updated Facebook
privacy policy provides additional details about what information is being
transferred, from whom, by whom, to whom, and under what conditions. However,
some privacy statements prescribe an incomprehensibly large number of
information flows by including many CI parameters in single statements. Other
statements result in incomplete information flows due to the use of vague terms
or omitting contextual parameters altogether. We then demonstrate that
crowdsourcing can effectively produce CI annotations of privacy policies at
scale. We test the CI annotation task on 48 excerpts of privacy policies from
17 companies with 141 crowdworkers. The resulting high precision annotations
indicate that crowdsourcing could be used to produce a large corpus of
annotated privacy policies for future research. Comment: 18 pages, 9 figures, 5 tables
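The CI parameters the abstract enumerates (what information is transferred, from whom, by whom, to whom, and under what conditions) map naturally onto a small record type. A hypothetical sketch of such an annotation structure, not the authors' actual schema; the field names and the completeness check are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CIFlow:
    """One information flow annotated in a privacy-policy statement,
    with one field per CI parameter named in the abstract."""
    attribute: str                                 # what information
    subject: Optional[str] = None                  # from whom (data subject)
    sender: Optional[str] = None                   # by whom
    recipient: Optional[str] = None                # to whom
    transmission_principle: Optional[str] = None   # under what conditions

    def is_complete(self) -> bool:
        """A flow that omits a contextual parameter is incomplete."""
        return None not in (self.subject, self.sender,
                            self.recipient, self.transmission_principle)
```

A vague statement such as "we may share your data" would produce a `CIFlow` with most fields `None`, which the check above flags as incomplete.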
Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction
Modern NLP systems require high-quality annotated data. In specialized
domains, expert annotations may be prohibitively expensive. An alternative is
to rely on crowdsourcing to reduce costs at the risk of introducing noise. In
this paper we demonstrate that directly modeling instance difficulty can be
used to improve model performance, and to route instances to appropriate
annotators. Our difficulty prediction model combines two learned
representations: a `universal' encoder trained on out-of-domain data, and a
task-specific encoder. Experiments on a complex biomedical information
extraction task using expert and lay annotators show that: (i) simply excluding
from the training data instances predicted to be difficult yields a small boost
in performance; (ii) using difficulty scores to weight instances during
training provides further, consistent gains; (iii) assigning instances
predicted to be difficult to domain experts is an effective strategy for task
routing. Our experiments confirm the expectation that for specialized tasks
expert annotations are higher quality than crowd labels, and hence preferable
to obtain if practical. Moreover, augmenting small amounts of expert data with
a larger set of lay annotations leads to further improvements in model
performance. Comment: NAACL 2019
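The first two strategies the abstract lists can be sketched as simple weighting schemes over predicted difficulty scores. A minimal illustration assuming difficulty scores in [0, 1]; the function names and the exact weighting are hypothetical, not the paper's implementation:

```python
import numpy as np

def difficulty_weights(difficulty, strategy="weight", threshold=0.8):
    """Turn predicted per-instance difficulty scores in [0, 1] into
    training weights: 'exclude' drops difficult instances outright,
    'weight' down-weights them smoothly."""
    d = np.asarray(difficulty, dtype=float)
    if strategy == "exclude":
        return (d < threshold).astype(float)   # (i) drop difficult instances
    if strategy == "weight":
        return 1.0 - d                         # (ii) down-weight by difficulty
    raise ValueError(f"unknown strategy: {strategy}")

def weighted_log_loss(y_true, p_pred, weights):
    """Binary cross-entropy with per-instance weights, normalized to mean 1
    so the overall loss scale is unchanged."""
    w = np.asarray(weights, dtype=float)
    w = w / w.mean()
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), 1e-7, 1 - 1e-7)
    return float(np.mean(-w * (y * np.log(p) + (1 - y) * np.log(1 - p))))
```

Strategy (iii), routing, would instead use the scores to decide which instances go to experts rather than to crowdworkers.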
The 2018 DAVIS Challenge on Video Object Segmentation
We present the 2018 DAVIS Challenge on Video Object Segmentation, a public
competition specifically designed for the task of video object segmentation. It
builds upon the DAVIS 2017 dataset, which was presented in the previous edition
of the DAVIS Challenge, and added 100 videos with multiple objects per sequence
to the original DAVIS 2016 dataset. Motivated by the analysis of the results of
the 2017 edition, the main track of the competition will be the same as in
the previous edition (segmentation given the full mask of the objects in the
first frame -- semi-supervised scenario). This edition, however, also adds an
interactive segmentation teaser track, where the participants will interact
with a web service simulating the input of a human that provides scribbles to
iteratively improve the result. Comment: Challenge website: http://davischallenge.org
Fast Context-Annotated Classification of Different Types of Web Service Descriptions
With the recent rapid growth of web services, IoT, and cloud computing, many
web services and APIs have appeared on the web. After the failure of global
UDDI registries, various service repositories emerged that attempt to list and
categorize different types of web services for client applications to discover
and use. To make the task of finding compatible Web Services in a brokerage
faster and more effective, whether during service composition or when
suggesting Web Services in response to requests, the high-level functionality
of each service needs to be determined. Due to the lack of
structured support for specifying such functionality, classification of
services into a set of abstract categories is necessary. We employ a wide range
of Machine Learning and Signal Processing algorithms and techniques in order to
find the highest precision achievable in the scope of this article for the fast
classification of three types of service descriptions: WSDL, REST, and WADL. In
addition, we complement our approach by showing the importance and effect of
contextual information on the classification of the service descriptions and
show that it improves the accuracy in 5 different categories of services.Comment: 20 pages expanded; ICPRAI 2018 conference proceedings, pp. 562-570,
CENPARMI, Concordia University, Montrea
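A bag-of-words classifier with a separate "context" feature space gives a rough feel for how contextual information can enter service-description classification. This is an illustrative sketch only; the paper's actual features and algorithms differ, and the `ctx:` prefixing scheme is an assumption:

```python
import math
from collections import Counter

def bow(text, context=""):
    """Bag-of-words features; context tokens get a 'ctx:' prefix so they
    occupy a feature space of their own."""
    feats = Counter(text.lower().split())
    feats.update("ctx:" + t for t in context.lower().split())
    return feats

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(doc, centroids):
    """Assign a service description to the nearest class centroid."""
    return max(centroids, key=lambda c: cosine(doc, centroids[c]))
```

In practice the centroids would be averaged feature vectors of labeled descriptions per category (e.g. per abstract functionality class).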
EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets
This article introduces a new language-independent approach for creating a
large-scale high-quality test collection of tweets that supports multiple
information retrieval (IR) tasks without running a shared-task campaign. The
adopted approach (demonstrated over Arabic tweets) designs the collection
around significant (i.e., popular) events, which enables the development of
topics that represent frequent information needs of Twitter users for which
rich content exists. That inherently facilitates the support of multiple tasks
that generally revolve around events, namely event detection, ad-hoc search,
timeline generation, and real-time summarization. The key highlights of the
approach include diversifying the judgment pool via interactive search and
multiple manually-crafted queries per topic, collecting high-quality
annotations via crowd-workers for relevancy and in-house annotators for
novelty, filtering out low-agreement topics and inaccessible tweets, and
providing multiple subsets of the collection for better availability. Applying
our methodology on Arabic tweets resulted in EveTAR, the first
freely-available tweet test collection for multiple IR tasks. EveTAR includes a
crawl of 355M Arabic tweets and covers 50 significant events for which about
62K tweets were judged with substantial average inter-annotator agreement
(Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating
existing algorithms in the respective tasks. Results indicate that the new
collection can support reliable ranking of IR systems that is comparable to
similar TREC collections, while providing strong baseline results for future
studies over Arabic tweets.
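The reported agreement figure (a Kappa value of 0.71) comes from a standard statistic. For reference, a minimal implementation of Cohen's kappa for two annotators (not the authors' code; they may have used a multi-annotator variant):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa between two annotators' label sequences over the same
    items: (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(a) == len(b) and a
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n     # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum((ca[l] / n) * (cb[l] / n)             # chance agreement
              for l in set(ca) | set(cb))
    return (p_o - p_e) / (1 - p_e)
```

Values above about 0.6 are conventionally read as "substantial" agreement, which is why 0.71 supports the reliability claim.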
Pushing the Boundaries of Crowd-enabled Databases with Query-driven Schema Expansion
By incorporating human workers into the query execution process crowd-enabled
databases facilitate intelligent, social capabilities like completing missing
data at query time or performing cognitive operators. But despite all their
flexibility, crowd-enabled databases still maintain rigid schemas. In this
paper, we extend crowd-enabled databases by flexible query-driven schema
expansion, allowing the addition of new attributes to the database at query
time. However, the number of crowd-sourced mini-tasks to fill in missing values
may often be prohibitively large and the resulting data quality is doubtful.
Instead of simple crowd-sourcing to obtain all values individually, we leverage
the user-generated data found in the Social Web: By exploiting user ratings we
build perceptual spaces, i.e., highly-compressed representations of opinions,
impressions, and perceptions of large numbers of users. Using few training
samples obtained by expert crowd sourcing, we then can extract all missing data
automatically from the perceptual space with high quality and at low costs.
Extensive experiments show that our approach can boost both performance and
quality of crowd-enabled databases, while also providing the flexibility to
expand schemas in a query-driven fashion. Comment: VLDB 2012
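The extraction step, predicting a missing attribute value from a perceptual space using a few expert-labeled samples, can be illustrated with a simple nearest-neighbour scheme. A hypothetical stand-in for the paper's method, assuming the perceptual space is a dict of item coordinates:

```python
import math

def knn_fill(space, labeled, query, k=3):
    """Predict a missing attribute value for `query` as the mean value of
    its k nearest expert-labeled neighbours in the perceptual space
    (space: item -> coordinate vector, labeled: item -> known value)."""
    nearest = sorted(labeled,
                     key=lambda item: math.dist(space[item], space[query]))[:k]
    return sum(labeled[item] for item in nearest) / len(nearest)
```

The point of the design is economy: only the few `labeled` items need expert crowdsourcing, while every other item inherits a value from its position in the space.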
Human-Assisted Graph Search: It's Okay to Ask Questions
We consider the problem of human-assisted graph search: given a directed
acyclic graph with some (unknown) target node(s), we consider the problem of
finding the target node(s) by asking an omniscient human questions of the form
"Is there a target node that is reachable from the current node?". This general
problem has applications in many domains that can utilize human intelligence,
including curation of hierarchies, debugging workflows, image segmentation and
categorization, interactive search and filter synthesis. To our knowledge, this
work provides the first formal algorithmic study of the optimization of human
computation for this problem. We study various dimensions of the problem space,
providing algorithms and complexity results. Our framework and algorithms can
be used in the design of an optimizer for crowd-sourcing platforms such as
Mechanical Turk. Comment: VLDB 2011
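On the simplest instance, a chain (path) DAG with a single target, the reachability questions admit an ordinary binary search, so roughly log n questions suffice. An illustrative sketch of that special case, not the paper's general algorithm:

```python
def find_target_on_chain(n, reaches_target):
    """Locate the single target on a chain 0 -> 1 -> ... -> n-1, where
    reaches_target(v) answers 'is a target reachable from node v?'
    (reachability includes v itself). The target is the last node for
    which the answer is still yes."""
    lo, hi = 0, n - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if reaches_target(mid):   # target lies at mid or further down
            lo = mid
        else:                     # target lies strictly before mid
            hi = mid - 1
    return lo
```

General DAGs with multiple targets are harder: answers no longer bisect the node set cleanly, which is where the paper's complexity analysis comes in.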
3D BAT: A Semi-Automatic, Web-based 3D Annotation Toolbox for Full-Surround, Multi-Modal Data Streams
In this paper, we focus on obtaining 2D and 3D labels, as well as track IDs
for objects on the road with the help of a novel 3D Bounding Box Annotation
Toolbox (3D BAT). Our open source, web-based 3D BAT incorporates several smart
features to improve usability and efficiency. For instance, this annotation
toolbox supports semi-automatic labeling of tracks using interpolation, which
is vital for downstream tasks like tracking, motion planning and motion
prediction. Moreover, annotations for all camera images are automatically
obtained by projecting annotations from 3D space into the image domain. In
addition to the raw image and point cloud feeds, a Masterview consisting of the
top view (bird's-eye-view), side view and front views is made available to
observe objects of interest from different perspectives. Comparisons of our
method with other publicly available annotation tools reveal that 3D
annotations can be obtained faster and more efficiently by using our toolbox.
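The semi-automatic track labeling via interpolation mentioned above can be pictured as linear interpolation of box parameters between two manually annotated keyframes. A simplified sketch; a real tool would interpolate yaw with angle wrap-around handling, which is omitted here:

```python
def interpolate_boxes(kf0, kf1, t0, t1, t):
    """Linearly interpolate a 3D box annotation (x, y, z, l, w, h, yaw)
    between keyframes kf0 at time t0 and kf1 at time t1, for an
    intermediate frame at time t."""
    a = (t - t0) / (t1 - t0)
    return tuple((1 - a) * u + a * v for u, v in zip(kf0, kf1))
```

Projecting each interpolated 3D box into the camera images then yields the 2D labels automatically, as the abstract describes.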
Visualisation of semantic enrichment
Automatically creating semantic enrichments for text may lead to annotations with excellent recall but poor precision. Manual enrichment is potentially more targeted, leading to greater precision. We aim to support non-experts in manually enriching texts with semantic annotations. Neither the visualisation of semantic enrichment nor the process of manually enriching texts has been evaluated before. This paper presents the results of our user study on the visualisation of text enrichment during the annotation process. We performed an extensive analysis of related work on the visualisation of semantic annotations. In a prototype implementation, we then explored two layout alternatives for visualising semantic annotations and their linkage to the text atoms. Here we summarise and discuss our results and their design implications for tools that create semantic annotations.
Satyam: Democratizing Groundtruth for Machine Vision
The democratization of machine learning (ML) has led to ML-based machine
vision systems for autonomous driving, traffic monitoring, and video
surveillance. However, true democratization cannot be achieved without greatly
simplifying the process of collecting groundtruth for training and testing
these systems. This groundtruth collection is necessary to ensure good
performance under varying conditions. In this paper, we present the design and
evaluation of Satyam, a first-of-its-kind system that enables a layperson to
launch groundtruth collection tasks for machine vision with minimal effort.
Satyam leverages a crowdtasking platform, Amazon Mechanical Turk, and automates
several challenging aspects of groundtruth collection: creating and launching
of custom web-UI tasks for obtaining the desired groundtruth, controlling
result quality in the face of spammers and untrained workers, adapting prices
to match task complexity, filtering spammers and workers with poor performance,
and processing worker payments. We validate Satyam using several popular
benchmark vision datasets, and demonstrate that groundtruth obtained by Satyam
is comparable to that obtained from trained experts and provides matching ML
performance when used for training.
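Controlling result quality against spammers and untrained workers is often approximated by comparing each worker's answers to per-task majority labels. A generic sketch of that idea, not Satyam's actual mechanism; the threshold and data layout are assumptions:

```python
from collections import Counter, defaultdict

def filter_workers(responses, min_agreement=0.7):
    """responses: list of (worker, task, label) triples. Compute a
    per-task majority label, then keep only workers whose agreement
    with the majority meets the threshold."""
    by_task = defaultdict(list)
    for _, task, label in responses:
        by_task[task].append(label)
    majority = {t: Counter(labels).most_common(1)[0][0]
                for t, labels in by_task.items()}
    agree, total = Counter(), Counter()
    for worker, task, label in responses:
        total[worker] += 1
        agree[worker] += (label == majority[task])
    return {w for w in total if agree[w] / total[w] >= min_agreement}
```

In a full pipeline this filter would run alongside the other automated steps the abstract lists, such as adaptive pricing and payment processing.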