
    Analyzing Privacy Policies Using Contextual Integrity Annotations

    In this paper, we demonstrate the effectiveness of using the theory of contextual integrity (CI) to annotate and evaluate privacy policy statements. We perform a case study using CI annotations to compare Facebook's privacy policy before and after the Cambridge Analytica scandal. The updated Facebook privacy policy provides additional details about what information is being transferred, from whom, by whom, to whom, and under what conditions. However, some privacy statements prescribe an incomprehensibly large number of information flows by including many CI parameters in single statements. Other statements result in incomplete information flows due to the use of vague terms or omitting contextual parameters altogether. We then demonstrate that crowdsourcing can effectively produce CI annotations of privacy policies at scale. We test the CI annotation task on 48 excerpts of privacy policies from 17 companies with 141 crowdworkers. The resulting high-precision annotations indicate that crowdsourcing could be used to produce a large corpus of annotated privacy policies for future research.
    Comment: 18 pages, 9 figures, 5 tables
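    For readers unfamiliar with the CI vocabulary these annotations build on, the sketch below shows one way an information flow and its parameters (sender, recipient, subject, attribute, transmission principle) could be represented; the class and field names are illustrative assumptions, not the authors' annotation schema.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Illustrative sketch (not the authors' code): one CI information flow with the
# five standard contextual-integrity parameters. A policy statement that omits
# parameters yields an "incomplete" flow, as the abstract observes.
@dataclass
class CIFlow:
    sender: Optional[str] = None                    # by whom the information is transferred
    recipient: Optional[str] = None                 # to whom
    subject: Optional[str] = None                   # about whom / from whom
    attribute: Optional[str] = None                 # what information
    transmission_principle: Optional[str] = None    # under what conditions

    def missing_parameters(self):
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

# Hypothetical annotation of a single policy sentence:
flow = CIFlow(sender="Facebook", recipient="advertisers", attribute="ad-engagement data")
print(flow.missing_parameters())  # ['subject', 'transmission_principle'] -> incomplete flow
```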

    Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction

    Modern NLP systems require high-quality annotated data. In specialized domains, expert annotations may be prohibitively expensive. An alternative is to rely on crowdsourcing to reduce costs, at the risk of introducing noise. In this paper we demonstrate that directly modeling instance difficulty can be used to improve model performance and to route instances to appropriate annotators. Our difficulty prediction model combines two learned representations: a 'universal' encoder trained on out-of-domain data, and a task-specific encoder. Experiments on a complex biomedical information extraction task using expert and lay annotators show that: (i) simply excluding from the training data instances predicted to be difficult yields a small boost in performance; (ii) using difficulty scores to weight instances during training provides further, consistent gains; (iii) assigning instances predicted to be difficult to domain experts is an effective strategy for task routing. Our experiments confirm the expectation that, for specialized tasks, expert annotations are higher quality than crowd labels, and hence preferable to obtain if practical. Moreover, augmenting small amounts of expert data with a larger set of lay annotations leads to further improvements in model performance.
    Comment: NAACL201
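    As an illustration of strategies (ii) and (iii), the sketch below down-weights instances that a difficulty model flags as hard and routes the hardest ones to experts; the weighting scheme and the synthetic data are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for features, labels, and a difficulty model's scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
predicted_difficulty = rng.uniform(0, 1, size=200)   # output of a difficulty predictor

# Strategy (ii): weight instances during training, harder -> smaller weight (assumed scheme).
sample_weight = 1.0 / (1.0 + predicted_difficulty)
clf = LogisticRegression().fit(X, y, sample_weight=sample_weight)

# Strategy (iii): task routing, send the hardest instances to domain experts.
expert_queue = np.argsort(-predicted_difficulty)[:20]
print(expert_queue)
```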

    The 2018 DAVIS Challenge on Video Object Segmentation

    We present the 2018 DAVIS Challenge on Video Object Segmentation, a public competition specifically designed for the task of video object segmentation. It builds upon the DAVIS 2017 dataset, which was presented in the previous edition of the DAVIS Challenge and added 100 videos with multiple objects per sequence to the original DAVIS 2016 dataset. Motivated by the analysis of the results of the 2017 edition, the main track of the competition will be the same as in the previous edition (segmentation given the full mask of the objects in the first frame -- the semi-supervised scenario). This edition, however, also adds an interactive segmentation teaser track, where participants will interact with a web service simulating the input of a human who provides scribbles to iteratively improve the result.
    Comment: Challenge website: http://davischallenge.org
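    For context, DAVIS-style results are typically scored per object and per frame with a region-similarity (Jaccard) measure; the snippet below is a minimal illustration of that metric, not the challenge's official evaluation code.

```python
import numpy as np

def jaccard(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Region similarity (intersection over union) between two binary segmentation masks."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:              # both masks empty: treat as a perfect match
        return 1.0
    return np.logical_and(pred, gt).sum() / union

# Toy example: two 4x4 masks that overlap in 2 of 6 labelled pixels.
a = np.zeros((4, 4), dtype=bool); a[1:3, 1:3] = True
b = np.zeros((4, 4), dtype=bool); b[2:4, 1:3] = True
print(round(jaccard(a, b), 3))  # 0.333
```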

    Fast Context-Annotated Classification of Different Types of Web Service Descriptions

    With the recent rapid growth of web services, IoT, and cloud computing, many web services and APIs have appeared on the web. After the failure of global UDDI registries, different service repositories started to appear, trying to list and categorize various types of web services for client applications to discover and use. In order to increase the effectiveness and speed of finding compatible Web Services in the brokerage when performing service composition or suggesting Web Services in response to requests, the high-level functionality of the service needs to be determined. Due to the lack of structured support for specifying such functionality, classification of services into a set of abstract categories is necessary. We employ a wide range of Machine Learning and Signal Processing algorithms and techniques to find the highest precision achievable, within the scope of this article, for the fast classification of three types of service descriptions: WSDL, REST, and WADL. In addition, we complement our approach by showing the importance and effect of contextual information on the classification of the service descriptions, and we show that it improves accuracy across 5 different categories of services.
    Comment: 20 pages expanded; ICPRAI 2018 conference proceedings, pp. 562-570, CENPARMI, Concordia University, Montreal
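    A minimal sketch of the kind of text-classification pipeline such an approach implies is shown below; the feature extractor, classifier, toy documents, and category labels are assumptions for illustration and do not reproduce the authors' algorithms or their use of signal-processing techniques and contextual information.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Toy service descriptions in different formats (WSDL, REST, WADL), labelled with
# made-up abstract functional categories.
docs = [
    "<definitions> WSDL operation getForecast soap:binding weather service </definitions>",
    "GET /v1/weather?city={name} returns a JSON forecast for the given city",
    "<application> WADL resource /quotes method GET stock price quote </application>",
]
labels = ["weather", "weather", "finance"]

# TF-IDF features over unigrams/bigrams feeding a linear classifier (illustrative choice).
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(docs, labels)
print(model.predict(["POST /stocks/quote returns the latest stock price as JSON"]))
```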

    EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets

    This article introduces a new language-independent approach for creating a large-scale, high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The adopted approach (demonstrated over Arabic tweets) designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users for which rich content exists. That inherently facilitates the support of multiple tasks that generally revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually-crafted queries per topic, collecting high-quality annotations via crowd-workers for relevancy and in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology on Arabic tweets resulted in EveTAR, the first freely-available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.
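    The agreement figure quoted above is the kind of statistic produced by a pairwise Cohen's kappa computation over annotator judgments; the snippet below illustrates that calculation on made-up binary relevance labels.

```python
from sklearn.metrics import cohen_kappa_score

# Two annotators judging the same ten tweets for one topic (1 = relevant, 0 = not relevant).
# The labels are invented purely to show how a per-topic kappa value is computed.
annotator_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
annotator_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
print(cohen_kappa_score(annotator_a, annotator_b))  # pairwise Cohen's kappa for this topic
```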

    Pushing the Boundaries of Crowd-enabled Databases with Query-driven Schema Expansion

    By incorporating human workers into the query execution process, crowd-enabled databases facilitate intelligent, social capabilities like completing missing data at query time or performing cognitive operators. But despite all their flexibility, crowd-enabled databases still maintain rigid schemas. In this paper, we extend crowd-enabled databases with flexible, query-driven schema expansion, allowing the addition of new attributes to the database at query time. However, the number of crowd-sourced mini-tasks needed to fill in the missing values may often be prohibitively large, and the resulting data quality is doubtful. Instead of simple crowd-sourcing to obtain all values individually, we leverage the user-generated data found in the Social Web: by exploiting user ratings we build perceptual spaces, i.e., highly-compressed representations of opinions, impressions, and perceptions of large numbers of users. Using a few training samples obtained by expert crowd-sourcing, we can then extract all missing data automatically from the perceptual space with high quality and at low cost. Extensive experiments show that our approach can boost both the performance and quality of crowd-enabled databases, while also providing the flexibility to expand schemas in a query-driven fashion.
    Comment: VLDB201
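    One plausible way to realize this idea, sketched below under assumptions that may differ from the paper's actual construction, is to factorize a user-item rating matrix into a low-dimensional item space and then regress the new attribute from a handful of expert-labelled items.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge

# Hypothetical sketch: compress a user x item rating matrix into a low-dimensional
# "perceptual space" for items, then learn a new attribute (e.g. a movie's "scariness")
# from a few expert-crowdsourced labels and fill it in for all remaining items.
rng = np.random.default_rng(1)
ratings = rng.integers(1, 6, size=(500, 80)).astype(float)    # 500 users x 80 items

# Item embeddings derived from the rating matrix (SVD is an illustrative choice).
item_space = TruncatedSVD(n_components=10, random_state=0).fit_transform(ratings.T)

expert_items = rng.choice(80, size=8, replace=False)           # few expert-labelled items
expert_labels = rng.uniform(0, 1, size=8)                      # invented expert attribute scores

reg = Ridge().fit(item_space[expert_items], expert_labels)
predicted_attribute = reg.predict(item_space)                  # attribute value for all 80 items
print(predicted_attribute[:5])
```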

    Human-Assisted Graph Search: It's Okay to Ask Questions

    We consider the problem of human-assisted graph search: given a directed acyclic graph with some (unknown) target node(s), the goal is to find the target node(s) by asking an omniscient human questions of the form "Is there a target node that is reachable from the current node?". This general problem has applications in many domains that can utilize human intelligence, including curation of hierarchies, debugging workflows, image segmentation and categorization, interactive search, and filter synthesis. To our knowledge, this work provides the first formal algorithmic study of the optimization of human computation for this problem. We study various dimensions of the problem space, providing algorithms and complexity results. Our framework and algorithms can be used in the design of an optimizer for crowd-sourcing platforms such as Mechanical Turk.
    Comment: VLDB201
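    As a toy illustration of the question model (not the paper's algorithms), consider the special case of a chain DAG with a single target: the reachability question is monotone along the chain, so binary search locates the target with O(log n) questions to the human oracle.

```python
# Chain DAG v0 -> v1 -> ... -> v(n-1) with one target node. "Is a target reachable
# from node i?" is true exactly when the target index is >= i, so the answers are
# monotone and binary search applies. The general multi-target DAG case studied in
# the paper is harder and not shown here.
def find_target_on_chain(n: int, oracle) -> int:
    lo, hi = 0, n - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if oracle(mid):          # human answers: "yes, a target is reachable from mid"
            lo = mid             # target lies at mid or further down the chain
        else:
            hi = mid - 1         # target lies strictly before mid
    return lo

target = 37                      # hidden from the algorithm, known only to the oracle
print(find_target_on_chain(100, lambda node: target >= node))  # -> 37
```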

    3D BAT: A Semi-Automatic, Web-based 3D Annotation Toolbox for Full-Surround, Multi-Modal Data Streams

    In this paper, we focus on obtaining 2D and 3D labels, as well as track IDs, for objects on the road with the help of a novel 3D Bounding Box Annotation Toolbox (3D BAT). Our open-source, web-based 3D BAT incorporates several smart features to improve usability and efficiency. For instance, this annotation toolbox supports semi-automatic labeling of tracks using interpolation, which is vital for downstream tasks like tracking, motion planning, and motion prediction. Moreover, annotations for all camera images are automatically obtained by projecting annotations from 3D space into the image domain. In addition to the raw image and point cloud feeds, a Masterview consisting of the top view (bird's-eye view), side view, and front view is made available to observe objects of interest from different perspectives. Comparisons of our method with other publicly available annotation tools reveal that 3D annotations can be obtained faster and more efficiently by using our toolbox.
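    The interpolation feature can be pictured as follows; this is a hypothetical sketch of the general idea, not the toolbox's implementation, and it assumes a simple (x, y, z, length, width, height, yaw) box parameterization and purely linear motion between keyframes.

```python
import numpy as np

# Given a box annotated at two keyframes, intermediate frames are filled in by
# linearly interpolating position, size, and heading, so the annotator only needs
# to correct frames where the motion deviates from a straight line.
def interpolate_boxes(box_a, box_b, n_frames):
    """box = (x, y, z, length, width, height, yaw); returns n_frames boxes between a and b."""
    a, b = np.asarray(box_a, float), np.asarray(box_b, float)
    ts = np.linspace(0.0, 1.0, n_frames + 2)[1:-1]   # interpolation factors, keyframes excluded
    return [tuple(a + t * (b - a)) for t in ts]

key_frame_0 = (10.0, 2.0, 0.0, 4.5, 1.9, 1.6, 0.00)
key_frame_5 = (20.0, 2.5, 0.0, 4.5, 1.9, 1.6, 0.10)
for box in interpolate_boxes(key_frame_0, key_frame_5, n_frames=4):
    print(np.round(box, 3))
```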

    Visualisation of semantic enrichment

    Automatically creating semantic enrichments for text may lead to annotations that allow for excellent recall but poor precision. Manual enrichment is potentially more targeted, leading to greater precision. We aim to support non-experts in manually enriching texts with semantic annotations. Neither the visualisation of semantic enrichment nor the process of manually enriching texts has been evaluated before. This paper presents the results of our user study on the visualisation of text enrichment during the annotation process. We performed an extensive analysis of work related to the visualisation of semantic annotations. In a prototype implementation, we then explored two layout alternatives for visualising semantic annotations and their linkage to the text atoms. Here we summarise and discuss our results and their design implications for tools that create semantic annotations.

    Satyam: Democratizing Groundtruth for Machine Vision

    The democratization of machine learning (ML) has led to ML-based machine vision systems for autonomous driving, traffic monitoring, and video surveillance. However, true democratization cannot be achieved without greatly simplifying the process of collecting groundtruth for training and testing these systems. This groundtruth collection is necessary to ensure good performance under varying conditions. In this paper, we present the design and evaluation of Satyam, a first-of-its-kind system that enables a layperson to launch groundtruth collection tasks for machine vision with minimal effort. Satyam leverages a crowdtasking platform, Amazon Mechanical Turk, and automates several challenging aspects of groundtruth collection: creating and launching custom web-UI tasks for obtaining the desired groundtruth, controlling result quality in the face of spammers and untrained workers, adapting prices to match task complexity, filtering out spammers and workers with poor performance, and processing worker payments. We validate Satyam using several popular benchmark vision datasets, and demonstrate that groundtruth obtained by Satyam is comparable to that obtained from trained experts and provides matching ML performance when used for training.
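    A common quality-control pattern for such pipelines is to assign each task to several workers and accept only answers with sufficient agreement; the sketch below shows that pattern on made-up labels and does not reproduce Satyam's actual result-fusion or spammer-filtering logic.

```python
from collections import Counter

# Majority voting over redundant worker answers (illustrative quality-control step).
def fuse_labels(worker_answers, min_agreement=0.6):
    """worker_answers: {task_id: [label, label, ...]} -> accepted consensus labels."""
    consensus = {}
    for task_id, labels in worker_answers.items():
        label, votes = Counter(labels).most_common(1)[0]
        if votes / len(labels) >= min_agreement:   # enough workers agree on this answer
            consensus[task_id] = label
        # otherwise: relaunch the task or route it to additional workers
    return consensus

answers = {"img_001": ["car", "car", "truck"], "img_002": ["bus", "car", "truck"]}
print(fuse_labels(answers))   # {'img_001': 'car'}; img_002 lacks consensus and is dropped
```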