
    Semantically Enriched Task and Workflow Automation in Crowdsourcing for Linked Data Management

    Crowdsourcing is an emerging paradigm that exploits human computation to harvest and process complex, heterogeneous data into insight and actionable knowledge. Crowdsourcing is task-oriented, so the specification and management of not only tasks but also workflows plays a critical role. Crowdsourcing research is still in its infancy, and there is a significant need for crowdsourcing applications with well-defined task and workflow specifications, ranging from simple human intelligence tasks to more sophisticated, cooperative tasks, together with the handling of data and control flow among these tasks. Addressing this need, we devise a generic, flexible and extensible task specification and workflow management mechanism for crowdsourcing. We contextualize this problem to linked data management as our domain of interest. More specifically, we develop CrowdLink, which utilizes an architecture for automated task specification, generation, publishing and reviewing to engage crowdworkers in the verification and creation of triples in the Linked Open Data (LOD) cloud. The LOD cloud incorporates various core data sets in the semantic web, yet is not in full conformance with the guidelines for publishing high-quality linked data on the web. Our approach is not only useful for efficiently processing LOD management tasks; it can also help enrich and improve the quality of mission-critical links in the LOD. We demonstrate the usefulness of our approach through various link creation and verification tasks and workflows using Amazon Mechanical Turk. Experimental evaluation demonstrates promising results, not only in terms of ease of task generation, publishing and reviewing, but also in the accuracy of the links created and verified by the crowdworkers.
    © 2014 World Scientific Publishing Company.

    Many tasks involving text mining are still best accomplished by humans, and we believe that crowdsourcing can leverage collective human abilities in such tasks. As an example, let us consider text-mining tasks in biomedical literature. Researchers in this area enjoy access to the full text of articles published in numerous journals with online access. PubMed Central, created by the U.S. National Institutes of Health's National Library of Medicine, currently offers over 3.1 million full-text scientific articles, all free of charge.
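The pipeline described above turns LOD triples into verification tasks for crowdworkers. A minimal sketch of that idea, assuming an illustrative `TaskSpec` structure and field names (these are not the paper's actual schema, and the prefixed URIs are just example values):

```python
# Sketch: generating a yes/no verification task for a single LOD triple,
# in the spirit of the CrowdLink task-generation step described above.
# TaskSpec and all of its fields are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    """A minimal human intelligence task (HIT) specification."""
    title: str
    question: str
    answer_options: list = field(default_factory=list)
    reward_usd: float = 0.05  # assumed per-assignment reward


def triple_verification_task(subject: str, predicate: str, obj: str) -> TaskSpec:
    """Render an RDF triple as a question a crowdworker can answer."""
    return TaskSpec(
        title="Verify a linked-data statement",
        question=(
            f"Is the following statement correct? "
            f"'{subject}' {predicate} '{obj}'."
        ),
        answer_options=["Yes", "No", "Cannot tell"],
    )


task = triple_verification_task("dbpedia:Berlin", "dbo:country", "dbpedia:Germany")
print(task.question)
```

A publishing step would then submit such a specification to a platform such as Amazon Mechanical Turk and later collect and review the answers.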

    smartAPI: Towards a more intelligent network of Web APIs

    Data science increasingly employs cloud-based web application programming interfaces (APIs). However, automatically discovering and connecting suitable APIs for a given application is difficult due to the lack of explicit knowledge about the structure and datatypes of web API inputs and outputs. To address this challenge, we conducted a survey to identify the metadata elements that are crucial to the description of web APIs and subsequently developed the smartAPI metadata specification and associated tools to capture their domain-related and structural characteristics using the FAIR (findable, accessible, interoperable, reusable) principles. This paper presents the results of the survey, provides an overview of the smartAPI specification and a reference implementation, and discusses use cases of smartAPI. We show that annotating APIs with smartAPI metadata is straightforward through an extension of the existing Swagger editor. By facilitating the creation of such metadata, we increase the automated interoperability of web APIs. This work is done as part of the NIH Commons Big Data to Knowledge (BD2K) API Interoperability Working Group.
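The abstract describes annotating Swagger/OpenAPI descriptions with semantic metadata so that tools can match one API's outputs to another's inputs. A hedged sketch of what such an annotated fragment could look like, assuming illustrative `x-...` extension field names and an example identifiers.org URI (consult the smartAPI specification for the actual element names):

```python
# Sketch: an OpenAPI-style API description extended with smartAPI-style
# semantic annotations. The "x-..." field names are assumptions for
# illustration, not the authoritative smartAPI vocabulary.
api_description = {
    "openapi": "3.0.0",
    "info": {
        "title": "Example gene lookup service",  # hypothetical service
        "version": "1.0.0",
    },
    "paths": {
        "/gene/{id}": {
            "get": {
                "summary": "Look up a gene record by identifier",
                "parameters": [
                    {
                        "name": "id",
                        "in": "path",
                        "required": True,
                        # Semantic typing of the input: declaring the
                        # identifier scheme lets tooling decide whether
                        # another API's output can feed this parameter.
                        "x-valueType": ["http://identifiers.org/ncbigene/"],
                    }
                ],
            }
        }
    },
}

# A discovery tool could read the annotation to chain compatible APIs:
gene_param = api_description["paths"]["/gene/{id}"]["get"]["parameters"][0]
print(gene_param["x-valueType"][0])
```

The design point is that the semantic annotations live alongside the ordinary structural description, so existing Swagger/OpenAPI tooling keeps working while smartAPI-aware tools gain machine-readable typing.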