37,649 research outputs found

    The Challenges of Knowledge Combination in ML-based Crowdsourcing – The ODF Killer Shrimp Challenge using ML and Kaggle

    Organizations are increasingly using digital technologies, such as crowdsourcing platforms and machine learning, to tackle innovation challenges. These technologies often require combining heterogeneous technical and domain-specific knowledge from diverse actors to achieve the organization’s innovation goals. While research has focused on knowledge combination for relatively simple tasks on crowdsourcing platforms and within ML-based innovation, we know little about how knowledge is combined in emerging innovation approaches that incorporate ML and crowdsourcing to solve domain-specific innovation challenges. Thus, this paper investigates the following question: what are the challenges to knowledge combination in domain-specific ML-based crowdsourcing? We conducted a case study of an environmental challenge, namely how to use ML to predict the spread of a marine invasive species, led by the Swedish consortium Ocean Data Factory Sweden and run on the crowdsourcing platform Kaggle. After discussing our results, we end the paper with recommendations on how to integrate crowdsourcing into domain-specific digital innovation processes.

    Are all ‘research fields’ equal? Rethinking practice for the use of data from crowd-sourcing market places

    New technologies like large-scale social media sites (e.g., Facebook and Twitter) and crowdsourcing services (e.g., Amazon Mechanical Turk, Crowdflower, Clickworker) impact social science research and provide many new and interesting avenues for research. The use of these new technologies for research has not been without challenges, and a recently published psychological study on Facebook led to a widespread discussion on the ethics of conducting large-scale experiments online. Surprisingly little has been said about the ethics of conducting research using commercial crowdsourcing market places. In this paper, I focus on the ethical questions raised by data collection with crowdsourcing tools. I briefly draw on implications of internet research more generally and then focus on the specific challenges that research with crowdsourcing tools faces. I identify fair pay and related issues of respect for autonomy, as well as problems with the power dynamics between researcher and participant, which have implications for ‘withdrawal without prejudice’, as the major ethical challenges with crowdsourced data. Further, I draw attention to how we can develop a ‘best practice’ for researchers using crowdsourcing tools.

    Creating a data collection for evaluating rich speech retrieval

    We describe the development of a test collection for investigating speech retrieval beyond the identification of relevant content. This collection focuses on satisfying user information needs for queries associated with specific types of speech acts. The collection is based on an archive of Internet video from the video sharing platform blip.tv, and was provided by the MediaEval benchmarking initiative. A crowdsourcing approach was used to identify segments in the video data which contain speech acts, to create a description of the video containing the act, and to generate search queries designed to re-find this speech act. We describe and reflect on our experiences with crowdsourcing this test collection using the Amazon Mechanical Turk platform. We highlight the challenges of constructing this dataset, including the selection of the data source, the design of the crowdsourcing task, and the specification of queries and relevant items.
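
    To make the structure of such a crowdsourced collection concrete, the sketch below shows one way a single worker-produced record might be represented. The field names, the class, and the example values are hypothetical illustrations of the workflow described in the abstract (segment, speech act, description, re-finding query); they are not the MediaEval collection's actual schema.

```python
# Hypothetical sketch of one crowdsourced record in a rich speech retrieval
# test collection; field names and values are illustrative, not the real schema.
from dataclasses import dataclass

@dataclass
class SpeechActRecord:
    video_id: str      # identifier of the blip.tv video
    start_sec: float   # start of the segment containing the speech act
    end_sec: float     # end of the segment
    speech_act: str    # e.g. "apology", "promise", "opinion"
    description: str   # worker-written description of the segment
    query: str         # worker-written query intended to re-find the segment

    def duration(self) -> float:
        return self.end_sec - self.start_sec

# Example record, as a worker might produce it via a Mechanical Turk task.
rec = SpeechActRecord(
    video_id="blip_000123",
    start_sec=42.0,
    end_sec=61.5,
    speech_act="promise",
    description="Host promises to publish the full interview next week.",
    query="host promises full interview next week",
)
print(rec.speech_act, rec.duration())
```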

    Formal Design Concept And Participant Behavior Analysis For Crowdsourcing Design

    Crowdsourcing has emerged as a new design resource for the conceptual design process, and multiple crowdsourcing services provide an opportunity for design idea collection and concept generation by crowds. However, few formal methods are available to extract and evaluate design concepts from the activities of the design crowd. Scarcity of information and the non-guaranteed quality of contributions are challenges that must be tackled. To overcome these challenges, this research aims to answer how a system can systematically extract and represent the explicit or implicitly hidden design concepts from crowdsourcing design activities, and how the crowdsourcing design activities of participants can be captured as design information to develop a product on a crowdsourcing platform, from the perspectives of process and elements. This research provides a taxonomy of design features to represent crowdsourcing design activities. With the taxonomy, a formal concept analysis method, Galois lattices, is applied to evaluate the activities of the design crowd and to extract possible design concepts. Using this approach, the crowd activities are represented with design features and participant information, which allows modeling the potential design concepts with the contributions of participants. Two participant evaluation measures, the Participant Individual Score and the Participant Group Score, are proposed to enhance the extracted design concepts with participants' information. By employing the proposed scores and design features, this research examines the significance of participants' behavior in crowdsourcing design. In addition, a formal method is presented to represent the processes and elements in crowdsourcing design activities using Actor Network Theory, a theory adopted from social science. The presented method and metrics are validated with real design data collected from a crowdsourcing service, through a focus group interview and precision and recall tests.
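
    The core of formal concept analysis is the Galois connection between sets of objects and sets of shared attributes. The sketch below illustrates the idea on a toy context of crowd-submitted ideas and design features; the context, feature names, and brute-force enumeration are invented for illustration and are not the paper's taxonomy, data, or scoring method.

```python
# Minimal sketch of formal concept analysis (Galois lattice) over a toy
# crowdsourcing-design context. The binary context below is invented.
from itertools import combinations

# Hypothetical context: which design features each crowd idea mentions.
context = {
    "idea_1": {"foldable", "lightweight"},
    "idea_2": {"foldable", "waterproof"},
    "idea_3": {"lightweight", "waterproof"},
    "idea_4": {"foldable", "lightweight", "waterproof"},
}
objects = set(context)
attributes = set().union(*context.values())

def common_attributes(objs):
    """Attributes shared by every object in objs (the ' operator on extents)."""
    return attributes.copy() if not objs else set.intersection(*(context[o] for o in objs))

def objects_having(attrs):
    """Objects that have every attribute in attrs (the ' operator on intents)."""
    return {o for o in objects if attrs <= context[o]}

# Brute-force enumeration of formal concepts: (extent, intent) pairs closed
# under the Galois connection. Fine for toy contexts; real tools use
# algorithms such as NextClosure instead.
concepts = set()
for r in range(len(objects) + 1):
    for objs in combinations(sorted(objects), r):
        intent = common_attributes(set(objs))
        extent = objects_having(intent)
        concepts.add((frozenset(extent), frozenset(intent)))

for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), "<->", sorted(intent))
```

    Each printed pair is a candidate design concept: a maximal group of contributions together with exactly the features they all share, which is the kind of structure the paper then enriches with participant scores.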

    Citizen surveillance for environmental monitoring: combining the efforts of citizen science and crowdsourcing in a quantitative data framework

    Citizen science and crowdsourcing have been emerging as methods to collect data for surveillance and/or monitoring activities. They can be gathered under the overarching term citizen surveillance. The discipline, however, still struggles to be widely accepted in the scientific community, mainly because these activities are not embedded in a quantitative framework. This results in an ongoing discussion on how to analyze and make useful inference from these data. When considering the data collection process, we illustrate how citizen surveillance can be classified according to the nature of the underlying observation process, measured in two dimensions: the degree of observer reporting intention and the control over observer detection effort. By classifying the observation process in these dimensions we distinguish between crowdsourcing, unstructured citizen science, and structured citizen science. This classification helps determine the data processing and statistical treatment of these data for making inference. Using our framework, it is apparent that published studies are overwhelmingly associated with structured citizen science, for which well-developed statistical methods exist. In contrast, methods for making useful inference from purely crowd-sourced data remain under development, and the challenge of accounting for the unknown observation process is considerable. Our quantitative framework for citizen surveillance calls for an integration of citizen science and crowdsourcing and provides a way forward to solve the statistical challenges inherent to citizen-sourced data.
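
    As a rough illustration of the two-dimensional classification described above, the sketch below maps reporting intention and control over detection effort to the three categories. The decision rules, category boundaries, and example processes are a simplified assumption for illustration, not the authors' exact operationalization.

```python
# Simplified illustration of classifying observation processes by
# (reporting intention, control over detection effort). Labels and examples
# are assumptions for demonstration only.
from dataclasses import dataclass

@dataclass
class ObservationProcess:
    name: str
    reporting_intention: str   # "incidental" or "deliberate"
    effort_controlled: bool    # is detection effort governed by a protocol?

def classify(p: ObservationProcess) -> str:
    if p.reporting_intention == "incidental":
        return "crowdsourcing"                 # opportunistic reports, no survey design
    if not p.effort_controlled:
        return "unstructured citizen science"  # deliberate reporting, free-form effort
    return "structured citizen science"        # deliberate reporting, protocolized effort

examples = [
    ObservationProcess("geotagged social media photos", "incidental", False),
    ObservationProcess("opportunistic species sighting app", "deliberate", False),
    ObservationProcess("fixed-transect volunteer survey", "deliberate", True),
]
for p in examples:
    print(f"{p.name}: {classify(p)}")
```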