
    Human-in-the-Loop Learning From Crowdsourcing and Social Media

    Computational social studies using public social media data have become increasingly popular because of the large amount of user-generated data available. The richness of social media data, coupled with noise and subjectivity, raises significant challenges for computationally studying social issues in a feasible and scalable manner. Machine learning problems are, as a result, often subjective or ambiguous when humans are involved. That is, humans solving the same problems might come to legitimate but completely different conclusions, based on their personal experiences and beliefs. When building supervised learning models, particularly when using crowdsourced training data, multiple annotations per data item are usually reduced to a single label representing ground truth. This inevitably hides a rich source of diversity and subjectivity of opinions about the labels. Label distribution learning associates with each data item a probability distribution over the labels for that item, and can thus preserve the diversity of opinions and beliefs that conventional learning hides or ignores. We propose a human-in-the-loop learning framework to model and study large volumes of unlabeled subjective social media data with less human effort. We study various annotation tasks given to crowdsourced annotators and methods for aggregating their contributions in a manner that preserves subjectivity and disagreement. We introduce a strategy for learning label distributions with only five to ten labels per item by aggregating human-annotated labels over multiple, semantically related data items. We conduct experiments using our learning framework on data related to two subjective social issues (work and employment, and suicide prevention) that touch many people worldwide. Our methods can be applied to a broad variety of problems, particularly social problems. Our experimental results suggest that specific label aggregation methods can help provide reliable representative semantics at the population level.
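
As a rough illustration of the contrast drawn in this abstract, the sketch below compares reducing an item's annotations to a single majority label with keeping the full label distribution, and then pools annotations across a group of related items. It is a minimal sketch, not the authors' method: the function names and the example label names are hypothetical, and the grouping of "semantically related" items is assumed to be given, since the abstract does not say how it is obtained.

```python
from collections import Counter


def majority_label(annotations):
    """Reduce annotations to one label (ties go to the label seen first)."""
    return Counter(annotations).most_common(1)[0][0]


def label_distribution(annotations, labels):
    """Keep the empirical probability of every label instead of one winner."""
    counts = Counter(annotations)
    total = len(annotations)
    return {label: counts[label] / total for label in labels}


def pooled_distribution(related_items, labels):
    """Pool annotations over items assumed (hypothetically) to be related."""
    pooled = [a for item in related_items for a in item]
    return label_distribution(pooled, labels)


# Hypothetical label set and annotations, for illustration only.
LABELS = ["complaint", "neutral", "praise"]
item_a = ["complaint", "neutral", "complaint", "praise", "complaint"]
item_b = ["neutral", "neutral", "complaint", "praise", "neutral"]

single = majority_label(item_a)
dist = label_distribution(item_a, LABELS)
pooled = pooled_distribution([item_a, item_b], LABELS)
```

Here `single` keeps only the winning label for `item_a`, while `dist` retains how strongly annotators disagreed. Pooling gives each item in the group a distribution estimated from more annotations than it received on its own.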

    Accurator: Nichesourcing for Cultural Heritage

    With more and more cultural heritage data being published online, their usefulness in this open context depends on the quality and diversity of descriptive metadata for collection objects. In many cases, existing metadata is not adequate for a variety of retrieval and research tasks and more specific annotations are necessary. However, eliciting such annotations is a challenge since it often requires domain-specific knowledge. While crowdsourcing can be successfully used for eliciting simple annotations, identifying people with the required expertise might prove troublesome for tasks requiring more complex or domain-specific knowledge. Nichesourcing addresses this problem by tapping into the expert knowledge available in niche communities. This paper presents Accurator, a methodology for conducting nichesourcing campaigns for cultural heritage institutions, by addressing communities, organizing events and tailoring a web-based annotation tool to a domain of choice. The contribution of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation tool for experts and 3) validation of the methodology and tool in three case studies. The three domains of the case studies are birds on art, Bible prints and fashion images. We compare the quality and quantity of obtained annotations in the three case studies, showing that the nichesourcing methodology in combination with the image annotation tool can be used to collect high-quality annotations in a variety of domains and annotation tasks. A user evaluation indicates the tool is suited and usable for domain-specific annotation tasks.

    Loud and Trendy: Crowdsourcing Impressions of Social Ambiance in Popular Indoor Urban Places

    New research cutting across architecture, urban studies, and psychology is contextualizing the understanding of urban spaces according to the perceptions of their inhabitants. One fundamental construct that relates place and experience is ambiance, which is defined as "the mood or feeling associated with a particular place". We posit that the systematic study of ambiance dimensions in cities is a new domain for which multimedia research can make pivotal contributions. We present a study to examine how images collected from social media can be used for the crowdsourced characterization of indoor ambiance impressions in popular urban places. We design a crowdsourcing framework to understand the suitability of social images as a data source to convey place ambiance, to examine what types of images are most suitable to describe ambiance, and to assess how people perceive places socially from the perspective of ambiance along 13 dimensions. Our study is based on 50,000 Foursquare images collected from 300 popular places across six cities worldwide. The results show that reliable estimates of ambiance can be obtained for several of the dimensions. Furthermore, we found that most aggregate impressions of ambiance are similar across popular places in all studied cities. We conclude by presenting a multidisciplinary research agenda for future research in this domain.

    When in doubt ask the crowd: leveraging collective intelligence for improving event detection and machine learning
