Human-in-the-Loop Learning From Crowdsourcing and Social Media
Computational social studies using public social media data have become increasingly popular because of the large amount of user-generated data available. The richness of social media data, coupled with noise and subjectivity, raises significant challenges for computationally studying social issues in a feasible and scalable manner. Machine learning problems are, as a result, often subjective or ambiguous when humans are involved. That is, humans solving the same problems might come to legitimate but completely different conclusions, based on their personal experiences and beliefs. When building supervised learning models, particularly when using crowdsourced training data, multiple annotations per data item are usually reduced to a single label representing ground truth. This inevitably hides a rich source of diversity and subjectivity of opinions about the labels.
Label distribution learning associates with each data item a probability distribution over the labels for that item, and can thus preserve the diversity of opinions, beliefs, etc. that conventional learning hides or ignores. We propose a human-in-the-loop learning framework to model and study large volumes of unlabeled subjective social media data with less human effort. We study various annotation tasks given to crowdsourced annotators and methods for aggregating their contributions in a manner that preserves subjectivity and disagreement. We introduce a strategy for learning label distributions with only five to ten labels per item by aggregating human-annotated labels over multiple, semantically related data items. We conduct experiments using our learning framework on data related to two subjective social issues (work and employment, and suicide prevention) that touch many people worldwide. Our methods can be applied to a broad variety of problems, particularly social problems. Our experimental results suggest that specific label aggregation methods can help provide reliable representative semantics at the population level.
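The contrast between a single ground-truth label and a label distribution can be illustrated with a minimal Python sketch. This is not the authors' implementation: the label names, the function names, and the assumption that semantically related items have already been grouped are all hypothetical, and the abstract does not say how that grouping is done.

```python
from collections import Counter

def label_distribution(annotations, labels):
    """Normalize one item's raw annotations into a probability distribution."""
    if not annotations:
        raise ValueError("at least one annotation is required")
    counts = Counter(annotations)
    total = len(annotations)
    # Labels that no annotator chose get probability 0.0.
    return {label: counts[label] / total for label in labels}

def majority_label(annotations):
    """Conventional reduction: keep only the most frequent label."""
    return Counter(annotations).most_common(1)[0][0]

def pooled_distribution(related_items, labels):
    """Pool annotations from several related items, then normalize.

    Assumption: the caller has already decided which items are related.
    """
    pooled = [a for annotations in related_items for a in annotations]
    return label_distribution(pooled, labels)

# Hypothetical label set and annotations, five annotators per item.
LABELS = ["positive", "negative", "neutral"]
item_a = ["positive", "positive", "negative", "neutral", "positive"]
item_b = ["negative", "negative", "positive", "neutral", "negative"]

dist_a = label_distribution(item_a, LABELS)
single = majority_label(item_a)
pooled = pooled_distribution([item_a, item_b], LABELS)

print(dist_a)
print(single)
print(pooled)
```

The majority label discards the minority votes that the distribution keeps, and pooling over related items gives each distribution more annotations to draw on than any single item provides.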
Accurator: Nichesourcing for Cultural Heritage
With more and more cultural heritage data being published online, their
usefulness in this open context depends on the quality and diversity of
descriptive metadata for collection objects. In many cases, existing metadata
is not adequate for a variety of retrieval and research tasks and more specific
annotations are necessary. However, eliciting such annotations is a challenge
since it often requires domain-specific knowledge. Where crowdsourcing can be
successfully used for eliciting simple annotations, identifying people with the
required expertise might prove troublesome for tasks requiring more complex or
domain-specific knowledge. Nichesourcing addresses this problem by tapping
into the expert knowledge available in niche communities. This paper presents
Accurator, a methodology for conducting nichesourcing campaigns for cultural
heritage institutions, by addressing communities, organizing events and
tailoring a web-based annotation tool to a domain of choice. The contribution
of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation
tool for experts and 3) validation of the methodology and tool in three case
studies. The three domains of the case studies are birds on art, bible prints
and fashion images. We compare the quality and quantity of obtained annotations
in the three case studies, showing that the nichesourcing methodology in
combination with the image annotation tool can be used to collect high-quality
annotations in a variety of domains and annotation tasks. A user evaluation
indicates the tool is suitable and usable for domain-specific annotation tasks.
Loud and Trendy: Crowdsourcing Impressions of Social Ambiance in Popular Indoor Urban Places
New research cutting across architecture, urban studies, and psychology is
contextualizing the understanding of urban spaces according to the perceptions
of their inhabitants. One fundamental construct that relates place and
experience is ambiance, which is defined as "the mood or feeling associated
with a particular place". We posit that the systematic study of ambiance
dimensions in cities is a new domain for which multimedia research can make
pivotal contributions. We present a study to examine how images collected from
social media can be used for the crowdsourced characterization of indoor
ambiance impressions in popular urban places. We design a crowdsourcing
framework to understand the suitability of social images as a data source to
convey place ambiance, to examine what types of images are most suitable to describe
ambiance, and to assess how people perceive places socially from the
perspective of ambiance along 13 dimensions. Our study is based on 50,000
Foursquare images collected from 300 popular places across six cities
worldwide. The results show that reliable estimates of ambiance can be obtained
for several of the dimensions. Furthermore, we found that most aggregate
impressions of ambiance are similar across popular places in all studied
cities. We conclude by presenting a multidisciplinary research agenda for
future research in this domain.