375 research outputs found
Affective colormap design for accurate visual comprehension in industrial tomography
The design of colormaps can help tomography operators obtain accurate visual comprehension, thereby assisting safety-critical decisions. The research presented here is about deploying colormaps that promote the best affective responses for industrial microwave tomography (MWT). To answer the two research questions related to our study, we first conducted a quantitative analysis of 11 frequently-used colormaps on a segmentation task. Second, we presented the same colormaps within a crowdsourced study comprising two parts to verify the quantitative outcomes. The first part encoded affective responses from participants into a prevailing four-quadrant valence–arousal grid; the second part recorded participant ratings of the accuracy of each colormap on MWT segmentation. We concluded that three colormaps are best suited to MWT tasks. We also found that the colormaps triggering emotions in the positive–exciting quadrant can facilitate more accurate visual comprehension than those in other affect-related quadrants. A synthetic colormap design guideline was consequently proposed.
Towards Best Experiment Design for Evaluating Dialogue System Output
To overcome the limitations of automated metrics (e.g. BLEU, METEOR) for
evaluating dialogue systems, researchers typically use human judgments to
provide convergent evidence. While it has been demonstrated that human
judgments can suffer from the inconsistency of ratings, extant research has
also found that the design of the evaluation task affects the consistency and
quality of human judgments. We conduct a between-subjects study to understand
the impact of four experiment conditions on human ratings of dialogue system
output. In addition to discrete and continuous scale ratings, we also
experiment with a novel application of Best-Worst scaling to dialogue
evaluation. Through our systematic study with 40 crowdsourced workers in each
task, we find that using continuous scales achieves more consistent ratings
than Likert scale or ranking-based experiment design. Additionally, we find
that factors such as time taken to complete the task and no prior experience of
participating in similar studies of rating dialogue system output positively
impact consistency and agreement amongst raters. Comment: Accepted at INLG 201
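The abstract above does not say how Best-Worst judgments are turned into scores. A minimal sketch, assuming the common counting procedure (times chosen best minus times chosen worst, divided by times shown), might look like the following; the function name `best_worst_scores` and the tuple layout are hypothetical and not taken from the paper.

```python
from collections import Counter


def best_worst_scores(judgments):
    """Score items from Best-Worst judgments by simple counting.

    `judgments` is a list of (items_shown, best_item, worst_item) tuples.
    This layout is an assumption made for illustration only.
    Each score lies in [-1, 1]: (times best - times worst) / times shown.
    """
    best = Counter()
    worst = Counter()
    shown = Counter()
    for items, best_item, worst_item in judgments:
        shown.update(items)      # every displayed item counts as one appearance
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}
```

For example, if three raters each see the same four responses, a response picked as best twice and never as worst receives a score of 2/3.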
Modelling affect for horror soundscapes
The feeling of horror within movies or games relies on the audience’s perception of a tense atmosphere — often achieved
through sound accompanied by the on-screen drama — guiding its emotional experience throughout the scene or game-play
sequence. These progressions are often crafted through a priori knowledge of how a scene or game-play sequence will play out, and
the intended emotional patterns a game director wants to transmit. The appropriate design of sound becomes even more challenging
once the scenery and the general context is autonomously generated by an algorithm. Towards realizing sound-based affective
interaction in games, this paper explores the creation of computational models capable of ranking short audio pieces based on
crowdsourced annotations of tension, arousal and valence. Affect models are trained via preference learning on over a thousand
annotations with the use of support vector machines, whose inputs are low-level features extracted from the audio assets of a
comprehensive sound library. The models constructed in this work are able to predict the tension, arousal and valence elicited by
sound, respectively, with an accuracy of approximately 65%, 66% and 72%. Peer-reviewed.
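Preference learning with a support vector machine is often set up by turning each "a is preferred over b" annotation into a feature-difference vector and fitting a linear model to those differences. The sketch below illustrates that general idea only; it is not the authors' pipeline. It assumes a linear kernel, trains by plain subgradient descent on the hinge loss, and all function names and hyperparameters are hypothetical.

```python
import random


def pairwise_differences(features, preferences):
    """Map each (preferred, other) index pair to a feature-difference vector."""
    return [[fa - fb for fa, fb in zip(features[a], features[b])]
            for a, b in preferences]


def train_linear_rank_svm(diffs, epochs=50, lr=0.05, reg=0.01, seed=0):
    """Fit w so that w . d > 0 for every difference vector d.

    Minimises reg/2 * |w|^2 + max(0, 1 - w . d) by stochastic subgradient
    descent. Learning rate, epochs and regularisation are assumed values.
    """
    rng = random.Random(seed)
    w = [0.0] * len(diffs[0])
    order = list(range(len(diffs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for k in order:
            d = diffs[k]
            margin = sum(wi * di for wi, di in zip(w, d))
            if margin < 1.0:
                # Pair is inside the margin: move w towards d, with shrinkage.
                w = [wi + lr * (di - reg * wi) for wi, di in zip(w, d)]
            else:
                # Pair is satisfied: only the regulariser contributes.
                w = [wi - lr * reg * wi for wi in w]
    return w


def pairwise_accuracy(w, diffs):
    """Fraction of preference pairs that the model orders correctly."""
    correct = sum(1 for d in diffs
                  if sum(wi * di for wi, di in zip(w, d)) > 0)
    return correct / len(diffs)
```

Accuracy here means the share of annotated pairs ranked in the annotated order, which is one common way to read accuracy figures reported for preference-learned models.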
Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs
Crowdsourcing platforms are now extensively used for conducting subjective
pairwise comparison studies. In this setting, a pairwise comparison dataset is
typically gathered via random sampling, either \emph{with} or \emph{without}
replacement. In this paper, we use tools from random graph theory to analyze
these two random sampling methods for the HodgeRank estimator. Using the
Fiedler value of the graph as a measurement for estimator stability
(informativeness), we provide a new estimate of the Fiedler value for these two
random graph models. In the asymptotic limit as the number of vertices tends to
infinity, we prove the validity of the estimate. Based on our findings, for a
small number of items to be compared, we recommend a two-stage sampling
strategy where a greedy sampling method is used initially and random sampling
\emph{without} replacement is used in the second stage. When a large number of
items is to be compared, we recommend random sampling with replacement as this
is computationally inexpensive and trivially parallelizable. Experiments on
synthetic and real-world datasets support our analysis.
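To make the two sampling schemes and the Fiedler value concrete, the sketch below draws comparison pairs with or without replacement, builds the graph Laplacian (repeated pairs add edge weight), and approximates the Fiedler value, i.e. the second-smallest Laplacian eigenvalue. The shifted power iteration is chosen here only to keep the example dependency-free; it is not the paper's estimator, and all function names are hypothetical.

```python
import itertools
import math
import random


def sample_pairs(n_items, n_samples, with_replacement, rng):
    """Draw unordered item pairs uniformly, with or without replacement."""
    all_pairs = list(itertools.combinations(range(n_items), 2))
    if with_replacement:
        return [rng.choice(all_pairs) for _ in range(n_samples)]
    # Raises ValueError if n_samples exceeds the number of distinct pairs.
    return rng.sample(all_pairs, n_samples)


def laplacian(n_items, pairs):
    """Graph Laplacian of the comparison multigraph (repeats add weight)."""
    lap = [[0.0] * n_items for _ in range(n_items)]
    for i, j in pairs:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap


def fiedler_value(lap, iters=500, seed=0):
    """Approximate the second-smallest eigenvalue of a Laplacian.

    Runs power iteration on (shift * I - lap) restricted to vectors
    orthogonal to the all-ones vector, then returns the Rayleigh quotient.
    Convergence can be slow when the spectral gap is small.
    """
    n = len(lap)
    rng = random.Random(seed)
    # All Laplacian eigenvalues are at most twice the largest degree.
    shift = 2.0 * max(lap[i][i] for i in range(n)) or 1.0
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the all-ones direction
        w = [shift * v[i] - sum(lap[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mean = sum(v) / n
    v = [x - mean for x in v]
    lap_v = [sum(lap[i][j] * v[j] for j in range(n)) for i in range(n)]
    denom = sum(x * x for x in v)
    if denom == 0.0:
        return 0.0
    return sum(v[i] * lap_v[i] for i in range(n)) / denom
```

A value of zero indicates a disconnected comparison graph, in which case a global ranking is not identifiable from the sampled pairs; larger values indicate a better-connected graph.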
The MediaEval 2016 Emotional Impact of Movies Task
Volume: 1739. Host publication title: MediaEval 2016 Multimedia Benchmark Workshop. Host publication sub-title: Working Notes Proceedings of the MediaEval 2016 Workshop. Non-peer-reviewed.
Loud and Trendy: Crowdsourcing Impressions of Social Ambiance in Popular Indoor Urban Places
New research cutting across architecture, urban studies, and psychology is
contextualizing the understanding of urban spaces according to the perceptions
of their inhabitants. One fundamental construct that relates place and
experience is ambiance, which is defined as "the mood or feeling associated
with a particular place". We posit that the systematic study of ambiance
dimensions in cities is a new domain for which multimedia research can make
pivotal contributions. We present a study to examine how images collected from
social media can be used for the crowdsourced characterization of indoor
ambiance impressions in popular urban places. We design a crowdsourcing
framework to understand the suitability of social images as a data source to convey
place ambiance, to examine what types of images are most suitable to describe
ambiance, and to assess how people perceive places socially from the
perspective of ambiance along 13 dimensions. Our study is based on 50,000
Foursquare images collected from 300 popular places across six cities
worldwide. The results show that reliable estimates of ambiance can be obtained
for several of the dimensions. Furthermore, we found that most aggregate
impressions of ambiance are similar across popular places in all studied
cities. We conclude by presenting a multidisciplinary research agenda for
future research in this domain.
Worker Retention, Response Quality, and Diversity in Microtask Crowdsourcing: An Experimental Investigation of the Potential for Priming Effects to Promote Project Goals
Online microtask crowdsourcing platforms act as efficient resources for delegating small units of work, gathering data, generating ideas, and more. Members of research and business communities have incorporated crowdsourcing into problem-solving processes. When human workers contribute to a crowdsourcing task, they are subject to various stimuli as a result of task design. Inter-task priming effects - through which work is nonconsciously, yet significantly, influenced by exposure to certain stimuli - have been shown to affect microtask crowdsourcing responses in a variety of ways. Instead of simply being wary of the potential for priming effects to skew results, task administrators can utilize proven priming procedures in order to promote project goals. In a series of three experiments conducted on Amazon’s Mechanical Turk, we investigated the effects of proposed priming treatments on worker retention, response quality, and response diversity. In our first two experiments, we studied the effect of initial response freedom on sustained worker participation and response quality. We expected that workers who were granted greater levels of freedom in an initial response would be stimulated to complete more work and deliver higher-quality work than workers originally constrained in their initial response possibilities. We found no significant relationship between the initial response freedom granted to workers and the amount of optional work they completed. The degree of initial response freedom also did not have a significant impact on subsequent response quality. However, the influence of inter-task effects was evident based on response tendencies for different question types. We found evidence that consistency in task structure may play a stronger role in promoting response quality than proposed priming procedures. In our final experiment, we studied the influence of a group-level priming treatment on response diversity.
Instead of varying task structure for different workers, we varied the degree of overlap in question content distributed to different workers in a group. We expected groups of workers that were exposed to more diverse preliminary question sets to offer greater diversity in response to a subsequent question. Although differences in response diversity were revealed, no consistent trend between question content overlap and response diversity prevailed. Nevertheless, combining consistent task structure with crowd-level priming procedures - to encourage diversity in inter-task effects across the crowd - offers an exciting path for future study.
The Use of Online Panel Data in Management Research: A Review and Recommendations
Management scholars have long depended on convenience samples to conduct research involving human participants. However, the past decade has seen an emergence of a new convenience sample: online panels and online panel participants. The data these participants provide—online panel data (OPD)—has been embraced by many management scholars owing to the numerous benefits it provides over “traditional” convenience samples. Despite those advantages, OPD has not been warmly received by all. Currently, there is a divide in the field over the appropriateness of OPD in management scholarship. Our review takes aim at the divide with the goal of providing a common understanding of OPD and its utility and providing recommendations regarding when and how to use OPD and how and where to publish it. To accomplish these goals, we inventoried and reviewed OPD use across 13 management journals spanning 2006 to 2017. Our search resulted in 804 OPD-based studies across 439 articles. Notably, our search also identified 26 online panel platforms (“brokers”) used to connect researchers with online panel participants. Importantly, we offer specific guidance to authors, reviewers, and editors, having implications for both micro and macro management scholars.
An Image Is Worth More than a Thousand Favorites: Surfacing the Hidden Beauty of Flickr Pictures
The dynamics of attention in social media tend to obey power laws. Attention
concentrates on a relatively small number of popular items and neglects the
vast majority of content produced by the crowd. Although popularity can be an
indication of the perceived value of an item within its community, previous
research has hinted at the fact that popularity is distinct from intrinsic
quality. As a result, content with low visibility but high quality lurks in the
tail of the popularity distribution. This phenomenon can be particularly
evident in the case of photo-sharing communities, where valuable photographers
who are not highly engaged in online social interactions contribute
high-quality pictures that remain unseen. We propose to use a computer vision
method to surface beautiful pictures from the immense pool of
near-zero-popularity items, and we test it on a large dataset of
creative-commons photos on Flickr. By gathering a large crowdsourced ground
truth of aesthetics scores for Flickr images, we show that our method retrieves
photos whose median perceived beauty score is equal to that of the most popular ones,
and whose average is lower by only 1.5%. Comment: ICWSM 201