Optimization in Knowledge-Intensive Crowdsourcing
We present SmartCrowd, a framework for optimizing collaborative
knowledge-intensive crowdsourcing. SmartCrowd distinguishes itself by
accounting for human factors in the process of assigning tasks to workers.
Human factors designate workers' expertise in different skills, their expected
minimum wage, and their availability. In SmartCrowd, we formulate task
assignment as an optimization problem, and rely on pre-indexing workers and
maintaining the indexes adaptively, so that the task assignment
process is optimized in both quality and computation time. We
present rigorous theoretical analyses of the optimization problem and propose
optimal and approximation algorithms. We finally perform extensive performance
and quality experiments using real and synthetic data to demonstrate that
adaptive indexing in SmartCrowd is necessary to achieve efficient, high-quality
task assignment. Comment: 12 pages
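As a rough illustration of task assignment under the three human factors named above (skill, expected minimum wage, availability), the following is a minimal greedy sketch. It is not SmartCrowd's algorithm and does not model its adaptive indexes: the `Worker` fields, the additive quality model, and the `min_quality` and `budget` parameters are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    skill: float      # assumed expertise score for the task's skill, in [0, 1]
    wage: float       # minimum wage the worker expects
    available: bool   # whether the worker can take the task now


def greedy_assign(workers, min_quality, budget):
    """Greedy heuristic (hypothetical): take available workers in order of
    skill per unit wage until the summed skill reaches min_quality.

    Returns the chosen worker names, or None if this heuristic cannot reach
    min_quality within the budget.
    """
    candidates = sorted(
        (w for w in workers if w.available and w.wage > 0),
        key=lambda w: w.skill / w.wage,
        reverse=True,
    )
    chosen, quality, cost = [], 0.0, 0.0
    for w in candidates:
        if quality >= min_quality:
            break
        if cost + w.wage > budget:
            continue  # this worker is too expensive; try a cheaper one
        chosen.append(w.name)
        quality += w.skill
        cost += w.wage
    return chosen if quality >= min_quality else None


workers = [
    Worker("a", 0.9, 3.0, True),
    Worker("b", 0.5, 1.0, True),
    Worker("c", 0.8, 2.0, False),
    Worker("d", 0.4, 2.0, True),
]
team = greedy_assign(workers, min_quality=1.2, budget=4.0)
```

A greedy pass like this gives no optimality guarantee; it only shows how the three factors enter the problem as a filter (availability), a cost (wage), and an objective term (skill).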
Training Generative Question-Answering on Synthetic Data Obtained from an Instruct-tuned Model
This paper presents a simple and cost-effective method for synthesizing data
to train question-answering systems. Fine-tuning GPT models is a
common practice in resource-rich languages like English; however, it becomes
challenging for non-English languages due to the scarcity of sufficient
question-answer (QA) pairs. Existing approaches use question and answer
generators trained on human-authored QA pairs, which incurs substantial human
cost. In contrast, we use an instruct-tuned model to generate QA pairs in a
zero-shot or few-shot manner. We conduct experiments to compare various
strategies for obtaining QA pairs from the instruct-tuned model. The results
demonstrate that a model trained on our proposed synthetic data achieves
comparable performance to a model trained on manually curated datasets, without
incurring human costs. Comment: PACLIC 2023 short paper, 4 pages (6 pages including references), 4
figures
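The zero-shot and few-shot settings mentioned above differ only in whether worked examples precede the target context in the prompt. The sketch below shows one way such prompts could be assembled and the model output parsed. The instruction wording, the `Context:/Question:/Answer:` layout, and both function names are hypothetical; the abstract does not specify the prompt format, and no model call is made here.

```python
def build_qa_prompt(context, examples=()):
    """Build a prompt asking an instruct-tuned model for one QA pair.

    `examples` is a sequence of (context, question, answer) triples.
    An empty sequence gives a zero-shot prompt; otherwise few-shot.
    """
    parts = ["Write a question and its answer based on the context."]
    for ex_context, ex_question, ex_answer in examples:
        parts.append(
            f"Context: {ex_context}\nQuestion: {ex_question}\nAnswer: {ex_answer}"
        )
    # The prompt ends at "Question:" so the model continues from there.
    parts.append(f"Context: {context}\nQuestion:")
    return "\n\n".join(parts)


def parse_qa(completion):
    """Split a completion of the assumed form '<question>\\nAnswer: <answer>'.

    Returns (question, answer), or None if no answer marker is present.
    """
    question, sep, answer = completion.partition("\nAnswer:")
    if not sep:
        return None
    return question.strip(), answer.strip()


zero_shot = build_qa_prompt("The Danube flows through Belgrade.")
few_shot = build_qa_prompt(
    "The Danube flows through Belgrade.",
    examples=[("Paris is in France.", "Where is Paris?", "In France.")],
)
```

Completions that fail to parse would simply be discarded before the pairs are used as training data.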
Building English-to-Serbian machine translation system for IMDb movie reviews
This paper reports the results of the first experiment dealing with the challenges of building a machine translation system for user-generated content involving a complex South Slavic language. We focus on translating English IMDb user movie reviews into Serbian in a low-resource scenario. We explore the potential and limits of (i) phrase-based and neural machine translation systems trained on out-of-domain clean parallel data from news articles, and (ii) creating an additional synthetic in-domain parallel corpus by machine-translating the English IMDb corpus into Serbian. Our main finding is that morphology and syntax are handled better by the neural approach than by the phrase-based approach, even in this low-resource, mismatched-domain scenario. The situation is different for the lexical aspect, however, especially for person names. This finding also indicates that, in general, machine translation of person names into Slavic languages (especially those which require or allow transcription) should be investigated more systematically.
Who Is the Note-Worthy Fan? Featuring Players in the Official Facebook Communication of Mainstream Video Games
Video game fans participate in the official promotion of video games, either voluntarily or unwillingly, when their fanworks are appropriated and used by video game publishers. The article provides a quantitative overview of the presence of fans in the official social media profiles of four selected mainstream games (Dragon Age: Inquisition, Evolve, Mortal Kombat X and The Witcher 3: Wild Hunt) during a one-year period from August 2014 to July 2015. Combining the traditional method of content analysis with Facebook data mining, we explore how frequently fans appear in social media (including questions of various forms of fanworks and gender) and what user activity is generated by posts featuring fans and fan creations. Results show that fans or their fanworks are featured in 8–24% of all posts, depending on the game, and that in the most common categories, painting and cosplay, they generate a level of user engagement comparable to that of traditional promotional posts.
I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision
Many high-level skills that are required for computer vision tasks, such as
parsing questions, comparing and contrasting semantics, and writing
descriptions, are also required in other domains such as natural language
processing. In this paper, we ask whether it is possible to learn those skills
from text data and then transfer them to vision tasks without ever training on
visual training data. Key to our approach is exploiting the joint embedding
space of contrastively trained vision and language encoders. In practice, there
can be systematic differences between embedding spaces for different modalities
in contrastive models, and we analyze how these differences affect our approach
and study strategies to mitigate this concern. We produce models using only
text training data on four representative tasks: image captioning, visual
entailment, visual question answering and visual news captioning, and evaluate
them on standard benchmarks using images. We find these models perform close to
models trained on images, while surpassing prior work for captioning and visual
entailment in this text-only setting by over 9 points, and outperforming all
prior work on visual news by over 30 points. We also showcase a variety of
stylistic image captioning models that are trained using no image data and no
human-curated language data, but instead using readily-available text data from
books, the web, or language models. Comment: website (https://prior.allenai.org/projects/close), code
(https://github.com/allenai/close)
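To make the idea of training on text embeddings and testing on image embeddings more concrete, the sketch below perturbs a unit-normalized text embedding with Gaussian noise, so that a downstream model trained on such vectors is less sensitive to the systematic offset between modalities. This is one possible mitigation, assumed here for illustration; the abstract does not state which strategies the authors study, and the function names and `noise_std` parameter are hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def noisy_text_embedding(text_vec, noise_std, rng):
    """Return a unit-length text embedding with Gaussian noise added.

    During text-only training, a vector like this would stand in for the
    image embedding that the model receives at test time.
    """
    unit = l2_normalize(text_vec)
    perturbed = [x + rng.gauss(0.0, noise_std) for x in unit]
    return l2_normalize(perturbed)


rng = random.Random(0)          # seeded for reproducibility
clean = noisy_text_embedding([3.0, 4.0], 0.0, rng)
noisy = noisy_text_embedding([3.0, 4.0], 0.1, rng)
```

With `noise_std` set to zero the function reduces to plain normalization; larger values trade fidelity to the text embedding for robustness to the gap between modalities.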
VERITE: A Robust Benchmark for Multimodal Misinformation Detection Accounting for Unimodal Bias
Multimedia content has become ubiquitous on social media platforms, leading
to the rise of multimodal misinformation (MM) and the urgent need for effective
strategies to detect and prevent its spread. In recent years, the challenge of
multimodal misinformation detection (MMD) has garnered significant attention from
researchers and has mainly involved the creation of annotated, weakly
annotated, or synthetically generated training datasets, along with the
development of various deep learning MMD models. However, the problem of
unimodal bias in MMD benchmarks -- where biased or unimodal methods outperform
their multimodal counterparts on an inherently multimodal task -- has been
overlooked. In this study, we systematically investigate and identify the
presence of unimodal bias in widely-used MMD benchmarks (VMU-Twitter, COSMOS),
raising concerns about their suitability for reliable evaluation. To address
this issue, we introduce the "VERification of Image-TExt pairs" (VERITE)
benchmark for MMD which incorporates real-world data, excludes "asymmetric
multimodal misinformation" and utilizes "modality balancing". We conduct an
extensive comparative study with a Transformer-based architecture that shows
the ability of VERITE to effectively address unimodal bias, rendering it a
robust evaluation framework for MMD. Furthermore, we introduce a new method --
termed Crossmodal HArd Synthetic MisAlignment (CHASMA) -- for generating
realistic synthetic training data that preserve crossmodal relations between
legitimate images and false human-written captions. By leveraging CHASMA in the
training process, we observe consistent and notable improvements in predictive
performance on VERITE, with a 9.2% increase in accuracy. We release our code
at: https://github.com/stevejpapad/image-text-verificatio
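Unimodal bias, as described above, shows up when a model that sees only one modality scores at least as well as a multimodal model on the same benchmark. The sketch below is a minimal check of that condition from prediction lists. The function names, the label encoding, and the toy predictions are hypothetical and are not taken from the VERITE code or results.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def unimodal_bias_report(labels, multimodal_preds, unimodal_preds):
    """Compare each unimodal model against the multimodal one.

    `unimodal_preds` maps a model name (e.g. "text-only") to its predictions.
    Returns the multimodal accuracy and a per-model report.
    """
    mm_acc = accuracy(multimodal_preds, labels)
    report = {}
    for name, preds in unimodal_preds.items():
        acc = accuracy(preds, labels)
        report[name] = {"accuracy": acc, "matches_or_beats_multimodal": acc >= mm_acc}
    return mm_acc, report


# Toy labels: 1 = misinformation, 0 = truthful (illustrative values only).
labels = [1, 0, 1, 0]
mm_acc, report = unimodal_bias_report(
    labels,
    multimodal_preds=[1, 0, 0, 1],
    unimodal_preds={"text-only": [1, 0, 1, 1], "image-only": [0, 1, 0, 1]},
)
```

A benchmark on which the flag is raised for some unimodal model is a candidate for the kind of bias the abstract describes, although a real study would also account for variance across runs.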