6 research outputs found
Migration as Submodular Optimization
Migration presents sweeping societal challenges that have recently attracted
significant attention from the scientific community. One of the prominent
approaches that have been suggested employs optimization and machine learning
to match migrants to localities in a way that maximizes the expected number of
migrants who find employment. However, it relies on a strong additivity
assumption that, we argue, does not hold in practice, due to competition
effects; we propose to enhance the data-driven approach by explicitly
optimizing for these effects. Specifically, we cast our problem as the
maximization of an approximately submodular function subject to matroid
constraints, and prove that the worst-case guarantees given by the classic
greedy algorithm extend to this setting. We then present three different models
for competition effects, and show that they all give rise to submodular
objectives. Finally, we demonstrate via simulations that our approach leads to
significant gains across the board. Comment: Simulation code is available at https://github.com/pgoelz/migration
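The greedy approach mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code (their simulation code is at the URL above). The competition model here is a hypothetical stand-in: each locality has a fixed number of jobs, and expected employment there is capped at that number. The names `PROB`, `JOBS`, `expected_employed`, `one_locality_per_migrant`, and `greedy_maximize` are all illustrative assumptions, as are the toy numbers. The toy objective is exactly submodular, whereas the paper treats the approximately submodular case.

```python
from typing import Callable, Hashable, Iterable, Set

# Hypothetical toy data: employment probability of each migrant in each locality.
PROB = {
    ("a", "X"): 0.9, ("a", "Y"): 0.5,
    ("b", "X"): 0.8, ("b", "Y"): 0.3,
    ("c", "X"): 0.2, ("c", "Y"): 0.4,
}
# Hypothetical number of available jobs per locality (the source of competition).
JOBS = {"X": 1.0, "Y": 1.0}


def expected_employed(assignment: Set[Hashable]) -> float:
    """Assumed objective: per-locality employment is capped by available jobs.

    A capped sum of non-negative weights is monotone and submodular.
    """
    load = {loc: 0.0 for loc in JOBS}
    for migrant, loc in assignment:
        load[loc] += PROB[(migrant, loc)]
    return sum(min(JOBS[loc], load[loc]) for loc in JOBS)


def one_locality_per_migrant(assignment: Set[Hashable]) -> bool:
    """Independence oracle of a partition matroid: each migrant is placed at most once."""
    migrants = [m for m, _ in assignment]
    return len(migrants) == len(set(migrants))


def greedy_maximize(
    ground_set: Iterable[Hashable],
    objective: Callable[[Set[Hashable]], float],
    is_independent: Callable[[Set[Hashable]], bool],
) -> Set[Hashable]:
    """Repeatedly add the feasible element with the largest positive marginal gain."""
    chosen: Set[Hashable] = set()
    remaining = set(ground_set)
    while remaining:
        base = objective(chosen)
        best, best_gain = None, 0.0
        for element in sorted(remaining):  # sorted only for deterministic iteration
            candidate = chosen | {element}
            if not is_independent(candidate):
                continue
            gain = objective(candidate) - base
            if gain > best_gain:
                best, best_gain = element, gain
        if best is None:  # nothing feasible improves the objective
            break
        chosen.add(best)
        remaining.discard(best)
    return chosen


assignment = greedy_maximize(PROB.keys(), expected_employed, one_locality_per_migrant)
value = expected_employed(assignment)
```

In this toy instance, sending each migrant to the locality where their individual probability is highest would place both "a" and "b" in locality "X", where they compete for a single job. The greedy procedure accounts for that through marginal gains, although on small instances like this one it is not guaranteed to find the optimum.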
Using microtasks to crowdsource DBpedia entity classification: A study in workflow design
DBpedia is at the core of the Linked Open Data Cloud and widely used in research and applications. However, it is far from perfect. Its content suffers from many flaws, as a result of factual errors inherited from Wikipedia or incomplete mappings from Wikipedia infoboxes to the DBpedia ontology. In this work we focus on one class of such problems, untyped entities. We propose a hierarchical tree-based approach to categorize DBpedia entities according to the DBpedia ontology using human computation and paid microtasks. We analyse the main dimensions of the crowdsourcing exercise in depth in order to come up with suggestions for workflow design, and study three different workflows with automatic and hybrid prediction mechanisms to select possible candidates for the most specific category from the DBpedia ontology. To test our approach, we run experiments on CrowdFlower using a gold standard dataset of 120 previously unclassified entities. In our studies, human-computation-driven approaches generally achieved higher precision at lower cost when compared to workflows with automatic predictors. However, each of the tested workflows has its merit, and none of them seems to perform exceptionally well on the entities that the DBpedia Extraction Framework fails to classify. We discuss these findings and their potential implications for the design of effective crowdsourced entity classification in DBpedia and beyond.
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
We investigate the potential implications of large language models (LLMs),
such as Generative Pre-trained Transformers (GPTs), on the U.S. labor market,
focusing on the increased capabilities arising from LLM-powered software
compared to LLMs on their own. Using a new rubric, we assess occupations based
on their alignment with LLM capabilities, integrating both human expertise and
GPT-4 classifications. Our findings reveal that around 80% of the U.S.
workforce could have at least 10% of their work tasks affected by the
introduction of LLMs, while approximately 19% of workers may see at least 50%
of their tasks impacted. We do not make predictions about the development or
adoption timeline of such LLMs. The projected effects span all wage levels,
with higher-income jobs potentially facing greater exposure to LLM capabilities
and LLM-powered software. Significantly, these impacts are not restricted to
industries with higher recent productivity growth. Our analysis suggests that,
with access to an LLM, about 15% of all worker tasks in the US could be
completed significantly faster at the same level of quality. When incorporating
software and tooling built on top of LLMs, this share increases to between 47
and 56% of all tasks. This finding implies that LLM-powered software will have
a substantial effect on scaling the economic impacts of the underlying models.
We conclude that LLMs such as GPTs exhibit traits of general-purpose
technologies, indicating that they could have considerable economic, social,
and policy implications.
Generalized task markets for human and machine computation
We discuss challenges and opportunities for developing generalized task markets where human and machine intelligence are enlisted to solve problems, based on a consideration of the competencies, availabilities, and pricing of different problem-solving resources. The approach couples human computation with machine learning and planning, and is aimed at optimizing the flow of subtasks to people and to computational problem solvers. We illustrate key ideas in the context of Lingua Mechanica, a project focused on harnessing human and machine translation skills to perform translation among languages. We present infrastructure and methods for enlisting and guiding human and machine computation for language translation, including details about the hardness of generating plans for assigning tasks to solvers. Finally, we discuss studies performed with machine and human solvers, focusing on components of a Lingua Mechanica prototype.