Shrinkage Estimators in Online Experiments
We develop and analyze empirical Bayes Stein-type estimators for use in the
estimation of causal effects in large-scale online experiments. While online
experiments are generally thought to be distinguished by their large sample
size, we focus on the multiplicity of treatment groups. The typical analysis
practice is to use simple differences-in-means (perhaps with covariate
adjustment) as if all treatment arms were independent. In this work we develop
consistent, low-bias shrinkage estimators for this setting. In addition to
achieving lower mean squared error, these estimators retain important
frequentist properties such as coverage under most reasonable scenarios. Modern
sequential methods of experimentation and optimization such as multi-armed
bandit optimization (where treatment allocations adapt over time to prior
responses) benefit from the use of our shrinkage estimators. Exploration under
empirical Bayes focuses more efficiently on near-optimal arms, improving the
resulting decisions made under uncertainty. We demonstrate these properties by
examining seventeen large-scale experiments conducted on Facebook from April to
June 2017.
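The kind of empirical Bayes shrinkage described above can be illustrated with a minimal sketch. This is not the authors' estimator; it is a standard James-Stein-style partial-pooling scheme, with a method-of-moments estimate of the between-arm variance, and the function name and inputs are hypothetical:

```python
import numpy as np

def shrink_arm_estimates(means, variances):
    """Illustrative James-Stein-style shrinkage of per-arm
    treatment effect estimates toward their grand mean.

    means     : per-arm difference-in-means estimates
    variances : their sampling variances
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    grand_mean = means.mean()
    # Method-of-moments estimate of the between-arm variance tau^2
    tau2 = max(0.0, np.var(means, ddof=1) - variances.mean())
    # Per-arm shrinkage weight: noisier arms are pulled harder
    # toward the grand mean (weight closer to 0)
    weight = tau2 / (tau2 + variances)
    return grand_mean + weight * (means - grand_mean)
```

Each estimate is pulled toward the pooled mean in proportion to how noisy it is, which is what yields the lower mean squared error across many arms.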
Online Maximum k-Coverage
We study an online model for the maximum k-vertex-coverage problem, where given a graph G = (V,E) and an integer k, we ask for a subset A ⊆ V such that |A| = k and the number of edges covered by A is maximized. In our model, at each step i, a new vertex vi is revealed, and we have to decide whether we will keep it or discard it. At any time of the process, only k vertices can be kept in memory; if at some point the current solution already contains k vertices, the inclusion of any new vertex in the solution must entail the irremediable deletion of one vertex of the current solution (a vertex not kept when revealed is irremediably deleted). We propose algorithms for several natural classes of graphs (mainly regular and bipartite), improving on an easy 1/2-competitive ratio. We next settle a set-version of the problem, called the maximum k-(set)-coverage problem. For this problem we present an algorithm that improves upon former results for the same model for small and moderate values of k.
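The online model above can be sketched with a simple greedy-swap heuristic: keep up to k vertices, and on each arrival swap in the new vertex only if replacing some kept vertex strictly increases the number of covered edges. This is an illustrative baseline, not the paper's algorithm, and it tracks the full adjacency of revealed vertices for counting, which ignores the model's strict memory constraint:

```python
def covered(A, adj):
    """Number of distinct edges with at least one endpoint in A."""
    return len({frozenset((u, v)) for v in A for u in adj[v]})

def online_max_k_coverage(stream, k):
    """Greedy-swap sketch: stream yields (vertex, neighbors) pairs,
    where neighbors are among previously revealed vertices."""
    adj = {}
    kept = set()
    for v, nbrs in stream:
        adj[v] = set(nbrs)
        for u in nbrs:
            adj.setdefault(u, set()).add(v)
        if len(kept) < k:
            kept.add(v)
            continue
        base = covered(kept, adj)
        # Find the kept vertex whose replacement by v gains the most
        best, best_gain = None, 0
        for u in kept:
            gain = covered((kept - {u}) | {v}, adj) - base
            if gain > best_gain:
                best, best_gain = u, gain
        if best is not None:
            kept.remove(best)
            kept.add(v)
    return kept
```

Since a discarded vertex can never return, a heuristic like this can be forced into bad swaps by an adversarial arrival order, which is why the competitive-ratio analysis in the paper is non-trivial.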
Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation
Existing approaches to automatic VerbNet-style verb classification are
heavily dependent on feature engineering and therefore limited to languages
with mature NLP pipelines. In this work, we propose a novel cross-lingual
transfer method for inducing VerbNets for multiple languages. To the best of
our knowledge, this is the first study which demonstrates how the architectures
for learning word embeddings can be applied to this challenging
syntactic-semantic task. Our method uses cross-lingual translation pairs to tie
each of the six target languages into a bilingual vector space with English,
jointly specialising the representations to encode the relational information
from English VerbNet. A standard clustering algorithm is then run on top of the
VerbNet-specialised representations, using vector dimensions as features for
learning verb classes. Our results show that the proposed cross-lingual
transfer approach sets new state-of-the-art verb classification performance
across all six target languages explored in this work.
Comment: EMNLP 2017 (long paper)
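The final clustering step over specialised verb vectors can be illustrated with a minimal sketch. The paper only says a "standard clustering algorithm" is run over the representations; k-means is used here as a stand-in, the implementation is a bare-bones Lloyd iteration, and the verb vectors themselves are hypothetical:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal k-means (Lloyd's algorithm) over row vectors of X,
    standing in for the clustering step over verb embeddings."""
    rng = np.random.default_rng(seed)
    # Initialise centers at k distinct data points
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each vector to its nearest center
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # Move each center to the mean of its assigned vectors
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

In the paper's setting, each row of X would be a VerbNet-specialised vector for a target-language verb, and the induced labels are the candidate verb classes.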