Outfit Completion via Conditional Set Transformation
In this paper, we formulate the outfit completion problem as a set retrieval
task and propose a novel framework for solving this problem. The proposal
includes a conditional set transformation architecture with deep neural
networks and a compatibility-based regularization method. The proposed method
uses a map that is permutation-invariant with respect to the input set and
permutation-equivariant with respect to the condition set. This allows retrieving a set
that is compatible with the input set while reflecting the properties of the
condition set. In addition, since this structure outputs all elements of the
output set in a single inference pass, its inference speed scales well
with respect to the cardinality of the output set. Experimental results on real
data reveal that the proposed method outperforms existing approaches in terms
of accuracy of the outfit completion task, condition satisfaction, and
compatibility of completion results.
Comment: 8 pages, 8 figures
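The invariance and equivariance properties described in this abstract can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the function `conditional_set_map`, the weight matrices, and the feature sizes are all hypothetical, chosen only to show a map whose output ignores the order of the input set and follows the order of the condition set, producing every output element in one forward pass.

```python
import numpy as np

# Hypothetical feature sizes and random weights, for illustration only.
D_FEAT, D_HID = 4, 8
_rng = np.random.default_rng(0)
W_PHI = _rng.normal(size=(D_FEAT, D_HID))   # embeds input-set items
W_COND = _rng.normal(size=(D_FEAT, D_HID))  # embeds condition-set items
W_OUT = _rng.normal(size=(D_HID, D_FEAT))   # maps hidden states to outputs


def conditional_set_map(input_set, condition_set):
    """Map (n, D_FEAT) inputs and (m, D_FEAT) conditions to (m, D_FEAT) outputs."""
    # Mean pooling over items makes the summary independent of input order.
    pooled = np.tanh(input_set @ W_PHI).mean(axis=0)
    # Each condition item is processed with the same weights plus the shared
    # summary, so reordering conditions reorders the outputs the same way.
    hidden = np.tanh(condition_set @ W_COND + pooled)
    # All m output elements are produced at once, with no sequential decoding.
    return hidden @ W_OUT


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    items = data_rng.normal(size=(5, D_FEAT))       # stand-in for a partial outfit
    conditions = data_rng.normal(size=(3, D_FEAT))  # stand-in for target conditions
    print(conditional_set_map(items, conditions).shape)
```

Mean pooling is only one of several order-independent aggregations; attention-based pooling would serve the same purpose in this sketch.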
Policy-Adaptive Estimator Selection for Off-Policy Evaluation
Off-policy evaluation (OPE) aims to accurately evaluate the performance of
counterfactual policies using only offline logged data. Although many
estimators have been developed, there is no single estimator that dominates the
others, because the estimators' accuracy can vary greatly depending on the
characteristics of a given OPE task, such as the evaluation policy, number of
actions, and noise level.
Thus, the data-driven estimator selection problem is becoming increasingly
important and can have a significant impact on the accuracy of OPE. However,
identifying the most accurate estimator using only the logged data is quite
challenging because the ground-truth estimation accuracy of estimators is
generally unavailable. This paper is the first to study this estimator
selection problem for OPE. In particular, we enable an estimator
selection that is adaptive to a given OPE task, by appropriately subsampling
available logged data and constructing pseudo policies useful for the
underlying estimator selection task. Comprehensive experiments on both
synthetic and real-world company data demonstrate that the proposed procedure
substantially improves the estimator selection compared to a non-adaptive
heuristic.
Comment: accepted at AAAI'2
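To make the OPE setting in this abstract concrete, the sketch below implements inverse propensity scoring (IPS), one standard OPE estimator, in a context-free setting. It is not the estimator selection procedure proposed in the paper; the function name `ips_estimate`, the two-action setup, and the simulated reward probabilities are assumptions made only for illustration.

```python
import numpy as np


def ips_estimate(actions, rewards, pi_b, pi_e):
    """Estimate the value of policy pi_e from data logged under policy pi_b.

    actions: integer array of logged actions
    rewards: array of observed rewards, same length as actions
    pi_b, pi_e: arrays of action probabilities (context-free, an assumption)
    """
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)
    # Reweight each logged reward by how much more (or less) likely the
    # evaluation policy is to take the logged action than the behavior policy.
    weights = np.asarray(pi_e)[actions] / np.asarray(pi_b)[actions]
    return float(np.mean(weights * rewards))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pi_b = np.array([0.5, 0.5])       # hypothetical logging policy
    pi_e = np.array([0.9, 0.1])       # hypothetical evaluation policy
    reward_prob = np.array([0.8, 0.2])  # hypothetical expected reward per action
    logged_actions = rng.choice(2, size=10_000, p=pi_b)
    logged_rewards = rng.binomial(1, reward_prob[logged_actions])
    print(ips_estimate(logged_actions, logged_rewards, pi_b, pi_e))
```

IPS is unbiased when the logging probabilities are known and nonzero for every action the evaluation policy can take, but its variance grows with the importance weights, which is one reason no single estimator is best across OPE tasks.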