Hierarchical Text Classification with Reinforced Label Assignment
While existing hierarchical text classification (HTC) methods attempt to
capture label hierarchies for model training, they either make local decisions
regarding each label or completely ignore the hierarchy information during
inference. To resolve the mismatch between training and inference and to
model label dependencies in a more principled way, we formulate HTC as a
Markov decision process and propose to learn a Label Assignment Policy via deep
reinforcement learning to determine where to place an object and when to stop
the assignment process. The proposed method, HiLAP, explores the hierarchy
during both training and inference time in a consistent manner and makes
inter-dependent decisions. As a general framework, HiLAP can incorporate
different neural encoders as base models for end-to-end training. Experiments
on five public datasets and four base models show that HiLAP yields an average
improvement of 33.4% in Macro-F1 over flat classifiers and outperforms
state-of-the-art HTC methods by a large margin. Data and code can be found at
https://github.com/morningmoni/HiLAP.
Comment: EMNLP 201
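The label assignment process described in the abstract above (choose where to place an object, then decide when to stop) can be illustrated with a minimal sketch. This is not the authors' implementation: the `STOP` token, the `candidate_actions` and `assign_labels` helpers, and the hand-written `score` function are all hypothetical, and a greedy argmax stands in for the policy that HiLAP learns with deep reinforcement learning.

```python
from typing import Callable, Dict, List, Set

STOP = "<stop>"  # hypothetical token for the "stop assigning" action


def candidate_actions(hierarchy: Dict[str, List[str]],
                      assigned: Set[str],
                      root: str = "root") -> List[str]:
    """Actions available in the current state: unassigned children of the
    root or of any already-assigned label, plus the stop action."""
    frontier: List[str] = []
    for parent in [root, *sorted(assigned)]:
        for child in hierarchy.get(parent, []):
            if child not in assigned and child not in frontier:
                frontier.append(child)
    return frontier + [STOP]


def assign_labels(hierarchy: Dict[str, List[str]],
                  score: Callable[[Set[str], str], float],
                  max_steps: int = 10,
                  root: str = "root") -> Set[str]:
    """Run one episode: repeatedly take the highest-scoring action until
    the stop action wins or max_steps is reached (greedy stand-in for a
    learned policy)."""
    assigned: Set[str] = set()
    for _ in range(max_steps):
        actions = candidate_actions(hierarchy, assigned, root)
        best = max(actions, key=lambda a: score(assigned, a))
        if best == STOP:
            break
        assigned.add(best)
    return assigned


# Toy hierarchy and a fixed score table (assumed values, for illustration only).
toy_hierarchy = {"root": ["science", "sports"],
                 "science": ["physics", "biology"]}
toy_scores = {"science": 0.9, "physics": 0.8, STOP: 0.5,
              "biology": 0.2, "sports": 0.1}
labels = assign_labels(toy_hierarchy, lambda state, a: toy_scores[a])
```

Because each step only offers children of labels that were already assigned, the same exploration of the hierarchy is used whether the scores come from training rollouts or from inference, which is the consistency the abstract emphasizes.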
A Survey on Datasets for Decision-making of Autonomous Vehicle
Autonomous vehicles (AV) are expected to reshape future transportation
systems, and decision-making is one of the critical modules toward high-level
automated driving. To handle complicated scenarios that rule-based
methods cannot cope with well, data-driven decision-making approaches have
attracted growing attention. The datasets used in developing data-driven
methods strongly influence decision-making performance, so a
comprehensive overview of the existing datasets is necessary. By
collection source, driving data can be divided into vehicle-,
environment-, and driver-related data. This study compares the state-of-the-art
datasets of these three categories and summarizes their features including
sensors used, annotation, and driving scenarios. Based on the characteristics
of the datasets, this survey also outlines their potential applications to
various aspects of AV decision-making, helping researchers
find appropriate ones to support their own research. The future trends of AV
dataset development are summarized.
Rec4Ad: A Free Lunch to Mitigate Sample Selection Bias for Ads CTR Prediction in Taobao
Click-Through Rate (CTR) prediction serves as a fundamental component in
online advertising. A common practice is to train a CTR model on advertisement
(ad) impressions with user feedback. Since ad impressions are purposely
selected by the model itself, their distribution differs from the inference
distribution and thus exhibits sample selection bias (SSB) that affects model
performance. Existing studies on SSB mainly employ sample re-weighting
techniques which suffer from high variance and poor model calibration. Another
line of work relies on costly uniform data that is inadequate to train
industrial models. Thus mitigating SSB in industrial models with a
uniform-data-free framework is worth exploring. Fortunately, many platforms
display mixed results of organic items (i.e., recommendations) and sponsored
items (i.e., ads) to users, where impressions of ads and recommendations are
selected by different systems but share the same user decision rationales.
Based on these characteristics, we propose to leverage recommendation
samples as a free lunch to mitigate SSB for the ads CTR model (Rec4Ad). After
elaborating data augmentation, Rec4Ad learns disentangled representations with
alignment and decorrelation modules for enhancement. When deployed in the Taobao
display advertising system, Rec4Ad achieves substantial gains in key business
metrics, with a lift of up to +6.6% CTR and +2.9% RPM.
End-to-End Reinforcement Learning for Automatic Taxonomy Induction
We present a novel end-to-end reinforcement learning approach to automatic
taxonomy induction from a set of terms. While prior methods treat the problem
as a two-phase task (i.e., detecting hypernymy pairs followed by organizing
these pairs into a tree-structured hierarchy), we argue that such two-phase
methods may suffer from error propagation, and cannot effectively optimize
metrics that capture the holistic structure of a taxonomy. In our approach, the
representations of term pairs are learned using multiple sources of information
and used to determine which term to select and where to place
it on the taxonomy via a policy network. All components are trained in an
end-to-end manner with cumulative rewards, measured by a holistic tree metric
over the training taxonomies. Experiments on two public datasets of different
domains show that our approach outperforms prior state-of-the-art taxonomy
induction methods by up to 19.6% on ancestor F1.
Comment: 11 Pages. ACL 2018 Camera Read
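The "which term, where to place it" decision described in the abstract above can be illustrated with a minimal sketch. This is not the authors' method: `induce_taxonomy` and `pair_score` are hypothetical names, the root term is assumed to be given, and a greedy argmax over a fixed score table stands in for the policy network trained with cumulative rewards.

```python
from typing import Callable, Dict, List, Optional


def induce_taxonomy(terms: List[str],
                    pair_score: Callable[[str, str], float],
                    root: str) -> Dict[str, Optional[str]]:
    """Build a tree one term at a time. Each step jointly picks WHICH
    remaining term to attach and WHERE (under which placed term) by
    taking the highest-scoring (term, parent) pair."""
    parent_of: Dict[str, Optional[str]] = {root: None}
    remaining = [t for t in terms if t != root]
    while remaining:
        term, parent = max(
            ((t, p) for t in remaining for p in parent_of),
            key=lambda tp: pair_score(tp[0], tp[1]),
        )
        parent_of[term] = parent
        remaining.remove(term)
    return parent_of


# Toy (term, parent) scores; values are assumed for illustration only.
toy_pair_scores = {("dog", "animal"): 0.9,
                   ("poodle", "dog"): 0.8,
                   ("poodle", "animal"): 0.3}
tree = induce_taxonomy(
    ["animal", "dog", "poodle"],
    lambda term, parent: toy_pair_scores.get((term, parent), 0.0),
    root="animal",
)
```

Scoring (term, parent) pairs in a single step, rather than first detecting hypernymy pairs and then organizing them, is what lets a whole-tree reward shape each individual decision in the end-to-end setting the abstract describes.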