75 research outputs found
Cross-Domain Labeled LDA for Cross-Domain Text Classification
Cross-domain text classification aims to build a classifier for a target
domain by leveraging data from both the source and the target domain. One
promising idea is to minimize the difference between the feature distributions
of the two domains. Most existing studies minimize this difference explicitly
through an exact alignment mechanism (one-to-one feature alignment, a
projection matrix, etc.). Such exact alignment, however, restricts the model's
learning ability and further impairs its classification performance when the
semantic distributions of the two domains differ substantially. To address this
problem, we propose a novel group alignment that aligns semantics at the group
level. In addition, to help the model learn better semantic groups and the
semantics within these groups, we also propose a partial supervision scheme for
the model's learning in the source domain. To this end, we embed the group
alignment and the partial supervision into a cross-domain topic model and
propose Cross-Domain Labeled LDA (CDL-LDA). On the standard 20Newsgroups and
Reuters datasets, extensive quantitative (classification, perplexity, etc.) and
qualitative (topic detection) experiments demonstrate the effectiveness of the
proposed group alignment and partial supervision.
Comment: ICDM 201
Exploring the Confounding Factors of Academic Career Success: An Empirical Study with Deep Predictive Modeling
Understanding the determinants of success in academic careers is critically
important to both scholars and their employing organizations. While
considerable research effort has been devoted to this question, a quantitative
approach to modeling scholars' academic careers is still lacking because of the
many confounding factors involved. To this end, in this paper we explore the
determinants of academic career success from an empirical and predictive
modeling perspective, focusing on two representative academic honors, i.e.,
IEEE Fellow and ACM Fellow. We quantitatively analyze the importance of
different factors and obtain several insightful findings. Specifically, we
analyze the co-author network and find that prospective Fellows collaborate
closely with influential scholars early in their careers and even more closely
as their careers progress. We then compare the academic performance of male and
female Fellows and find that female scholars need to put in more effort than
their male counterparts to be elected. In addition, we find that being elected
a Fellow does not by itself bring improvements in citations or productivity
growth. We hope these derived factors and findings can help scholars improve
their competitiveness and develop their academic careers.
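A toy sketch of one measurement suggested by the co-author-network analysis above: how often a scholar's papers involve already-influential co-authors at successive career stages. The papers list, the influential set, and the five-year stage length are made-up placeholders; the paper's actual features and deep predictive model are far richer and are not reproduced here.

from collections import defaultdict

# (year, authors) tuples for one focal scholar's papers; "influential" is an
# assumed precomputed set (e.g., scholars who were already Fellows at the time).
papers = [
    (2005, {"focal", "a"}),
    (2006, {"focal", "fellow_x"}),
    (2010, {"focal", "fellow_x", "fellow_y"}),
    (2012, {"focal", "b", "fellow_y"}),
]
influential = {"fellow_x", "fellow_y"}

def closeness_by_stage(papers, influential, career_start, stage_len=5):
    """Fraction of papers co-authored with at least one influential scholar, per career stage."""
    counts = defaultdict(lambda: [0, 0])      # stage -> [with_influential, total]
    for year, authors in papers:
        stage = (year - career_start) // stage_len
        counts[stage][1] += 1
        if authors & influential:
            counts[stage][0] += 1
    return {stage: hit / total for stage, (hit, total) in sorted(counts.items())}

print(closeness_by_stage(papers, influential, career_start=2005))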
DPR: An Algorithm to Mitigate Bias Accumulation in Recommendation Feedback Loops
Recommendation models trained on the user feedback collected from deployed
recommendation systems are commonly biased. User feedback is considerably
affected by the exposure mechanism, as users only provide feedback on the items
exposed to them and passively ignore the unexposed items, thus producing
numerous false negative samples. Inevitably, biases caused by such user
feedback are inherited by new models and amplified via feedback loops.
Moreover, the presence of false negative samples makes negative sampling
difficult and introduces spurious information in the user preference modeling
process of the model. Recent work has investigated the negative impact of
feedback loops and unknown exposure mechanisms on recommendation quality and
user experience, essentially treating them as independent factors and ignoring
their cross-effects. To address these issues, we analyze the data exposure
mechanism in depth from the perspective of data iteration and feedback loops
under the Missing Not At Random (MNAR) assumption, and show theoretically that
a usable stabilization factor exists in how the exposure mechanism evolves
under feedback loops. We further propose Dynamic Personalized Ranking (DPR), an
unbiased algorithm that uses dynamic re-weighting to mitigate the cross-effects
of exposure mechanisms and feedback loops without requiring additional
information. Furthermore, we design a plugin named Universal Anti-False
Negative (UFN) to mitigate the
negative impact of the false negative problem. We demonstrate theoretically
that our approach mitigates the negative effects of feedback loops and unknown
exposure mechanisms. Experimental results on real-world datasets demonstrate
that models using DPR handle bias accumulation better, and confirm the
universality of UFN across mainstream loss functions.
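The abstract does not spell out DPR's re-weighting, so the sketch below shows a generic inverse-propensity-style re-weighting of a BPR pairwise loss: item popularity stands in as a crude exposure-propensity estimate, the positive item is up-weighted by its inverse propensity, and negatives that were unlikely to be exposed (and may therefore be false negatives) are down-weighted in the spirit of UFN. This is an illustration of exposure-aware re-weighting under assumed weights, not the authors' algorithm.

import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 50, 200, 16
U = rng.normal(scale=0.1, size=(n_users, dim))   # user embeddings
V = rng.normal(scale=0.1, size=(n_items, dim))   # item embeddings

item_pop = rng.pareto(1.5, size=n_items) + 1e-3
propensity = item_pop / item_pop.max()           # crude exposure estimate in (0, 1]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def weighted_bpr_loss(u, pos, neg):
    """Propensity-weighted BPR loss for one (user, positive, sampled negative) triple."""
    x = U[u] @ (V[pos] - V[neg])
    w_pos = 1.0 / propensity[pos]     # counteract over-representation of popular positives
    w_neg = propensity[neg]           # down-weight negatives the user likely never saw
    return -w_pos * w_neg * np.log(sigmoid(x) + 1e-12)

u, pos, neg = 3, 17, 120
print("weighted loss:", weighted_bpr_loss(u, pos, neg))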
- …