FMMRec: Fairness-aware Multimodal Recommendation
Multimodal recommendation has recently gained increasing attention for
effectively addressing the data sparsity problem by incorporating
modality-based representations. Although multimodal recommendation excels in
accuracy, introducing different modalities (e.g., images, text, and
audio) may expose more of users' sensitive information (e.g., gender and age)
to recommender systems, resulting in potentially more serious unfairness issues.
Despite many efforts on fairness, existing fairness-aware methods are either
incompatible with multimodal scenarios or achieve suboptimal fairness
because they neglect the sensitive information carried by multimodal content. To
achieve counterfactual fairness in multimodal recommendation, we propose a
novel fairness-aware multimodal recommendation approach (dubbed FMMRec) that
disentangles sensitive and non-sensitive information in modal
representations and leverages the disentangled representations to guide
fairer representation learning. Specifically, we first disentangle biased and
filtered modal representations by maximizing and minimizing, respectively,
their ability to predict sensitive attributes. With the disentangled modal
representations, we mine modality-based unfair and fair (i.e., biased and
filtered) user-user structures and enhance the explicit user representation
with the biased and filtered neighbors from the corresponding structures,
followed by adversarially filtering out sensitive information.
Experiments on two real-world public datasets demonstrate the superiority of
FMMRec over state-of-the-art baselines. Our source code is
available at https://anonymous.4open.science/r/FMMRec.
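A minimal sketch of the disentanglement step described above, assuming a
PyTorch setting; this is illustrative rather than the authors'
implementation, and all module names, dimensions, and the use of gradient
reversal are assumptions. The biased encoder is trained so a probe can
predict the sensitive attribute from its output, while gradient reversal
drives the filtered encoder to remove that signal.

    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        # Identity on the forward pass; flips the gradient sign on backward,
        # so the encoder beneath it is trained to fool the attribute probe.
        @staticmethod
        def forward(ctx, x):
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -grad_output

    class ModalDisentangler(nn.Module):
        def __init__(self, modal_dim, hidden_dim, n_sensitive):
            super().__init__()
            self.biased_enc = nn.Sequential(nn.Linear(modal_dim, hidden_dim), nn.ReLU())
            self.filtered_enc = nn.Sequential(nn.Linear(modal_dim, hidden_dim), nn.ReLU())
            self.biased_probe = nn.Linear(hidden_dim, n_sensitive)
            self.filtered_probe = nn.Linear(hidden_dim, n_sensitive)

        def forward(self, modal_repr, sensitive_label):
            ce = nn.CrossEntropyLoss()
            biased = self.biased_enc(modal_repr)
            filtered = self.filtered_enc(modal_repr)
            # Biased branch: maximize sensitive-attribute predictability.
            loss_biased = ce(self.biased_probe(biased), sensitive_label)
            # Filtered branch: the probe still tries to predict the attribute,
            # but the reversed gradient pushes the encoder to erase it.
            loss_filtered = ce(self.filtered_probe(GradReverse.apply(filtered)),
                               sensitive_label)
            return biased, filtered, loss_biased + loss_filtered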
Investigating Speaker Embedding Disentanglement on Natural Read Speech
Disentanglement is the task of learning representations that identify and
separate factors that explain the variation observed in data. Disentangled
representations are useful to increase the generalizability, explainability,
and fairness of data-driven models. Little is known, however, about how well such
disentanglement works for speech representations. A major challenge when
tackling disentanglement for speech representations is that the generative
factors underlying the speech signal are unknown. In this work, we investigate to what
degree speech representations encoding speaker identity can be disentangled. To
quantify disentanglement, we identify acoustic features that are highly
speaker-variant and can serve as proxies for the factors of variation
underlying speech. We find that disentanglement of the speaker embedding is
limited when it is trained with standard disentanglement-promoting objectives,
but can be improved over vanilla representation learning to some extent.
Comment: To be published at the 15th ITG Conference on Speech Communication.
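As a rough illustration of how such disentanglement can be quantified, the
sketch below regresses a speaker-variant acoustic feature on speaker
embeddings with a linear probe; low predictability suggests the factor has
been removed from the embedding. This is not the paper's protocol: the data,
the feature choice (mean F0), and the embedding dimensionality are
placeholders.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    # Placeholder data: 500 utterances with 192-dim speaker embeddings
    # (e.g., x-vector-sized) and a hypothetical per-utterance mean F0.
    embeddings = rng.normal(size=(500, 192))
    mean_f0 = rng.normal(loc=150.0, scale=30.0, size=500)

    # Cross-validated R^2 of a linear probe: values near 0 indicate the
    # acoustic factor is largely absent from (disentangled from) the embedding.
    r2 = cross_val_score(Ridge(alpha=1.0), embeddings, mean_f0,
                         cv=5, scoring="r2")
    print(f"probe R^2 = {r2.mean():.3f}")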
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
The key idea behind the unsupervised learning of disentangled representations
is that real-world data is generated by a few explanatory factors of variation
which can be recovered by unsupervised learning algorithms. In this paper, we
provide a sober look at recent progress in the field and challenge some common
assumptions. We first theoretically show that the unsupervised learning of
disentangled representations is fundamentally impossible without inductive
biases on both the models and the data. Then, we train more than 12,000 models
covering most prominent methods and evaluation metrics in a reproducible
large-scale experimental study on seven different data sets. We observe that
while the different methods successfully enforce properties "encouraged" by
the corresponding losses, well-disentangled models seemingly cannot be
identified without supervision. Furthermore, increased disentanglement does not
seem to lead to a decreased sample complexity of learning for downstream tasks.
Our results suggest that future work on disentanglement learning should be
explicit about the role of inductive biases and (implicit) supervision,
investigate concrete benefits of enforcing disentanglement of the learned
representations, and consider a reproducible experimental setup covering
several data sets.
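One evaluation metric prominent in this line of work is the Mutual
Information Gap (MIG). Below is a minimal sketch, assuming discrete
ground-truth factors and latent codes discretized by histogram binning; the
binning scheme and shapes are illustrative, not the study's exact protocol.

    import numpy as np
    from sklearn.metrics import mutual_info_score

    def mig(latents, factors, n_bins=20):
        # latents: (N, L) continuous latent codes; factors: (N, K) discrete
        # ground-truth factors. Requires L >= 2 latent dimensions.
        binned = np.stack([np.digitize(z, np.histogram_bin_edges(z, n_bins))
                           for z in latents.T])  # (L, N) discretized codes
        gaps = []
        for k in range(factors.shape[1]):
            f = factors[:, k]
            mi = np.array([mutual_info_score(f, z) for z in binned])
            entropy = mutual_info_score(f, f)  # H(f), since I(X; X) = H(X)
            top, second = np.sort(mi)[-1], np.sort(mi)[-2]
            # Gap between the most and second-most informative latent
            # dimension, normalized by the factor's entropy.
            gaps.append((top - second) / entropy)
        return float(np.mean(gaps))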
COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation
Personalized text generation has broad industrial applications, such as
explanation generation for recommendations, conversational systems, etc.
Personalized text generators are usually trained on user-written text, e.g.,
reviews collected on e-commerce platforms. However, for historical, social,
or behavioral reasons, bias may exist that associates certain linguistic
qualities of user-written text with users' protected attributes such as
gender and race. Generators can identify and inherit these correlations
and generate text that discriminates with respect to users' protected attributes.
Without proper intervention, such bias can adversely affect users'
trust in and reliance on the system. From a broader perspective, bias in
auto-generated content can, through interaction with users, reinforce social
stereotypes about how online users write.
In this work, we investigate the fairness of personalized text generation in
the setting of explainable recommendation. We develop a general framework for
achieving measure-specific counterfactual fairness on the linguistic quality of
personalized explanations. We propose learning disentangled representations for
counterfactual inference and develop a novel policy learning algorithm with
carefully designed rewards for fairness optimization. The framework can be
applied to achieve fairness on any given specification of linguistic
quality measures, and can be adapted to most existing models and real-world
settings. Extensive experiments demonstrate our method's superior ability to
achieve fairness while maintaining high generation performance.
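A minimal sketch of what a measure-specific counterfactual-fairness reward
could look like in this spirit; it is not COFFEE's actual objective. Here
quality_factual and quality_counterfactual stand for any linguistic-quality
measure (e.g., an automatic fluency score) computed on explanations
generated for a user and for that user's counterfactual with the protected
attribute flipped, and lam is a hypothetical trade-off weight.

    def fairness_reward(quality_factual: float,
                        quality_counterfactual: float,
                        task_reward: float,
                        lam: float = 1.0) -> float:
        # Task reward minus a penalty on the quality gap between the
        # factual and counterfactual generations: a zero gap means the
        # measure is counterfactually fair for this user.
        gap = abs(quality_factual - quality_counterfactual)
        return task_reward - lam * gap

    # Identical quality in both worlds incurs no penalty:
    print(fairness_reward(0.82, 0.82, task_reward=1.0))  # -> 1.0
    # A 0.3 quality gap is penalized:
    print(fairness_reward(0.90, 0.60, task_reward=1.0))  # -> 0.7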