Reproducibility Analysis and Enhancements for Multi-Aspect Dense Retriever with Aspect Learning
Multi-aspect dense retrieval aims to incorporate aspect information (e.g.,
brand and category) into dual encoders to facilitate relevance matching. As an
early and representative multi-aspect dense retriever, MADRAL learns several
extra aspect embeddings and fuses the explicit aspects with an implicit aspect
"OTHER" for final representation. MADRAL was evaluated on proprietary data and
its code was not released, making it challenging to validate its effectiveness
on other datasets. We failed to reproduce its effectiveness on the public
MA-Amazon data, motivating us to probe the reasons and re-examine its
components. We propose several component alternatives for comparisons,
including replacing "OTHER" with "CLS" and representing aspects with the first
several content tokens. Through extensive experiments, we confirm that learning
"OTHER" from scratch in aspect fusion is harmful. In contrast, our proposed
variants can greatly enhance the retrieval performance. Our research not only
sheds light on the limitations of MADRAL but also provides valuable insights
for future studies on more powerful multi-aspect dense retrieval models. Code
will be released at:
https://github.com/sunxiaojie99/Reproducibility-for-MADRAL.
Comment: accepted by ECIR 2024 as a reproducibility paper
Feature-Enhanced Network with Hybrid Debiasing Strategies for Unbiased Learning to Rank
Unbiased learning to rank (ULTR) aims to mitigate various biases existing in
user clicks, such as position bias, trust bias, and presentation bias, and to learn an
effective ranker. In this paper, we introduce our winning approach for the
"Unbiased Learning to Rank" task in WSDM Cup 2023. We find that the provided
data is severely biased, so neural models trained directly on the top-10
results with click information perform unsatisfactorily. We therefore extract multiple
heuristic-based features from multiple fields of the results, adjust the click
labels, add true negatives, and re-weight the samples during model training.
Since the propensities learned by existing ULTR methods are not decreasing
w.r.t. positions, we also calibrate the propensities according to the click
ratios and ensemble the models trained in two different ways. Our method won
the 3rd prize with a DCG@10 score of 9.80, which is 1.1% worse than the 2nd and
25.3% higher than the 4th.
Comment: 5 pages, 1 figure, WSDM Cup 202
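For readers unfamiliar with the DCG@10 metric quoted in the abstract above, the following is a minimal sketch of how a DCG@k score is commonly computed. The abstract does not state which gain function the competition used, so the linear gain default, the `exponential_gain` switch, and the function name `dcg_at_k` are illustrative assumptions, not the organizers' evaluation code.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10,
             exponential_gain: bool = False) -> float:
    """Discounted cumulative gain over the top-k ranked results.

    `relevances` holds graded relevance labels in ranked order (best rank first).
    Assumption: linear gain (rel) by default; set exponential_gain=True to use
    the other common variant, 2**rel - 1.
    """
    total = 0.0
    for rank, rel in enumerate(relevances[:k], start=1):
        gain = (2.0 ** rel - 1.0) if exponential_gain else float(rel)
        # The log2(rank + 1) discount leaves rank 1 undiscounted (log2(2) == 1).
        total += gain / math.log2(rank + 1)
    return total


if __name__ == "__main__":
    # Hypothetical graded labels for one query's ranked list.
    graded = [3, 2, 3, 0, 1, 2]
    print(round(dcg_at_k(graded, k=10), 3))
```

Because the score is a sum of position-discounted gains, it is unnormalized: its scale depends on the label range and on how many relevant results each query has, which is why values such as 9.80 are only comparable within the same dataset and labeling scheme.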
Pre-training with Aspect-Content Text Mutual Prediction for Multi-Aspect Dense Retrieval
Grounded on pre-trained language models (PLMs), dense retrieval has been
studied extensively on plain text. In contrast, there has been little research
on retrieving data with multiple aspects using dense models. In scenarios
such as product search, aspect information plays an essential role in
relevance matching, e.g., category: Electronics, Computers, and Pet Supplies. A
common way of leveraging aspect information for multi-aspect retrieval is to
introduce an auxiliary classification objective, i.e., using item contents to
predict the annotated value IDs of item aspects. However, by learning the value
embeddings from scratch, this approach may not capture the various semantic
similarities between the values sufficiently. To address this limitation, we
leverage the aspect information as text strings rather than class IDs during
pre-training so that their semantic similarities can be naturally captured in
the PLMs. To facilitate effective retrieval with the aspect strings, we propose
mutual prediction objectives between the text of the item aspect and content.
In this way, our model makes fuller use of aspect information than
conducting undifferentiated masked language modeling (MLM) on the concatenated
text of aspects and content. Extensive experiments on two real-world datasets
(product and mini-program search) show that our approach outperforms
competitive baselines that either treat aspect values as classes or conduct the
same MLM for aspect and content strings. Code and the related dataset will be
available at https://github.com/sunxiaojie99/ATTEMPT.
Comment: accepted by cikm202
A Multi-Granularity-Aware Aspect Learning Model for Multi-Aspect Dense Retrieval
Dense retrieval methods have mostly focused on unstructured text, and
less attention has been paid to structured data with various aspects, e.g.,
products with aspects such as category and brand. Recent work has proposed two
approaches to incorporate the aspect information into item representations for
effective retrieval by predicting the values associated with the item aspects.
Despite their efficacy, they treat the values as isolated classes (e.g., "Smart
Homes", "Home, Garden & Tools", and "Beauty & Health") and ignore their
fine-grained semantic relation. Furthermore, they either force aspect learning
into the CLS token, which could interfere with its designated use
for representing the entire content semantics, or learn extra aspect embeddings
only with the value prediction objective, which could be insufficient,
especially when there are no annotated values for an item aspect. Aware of
these limitations, we propose a MUlti-granulaRity-aware Aspect Learning model
(MURAL) for multi-aspect dense retrieval. It leverages aspect information
across various granularities to capture both coarse and fine-grained semantic
relations between values. Moreover, MURAL incorporates separate aspect
embeddings as input to transformer encoders so that the masked language model
objective can assist implicit aspect learning even without aspect-value
annotations. Extensive experiments on two real-world datasets of products and
mini-programs show that MURAL outperforms state-of-the-art baselines
significantly.
Comment: Accepted by WSDM 2024, updat
Facilitating Interaction with Large Displays in Smart Spaces
Large displays are now widely deployed in Smart Spaces. However, traditional interaction devices designed for desktop screens, such as mice and keyboards, have various limitations in such environments. In this paper, we present a novel human-computer interaction system, the CollabPointer, for facilitating interaction with large displays in Smart Spaces. A laser pointer integrated with three additional buttons and wireless communication modules is introduced as the input device, and three features distinguish the CollabPointer from other interaction technologies. First, the coordinates of the red laser point emitted onto the screen are interpreted as the cursor’s position, and the additional buttons wirelessly emulate a mouse’s buttons over radio frequency, which enables remote interaction at any distance. Second, when multiple users are interacting, the two-step association method described in this paper lets the system identify different laser pointers and support multi-user collaboration. Third, the laser pointer transmits its identity over radio frequency during interaction; the system receives it and treats different users separately. The CollabPointer has been implemented in the Smart Classroom [1], a prototype Smart Space, and the results of user studies show its benefit.
uPen: Laser-based, Personalized, Multi-User Interaction on Large Displays
We present the uPen, a laser pointer combined with a contact-pushed switch, three press buttons, and a wireless communication module. This novel interaction device allows users to interact with large displays at a distance or directly on the surface with the full functionality of a mouse. Onboard software enables the uPen system to identify different users and provide personalized services to them, such as associating users with corresponding privileges and giving access to each participant’s private content (e.g., home pages, personal calendars). Additionally, with our two-step association method, the uPen system can distinguish the strokes of different uPens working simultaneously and support simultaneous multi-user interaction. A prototype system has been implemented in our Smart Classroom [1], and user studies show the benefit of using it.
Categories and Subject Descriptors: H.5.2 [Information Interfaces and Presentation]: User Interfaces - interaction styles, input devices and strategies, theory and methods