Query weighting for ranking model adaptation
We propose to directly measure the importance of queries in the source domain to the target domain, where no rank labels of documents are available, a step we refer to as query weighting. Query weighting is a key step in ranking model adaptation. Since the learning objective of ranking algorithms is organized by query instances, we argue that it is more reasonable to conduct importance weighting at the query level than at the document level. We present two query weighting schemes. The first compresses each query into a query feature vector that aggregates all document instances in the query, and then conducts query weighting based on this vector. This method estimates query importance efficiently, but the compression carries a risk of information loss. The second measures the similarity between a source query and each target query, and then combines these fine-grained similarity values into an importance estimate. Adaptation experiments on the LETOR 3.0 dataset demonstrate that query weighting significantly outperforms document-instance weighting methods.
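The first weighting scheme can be sketched as follows: aggregate each query's document feature vectors into one query feature vector, then estimate each source query's importance with a domain classifier whose odds approximate the target-to-source density ratio. The mean-plus-std aggregation and the logistic-regression density-ratio estimator below are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def query_feature_vector(doc_features):
    # Compress all document instances of one query into a single vector.
    # Mean + std aggregation is an assumed choice; the paper may use
    # different statistics.
    X = np.asarray(doc_features, dtype=float)
    return np.concatenate([X.mean(axis=0), X.std(axis=0)])

def query_importance_weights(source_queries, target_queries, n_iter=200, lr=0.1):
    # Estimate w(q) ~ p_target(q) / p_source(q) for each source query with a
    # logistic-regression domain classifier (a standard density-ratio trick).
    S = np.stack([query_feature_vector(q) for q in source_queries])
    T = np.stack([query_feature_vector(q) for q in target_queries])
    X = np.vstack([S, T])
    y = np.concatenate([np.zeros(len(S)), np.ones(len(T))])  # 1 = target domain
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        z = np.clip(X @ w + b, -30, 30)
        p = 1.0 / (1.0 + np.exp(-z))
        g = p - y                        # gradient of the logistic loss
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    z = np.clip(S @ w + b, -30, 30)
    p_s = 1.0 / (1.0 + np.exp(-z))
    return p_s / (1.0 - p_s)             # classifier odds = density-ratio estimate
```

Source queries whose aggregated features resemble the target queries receive higher weights, which can then rescale each query's contribution to the ranking loss.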
TransPrompt v2: A Transferable Prompting Framework for Cross-task Text Classification
Text classification is one of the most fundamental tasks in natural language
processing (NLP). Recent advances with pre-trained language models (PLMs) have
shown remarkable success on this task. However, the strong results obtained
by PLMs depend heavily on large amounts of task-specific labeled data,
which may not be feasible in many application scenarios due to data access and
privacy constraints. The recently-proposed prompt-based fine-tuning paradigm
improves the performance of PLMs for few-shot text classification with
task-specific templates. Yet, it is unclear how the prompting knowledge can be
transferred across tasks, for the purpose of mutual reinforcement. We propose
TransPrompt v2, a novel transferable prompting framework for few-shot learning
across similar or distant text classification tasks. For learning across
similar tasks, we employ a multi-task meta-knowledge acquisition (MMA)
procedure to train a meta-learner that captures the cross-task transferable
knowledge. For learning across distant tasks, we further inject the task type
descriptions into the prompt, and capture the intra-type and inter-type prompt
embeddings among multiple distant tasks. Additionally, two de-biasing
techniques are designed to make the trained meta-learner more task-agnostic
and unbiased toward any specific task. The meta-learner can then be adapted
to each task with a better parameter initialization.
Extensive experiments show that TransPrompt v2 outperforms single-task and
cross-task strong baselines over multiple NLP tasks and datasets. We further
show that the meta-learner can effectively improve the performance of PLMs on
previously unseen tasks. In addition, TransPrompt v2 outperforms strong
fine-tuning baselines when learning with full training sets.
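The prompt-based fine-tuning setup the abstract builds on can be illustrated with a cloze-style template into which a task-type description is injected, as in TransPrompt v2's distant-task prompts. The template wording and the label-word verbalizer below are illustrative assumptions, not the paper's exact prompt.

```python
MASK = "[MASK]"

# Illustrative label-word verbalizer; the actual mapping is task-specific.
VERBALIZER = {"positive": "great", "negative": "terrible"}

def build_prompt(text, task_desc,
                 template="[Task: {desc}] {text} Overall it was {mask}."):
    # Wrap an input in a cloze-style template and inject a task-type
    # description, in the spirit of cross-task prompting: the PLM predicts
    # the label word at the mask position instead of using a new classifier head.
    return template.format(desc=task_desc, text=text, mask=MASK)
```

For example, `build_prompt("The movie was a lot of fun.", "sentiment classification")` yields a masked sentence the PLM can fill with a verbalizer word such as "great", turning few-shot classification into masked-token prediction.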
Learning Vertex Representations for Bipartite Networks
Recent years have witnessed a widespread increase of interest in network
representation learning (NRL). So far, most research efforts have focused on NRL
for homogeneous networks like social networks where vertices are of the same
type, or heterogeneous networks like knowledge graphs where vertices (and/or
edges) are of different types. There has been relatively little research
dedicated to NRL for bipartite networks. Arguably, generic network embedding
methods like node2vec and LINE can also be applied to learn vertex embeddings
for bipartite networks by ignoring the vertex type information. However, these
methods are suboptimal in doing so, since real-world bipartite networks concern
the relationship between two types of entities, which usually exhibit different
properties and patterns from other types of network data. For example,
E-Commerce recommender systems need to capture the collaborative filtering
patterns between customers and products, and search engines need to consider
the matching signals between queries and webpages.
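A minimal type-aware baseline that, unlike node2vec or LINE, keeps the two vertex types separate can be sketched as a truncated SVD of the biadjacency matrix, yielding one embedding table per type. This is an illustrative sketch, not the model proposed in the paper.

```python
import numpy as np

def bipartite_embeddings(B, dim):
    # Learn one embedding table per vertex type from the biadjacency matrix B
    # (rows = one type, e.g. customers; columns = the other, e.g. products)
    # via truncated SVD, splitting the singular values between the two sides.
    U, s, Vt = np.linalg.svd(np.asarray(B, dtype=float), full_matrices=False)
    k = min(dim, len(s))
    scale = np.sqrt(s[:k])
    return U[:, :k] * scale, Vt[:k].T * scale
```

The inner product of a row-side and a column-side embedding approximates the corresponding edge weight, which is the collaborative-filtering signal a recommender needs; dedicated bipartite NRL methods refine this idea with richer objectives.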