Learning Optimal Card Ranking from Query Reformulation
Mobile search has recently been shown to be the major contributor to the
growing search market. The key difference between mobile search and desktop
search is that information presentation is limited to the screen space of the
mobile device. Thus, major search engines have adopted a new type of search
result presentation, known as information cards, in which each card
presents summarized results from one domain/vertical, for a given query, to
augment the standard blue-links search results. While it has been widely
acknowledged that information cards are particularly suited to mobile user
experience, it is also challenging to optimize such result sets. Typically,
user engagement metrics like query reformulation are based on the whole ranked list
of cards for each query and most traditional learning to rank algorithms
require per-item relevance labels. In this paper, we investigate the
possibility of interpreting query reformulation into effective relevance labels
for query-card pairs. We inherit the concept of conventional learning-to-rank,
and propose pointwise, pairwise and listwise interpretations for query
reformulation. In addition, we propose a learning-to-label strategy that learns
the contribution of each card, with respect to a query, where such
contributions can be used as labels for training card ranking models. We
utilize a state-of-the-art ranking model and demonstrate the effectiveness of
the proposed mechanisms on large-scale mobile data from a major search engine,
showing that models trained from labels derived from user engagement can
significantly outperform ones trained from human judgment labels.
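
As a rough illustration of the idea of turning a list-level reformulation signal into per-card labels, the sketch below shows one possible pointwise and one possible pairwise interpretation. The abstract does not define these interpretations, so the function names, the 0/1 label values, and the top-card comparison rule are all assumptions made for illustration, not the paper's method.

```python
from typing import List, Optional, Tuple


def pointwise_labels(cards: List[str], reformulated: bool) -> List[Tuple[str, float]]:
    """Assumed pointwise reading: every card in the list inherits the
    list-level outcome (1.0 if the user did not reformulate, else 0.0)."""
    label = 0.0 if reformulated else 1.0
    return [(card, label) for card in cards]


def pairwise_preference(
    list_a: List[str],
    reformulated_a: bool,
    list_b: List[str],
    reformulated_b: bool,
) -> Optional[Tuple[str, str]]:
    """Assumed pairwise reading: for two rankings shown for the same query,
    prefer the top card of the ranking that did not trigger a reformulation.
    Returns (preferred_card, other_card), or None if no preference follows."""
    if reformulated_a == reformulated_b or not list_a or not list_b:
        return None  # same outcome (or empty list): no evidence either way
    good, bad = (list_a, list_b) if not reformulated_a else (list_b, list_a)
    if good[0] == bad[0]:
        return None  # same top card: the difference lies elsewhere
    return (good[0], bad[0])
```

For example, `pointwise_labels(["weather", "news"], reformulated=True)` assigns 0.0 to both cards, which shows why a pointwise reading is coarse: it cannot tell which card caused the reformulation.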
CARL: Aggregated Search with Context-Aware Module Embedding Learning
Aggregated search aims to construct search result pages (SERPs) from
blue-links and heterogeneous modules (such as news, images, and videos).
Existing studies have largely ignored the correlations between blue-links and
heterogeneous modules when selecting the heterogeneous modules to be presented.
We observe that the top ranked blue-links, which we refer to as the
context, can provide important information about query intent and help
identify the relevant heterogeneous modules. For example, informative terms
like "streamed" and "recorded" in the context imply that a video module may
better satisfy the query. To model and utilize the context information for
aggregated search, we propose a model with context attention and representation
learning (CARL). Our model applies a recurrent neural network with an attention
mechanism to encode the context, and incorporates the encoded context
information into module embeddings. The context-aware module embeddings
together with the ranking policy are jointly optimized under the Markov
decision process (MDP) formulation. To achieve more effective joint learning,
we further propose an optimization function with self-supervision loss to
provide auxiliary supervision signals. Experimental results based on two public
datasets demonstrate the superiority of CARL over multiple baseline approaches,
and confirm the effectiveness of the proposed optimization function in boosting
the joint learning process.
Comment: IJCNN201
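
A minimal sketch of how attended context could be folded into a module embedding is given below. It is a simplification under stated assumptions: the recurrent encoder is omitted, plain dot-product attention stands in for the paper's attention mechanism, and the context is combined with the module embedding by simple addition. The name `context_aware_embedding` and the vector layout are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def context_aware_embedding(module: Vector, context: List[Vector]) -> Vector:
    """Weight each context vector (one per top-ranked blue-link) by its
    dot-product similarity to the module embedding, pool the context with
    those weights, and add the pooled vector to the module embedding."""
    weights = softmax([dot(module, c) for c in context])
    pooled = [
        sum(w * c[i] for w, c in zip(weights, context))
        for i in range(len(module))
    ]
    return [m + p for m, p in zip(module, pooled)]
```

In the paper's setting these embeddings are learned jointly with the ranking policy; the sketch only shows the forward computation for fixed vectors.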
A Unified Search Federation System Based on Online User Feedback
Today’s popular web search engines expand the search process beyond crawled web pages to specialized corpora (“verticals”) such as images, videos, news, local, sports, finance, and shopping, each with its own specialized search engine. Search federation deals with selecting the search engines to query and merging their results into a single result set. Despite a few recent advances, the problem is still very challenging. First, due to the heterogeneous nature of different verticals, how the system merges the vertical results with the web documents to serve the user’s information need is still an open problem. Moreover, the scale of the search engine and the increasing number of vertical properties require a solution that is efficient and scalable. In this paper, we propose a unified framework for the searc