48,274 research outputs found

    Multi-Task Learning for Email Search Ranking with Auxiliary Query Clustering

    Full text link
    User information needs vary significantly across different tasks, and therefore their queries will also differ considerably in their expressiveness and semantics. Many studies have been proposed to model such query diversity by obtaining query types and building query-dependent ranking models. These studies typically require either a labeled query dataset or clicks from multiple users aggregated over the same document. These techniques, however, are not applicable when manual query labeling is not viable, and aggregated clicks are unavailable due to the private nature of the document collection, e.g., in email search scenarios. In this paper, we study how to obtain query type in an unsupervised fashion and how to incorporate this information into query-dependent ranking models. We first develop a hierarchical clustering algorithm based on truncated SVD and varimax rotation to obtain coarse-to-fine query types. Then, we study three query-dependent ranking models, including two neural models that leverage query type information as additional features, and one novel multi-task neural model that views query type as the label for the auxiliary query cluster prediction task. This multi-task model is trained to simultaneously rank documents and predict query types. Our experiments on tens of millions of real-world email search queries demonstrate that the proposed multi-task model can significantly outperform the baseline neural ranking models, which either do not incorporate query type information or just simply feed query type as an additional feature.Comment: CIKM 201

    Relation Discovery from Web Data for Competency Management

    Get PDF
    This paper describes a technique for automatically discovering associations between people and expertise from an analysis of very large data sources (including web pages, blogs and emails), using a family of algorithms that perform accurate named-entity recognition, assign different weights to terms according to an analysis of document structure, and access distances between terms in a document. My contribution is to add a social networking approach called BuddyFinder which relies on associations within a large enterprise-wide "buddy list" to help delimit the search space and also to provide a form of 'social triangulation' whereby the system can discover documents from your colleagues that contain pertinent information about you. This work has been influential in the information retrieval community generally, as it is the basis of a landmark system that achieved overall first place in every category in the Enterprise Search Track of TREC2006

    ImageSieve: Exploratory search of museum archives with named entity-based faceted browsing

    Get PDF
    Over the last few years, faceted search emerged as an attractive alternative to the traditional "text box" search and has become one of the standard ways of interaction on many e-commerce sites. However, these applications of faceted search are limited to domains where the objects of interests have already been classified along several independent dimensions, such as price, year, or brand. While automatic approaches to generate faceted search interfaces were proposed, it is not yet clear to what extent the automatically-produced interfaces will be useful to real users, and whether their quality can match or surpass their manually-produced predecessors. The goal of this paper is to introduce an exploratory search interface called ImageSieve, which shares many features with traditional faceted browsing, but can function without the use of traditional faceted metadata. ImageSieve uses automatically extracted and classified named entities, which play important roles in many domains (such as news collections, image archives, etc.). We describe one specific application of ImageSieve for image search. Here, named entities extracted from the descriptions of the retrieved images are used to organize a faceted browsing interface, which then helps users to make sense of and further explore the retrieved images. The results of a user study of ImageSieve demonstrate that a faceted search system based on named entities can help users explore large collections and find relevant information more effectively
    corecore