3 research outputs found

    Serendipitous Exploration of Large-scale Product Catalogs

    Abstract: Online shopping has developed to a stage where catalogs have become very large and diverse. It is therefore a challenge to present relevant items to potential customers within very few interactions. This is even more so when users have no defined shopping objectives but operate in an opportunistic mindset. This problem is often tackled by recommender systems. However, these systems rely on consistent user interaction patterns to predict items of interest. In contrast, we propose to adapt the classical information retrieval (IR) paradigm for the purpose of accessing catalog items in a context of unpredictable user interaction. Accordingly, we present a novel information access strategy based on the notion of interest rather than relevance. We detail the design of a scalable browsing system that combines learning capabilities with a limited-memory model. Our approach enables locating interesting items within a few steps while not requiring good-quality descriptions. Our system allows customers to seamlessly change browsing objectives without having to explicitly start a new session. An evaluation of our approach based on both artificial and real-life datasets demonstrates its efficiency in learning and adaptation. I. MOTIVATION The emergence of online shopping has offered new opportunities to propose services and products to customers. Currently, many online shops are no longer restricted to a certain category of products. For example, Amazon, initially focused on cultural and entertainment media (books, music, and video), now offers products as diverse as home appliances and jewelry. Even more crucially, we usually find thousands of items within a product category, e.g. 38 million books and 3.5 million jewelry items on Amazon. Both the breadth of product lines and the depth within a product line not only boost the volume of the catalogs but also make it difficult for the customer to find products of interest without an accurate search protocol. 
Presenting relevant products to potential customers is the goal of recommender systems. Independent of their type (collaborative filtering systems, content-based recommenders, etc.), recommender systems usually operate on a user profile gained from previous shopping sessions. For this reason, recommender systems suffer from the cold-start problem when new users and/or new products appear. In contrast to the above, our approach does not require the definition of a user profile, nor does it impose specific search sessions with pre-defined objectives. In other words, we present an efficient product access strategy enabling intuitive browsing by estimating the user's intention from his/her input to the system and displaying items that are considered most interesting to him/her (and thus likely to be purchased). Our new information access strategy is based on the notion of current interest rather than on the notion of relevance classically used in Information Retrieval. (O1) We accommodate serendipity. We assume no pre-defined (fixed) objective of the user's chain of actions; (O2) The system matches classic (simple) interaction models; (O3) The system is scalable in terms of the volume of the product catalog. Our approach results in an interactive navigation system, which lets the user operate naturally over the product catalog while swiftly reacting to changes in the browsing objectives. The major difference from earlier approaches is a rapidly adapting system that copes with radical changes and scales to realistic product catalogs. The remainder of the paper is structured as follows: in section II, we discuss relevant approaches for information characterisation and content access strategies in large repositories. In section III, we present our interaction model, which describes the type of interaction that is expected from the user and what information is carried over with this interaction. 
We formalise our navigation model, anticipating functional issues in section IV. In particular, we review its properties ensuring scalability and compatibility with other models. In section V, we propose a comprehensive assessment of the performance of our model in an adaptive browsing scenario. At every browsing step, the system aims at displaying the most useful items to the user with respect to past interaction. Although our study includes an inherent temporal dimension, which makes the evaluation context different from that of classical searc

    Recuperação multimodal e interativa de informação orientada por diversidade (Diversity-oriented multimodal and interactive information retrieval)

    Advisor: Ricardo da Silva Torres. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.
Abstract: Information retrieval methods, especially those for multimedia data, have evolved towards integrating multiple sources of evidence when analyzing the relevance of items for a given user search task. In this context, to attenuate the semantic gap between low-level features extracted from the content of digital objects and high-level semantic concepts (objects, categories, etc.), and to make systems adaptive to different user needs, interactive models have brought the user closer to the retrieval loop, allowing user-system interaction mainly through implicit or explicit relevance feedback. Analogously, diversity promotion has emerged as an alternative for tackling ambiguous or underspecified queries. Additionally, several works have addressed the issue of minimizing the user effort required to provide relevance assessments while keeping an acceptable overall effectiveness. This thesis discusses, proposes, and experimentally analyzes multimodal and interactive diversity-oriented information retrieval methods. This work comprehensively covers the interactive information retrieval literature and also discusses recent advances, the major research challenges, and promising research opportunities. We have proposed and evaluated two relevance-diversity trade-off enhancement workflows, which integrate multiple kinds of information from images, such as visual features, textual metadata, geographic information, and user credibility descriptors. 
In turn, as an integration of interactive retrieval and diversity promotion techniques, aiming to maximize the coverage of multiple query interpretations/aspects and to speed up the information transfer between the user and the system, we have proposed and evaluated a multimodal learning-to-rank method trained with relevance feedback over diversified results. Our experimental analysis shows that the joint usage of multiple information sources positively impacted the relevance-diversity balancing algorithms. Our results also suggest that the integration of multimodal-relevance-based filtering and reranking was effective at improving result relevance and also boosted diversity promotion methods. 
Beyond that, with a thorough experimental analysis, we have investigated several research questions related to the possibility of improving result diversity while keeping or even improving relevance in interactive search sessions. Moreover, we analyze how much the diversification effort affects overall search session results and how different diversification approaches behave for the different data modalities. By analyzing the overall and per-feedback-iteration effectiveness, we show that introducing diversity may harm initial results, whereas it significantly enhances the overall session effectiveness, considering not only relevance and diversity but also how early the user is exposed to the same amount of relevant items and diversity.
Doctorate. Computer Science. Doctor in Computer Science. P-4388/2010; 140977/2012-0. CAPES; CNP
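
The relevance-diversity trade-off mentioned in this abstract can be illustrated with a greedy re-ranking in the style of maximal marginal relevance (MMR). This is a generic sketch and not the method of the thesis: the helper `mmr_rerank`, the trade-off weight `lam`, the toy item names, and the topic-based similarity are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def mmr_rerank(
    relevance: Dict[str, float],
    similarity: Callable[[str, str], float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedy MMR-style selection (hypothetical helper, not from the thesis).

    At each step, pick the item that maximizes
    lam * relevance - (1 - lam) * (max similarity to items already selected).
    """
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def score(item: str) -> float:
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1.0 - lam) * redundancy

        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


def same_topic(a: str, b: str) -> float:
    """Toy similarity: 1.0 if two item ids share the prefix before '_'."""
    return 1.0 if a.split("_")[0] == b.split("_")[0] else 0.0


# Toy relevance scores (made up for illustration).
relevance = {"beach_1": 0.9, "beach_2": 0.85, "city_1": 0.6}

diverse = mmr_rerank(relevance, same_topic, k=3, lam=0.5)
relevance_only = mmr_rerank(relevance, same_topic, k=3, lam=1.0)
```

With `lam=1.0` the ranking follows relevance alone. Lower values push near-duplicates of already selected items further down the list, which is one simple way to see why diversification can lower early relevance while improving coverage.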

    Combining Multimodal Preferences for Multimedia Information Retrieval

    Representing and fusing multimedia information is a key issue in discovering semantics in multimedia. In this paper we address more specifically the problem of multimedia content retrieval by first defining a novel preference-based representation particularly adapted to the fusion problem, and then investigating the RankBoost algorithm to combine those preferences and learn a multimodal retrieval model. The approach has been tested on annotated images and on the complete TRECVID 2005 corpus and compared with SVM-based fusion strategies. The results show that our approach equals SVM performance but, contrary to SVM, is parameter-free and faster.
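
The idea of combining per-modality preferences with RankBoost can be sketched as follows. This is a minimal illustration of the general RankBoost scheme with threshold weak rankers, not the paper's implementation: the names `rankboost` and `fused_score`, the number of `rounds`, and the toy two-modality scores are assumptions.

```python
import math
from typing import Dict, List, Tuple

Scores = Dict[str, List[float]]      # item id -> one score per modality
Pair = Tuple[str, str]               # (worse item, better item)
WeakRanker = Tuple[float, int, float]  # (alpha, modality index, threshold)


def rankboost(scores: Scores, pairs: List[Pair], rounds: int = 3) -> List[WeakRanker]:
    """Sketch of RankBoost with weak rankers h(x) = 1 if scores[x][m] > theta."""
    weights = {p: 1.0 / len(pairs) for p in pairs}
    n_modalities = len(next(iter(scores.values())))
    model: List[WeakRanker] = []
    for _ in range(rounds):
        best = None
        for m in range(n_modalities):
            for theta in sorted({s[m] for s in scores.values()}):
                # r measures how well this weak ranker agrees with the
                # weighted preference pairs (between -1 and 1).
                r = sum(
                    w * (float(scores[b][m] > theta) - float(scores[a][m] > theta))
                    for (a, b), w in weights.items()
                )
                if best is None or abs(r) > abs(best[0]):
                    best = (r, m, theta)
        r, m, theta = best
        r = max(min(r, 1.0 - 1e-9), -1.0 + 1e-9)  # keep alpha finite
        alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
        model.append((alpha, m, theta))
        # Down-weight pairs the weak ranker orders correctly, then normalize.
        updated = {
            (a, b): w * math.exp(
                alpha * (float(scores[a][m] > theta) - float(scores[b][m] > theta))
            )
            for (a, b), w in weights.items()
        }
        z = sum(updated.values())
        weights = {p: w / z for p, w in updated.items()}
    return model


def fused_score(model: List[WeakRanker], scores: Scores, item: str) -> float:
    """Final ranking score: weighted sum of the selected weak rankers."""
    return sum(alpha for alpha, m, theta in model if scores[item][m] > theta)


# Toy data: [visual score, text score] per item (made up for illustration).
scores = {"a": [0.9, 0.2], "b": [0.2, 0.9], "c": [0.3, 0.3], "d": [0.1, 0.1]}
# Preferences: items "a" and "b" should rank above "c" and "d".
pairs = [("c", "a"), ("c", "b"), ("d", "a"), ("d", "b")]

model = rankboost(scores, pairs, rounds=3)
ranking = sorted(scores, key=lambda x: fused_score(model, scores, x), reverse=True)
```

In this toy setting neither modality alone places both `a` and `b` above `c`, while the boosted combination of thresholds on both modalities does.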