2,505 research outputs found

    Modelling Efficient Novelty-based Search Result Diversification in Metric Spaces

    Novelty-based diversification provides a way to tackle ambiguous queries by re-ranking a set of retrieved documents. Current approaches are typically greedy, requiring O(n²) document-document comparisons in order to diversify a ranking of n documents. In this article, we introduce a new approach for novelty-based search result diversification to reduce the overhead incurred by document-document comparisons. To this end, we model novelty promotion as a similarity search in a metric space, exploiting the properties of this space to efficiently identify novel documents. We investigate three different approaches: pivoting-based, clustering-based, and permutation-based. In the first two, a novel document is one that lies outside the range of a pivot or outside a cluster. In the last, a novel document is one that has a different signature (i.e., the document's relative distance to a distinguished set of fixed objects called permutants) compared to previously selected documents. Thorough experiments using two TREC test collections for diversity evaluation, as well as a large sample of the query stream of a commercial search engine, show that our approaches perform at least as effectively as well-known novelty-based diversification approaches in the literature, while dramatically improving their efficiency.
    Affiliations: Gil Costa, Graciela Verónica (Yahoo, Mexico; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico San Luis, Argentina); Santos, Rodrygo L. T. (University of Glasgow, United Kingdom); Macdonald, Craig (University of Glasgow, United Kingdom); Ounis, Iadh (University of Glasgow, United Kingdom).
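The permutation-based idea in the abstract above can be illustrated with a rough sketch. It assumes documents are plain numeric vectors, uses Euclidean distance, and treats a document as novel only when its signature is not identical to one already seen; these choices, and the names `signature` and `diversify_by_signature`, are illustrative assumptions rather than the authors' method. Each document is compared against the permutants only, so one pass costs on the order of n times the number of permutants instead of n² document-document comparisons.

```python
import math

def signature(doc, permutants):
    """Permutant indices ordered from nearest to farthest from doc."""
    return tuple(
        sorted(range(len(permutants)), key=lambda i: math.dist(doc, permutants[i]))
    )

def diversify_by_signature(ranked_docs, permutants, k):
    """Walk the ranking once and keep up to k documents with unseen signatures.

    Exact signature matching is a simplifying assumption for this sketch.
    """
    seen = set()
    selected = []
    for idx, doc in enumerate(ranked_docs):
        sig = signature(doc, permutants)
        if sig in seen:
            continue  # same signature as an earlier pick: treated as redundant
        seen.add(sig)
        selected.append(idx)
        if len(selected) == k:
            break
    return selected

# Hypothetical toy data: two permutants and four ranked documents.
permutants = [(0.0, 0.0), (10.0, 0.0)]
ranked_docs = [(1.0, 0.0), (2.0, 0.0), (9.0, 0.0), (8.0, 1.0)]
picked = diversify_by_signature(ranked_docs, permutants, k=3)
```

In the toy data the first two documents share one signature and the last two share another, so only the first document of each group is kept.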

    Sparse spatial selection for novelty-based search result diversification

    Novelty-based diversification approaches aim to produce a diverse ranking by directly comparing the retrieved documents. However, since such approaches are typically greedy, they require O(n²) document-document comparisons in order to diversify a ranking of n documents. In this work, we propose to model novelty-based diversification as a similarity search in a sparse metric space. In particular, we exploit the triangle inequality property of metric spaces in order to drastically reduce the number of required document-document comparisons. Thorough experiments using three TREC test collections show that our approach is at least as effective as existing novelty-based diversification approaches, while improving their efficiency by an order of magnitude.
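How the triangle inequality can save comparisons may be easier to see in a small sketch. It assumes vector documents, Euclidean distance, a single reference pivot (here simply the top-ranked document) and a fixed novelty radius; all of these, along with the name `select_novel`, are assumptions made for illustration and not the selection procedure evaluated in the paper. Since dist(x, s) >= |dist(x, p) - dist(s, p)| for any pivot p, a document whose precomputed pivot distance differs from that of a selected document by more than the radius is provably far from it, and the direct comparison can be skipped.

```python
import math

def select_novel(docs, radius):
    """Greedy novelty selection with pivot-based pruning.

    Returns the indices of selected documents and the number of direct
    document-document distance evaluations that could not be pruned.
    """
    pivot = docs[0]  # assumption: the top-ranked document serves as the pivot
    to_pivot = [math.dist(d, pivot) for d in docs]  # computed once per document
    selected = []
    comparisons = 0
    for i, doc in enumerate(docs):
        novel = True
        for j in selected:
            # Lower bound from the triangle inequality; if it already exceeds
            # the radius, doc cannot be within the radius of docs[j].
            if abs(to_pivot[i] - to_pivot[j]) > radius:
                continue
            comparisons += 1
            if math.dist(doc, docs[j]) <= radius:
                novel = False
                break
        if novel:
            selected.append(i)
    return selected, comparisons

# Hypothetical toy ranking on a line: three well-separated groups.
docs = [(0.0, 0.0), (0.5, 0.0), (10.0, 0.0), (10.2, 0.0), (20.0, 0.0)]
selected, comparisons = select_novel(docs, radius=1.0)
```

On the toy ranking only two direct comparisons survive pruning, against five candidate pairs that an unpruned greedy pass would have had to check.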

    Explicit web search result diversification

    Queries submitted to a web search engine are typically short and often ambiguous. With the enormous size of the Web, a misunderstanding of the information need underlying an ambiguous query can misguide the search engine, ultimately leading the user to abandon the originally submitted query. In order to overcome this problem, a sensible approach is to diversify the documents retrieved for the user's query. As a result, the likelihood that at least one of these documents will satisfy the user's actual information need is increased. In this thesis, we argue that an ambiguous query should be seen as representing not one, but multiple information needs. Based upon this premise, we propose xQuAD (Explicit Query Aspect Diversification), a novel probabilistic framework for search result diversification. In particular, the xQuAD framework naturally models several dimensions of the search result diversification problem in a principled yet practical manner. To this end, the framework represents the possible information needs underlying a query as a set of keyword-based sub-queries. Moreover, xQuAD accounts for the overall coverage of each retrieved document with respect to the identified sub-queries, so as to rank highly diverse documents first. In addition, it accounts for how well each sub-query is covered by the other retrieved documents, so as to promote novelty, and hence penalise redundancy, in the ranking. The framework also models the importance of each of the identified sub-queries, so as to appropriately cater for the interests of the user population when diversifying the retrieved documents. Finally, since not all queries are equally ambiguous, the xQuAD framework caters for the ambiguity level of different queries, so as to appropriately trade off relevance for diversity on a per-query basis. The xQuAD framework is general and can be used to instantiate several diversification models, including the most prominent models described in the literature.
In particular, within xQuAD, each of the aforementioned dimensions of the search result diversification problem can be tackled in a variety of ways. In this thesis, as additional contributions besides the xQuAD framework, we introduce novel machine learning approaches for addressing each of these dimensions. These include a learning to rank approach for identifying effective sub-queries as query suggestions mined from a query log, an intent-aware approach for choosing the ranking models most likely to be effective for estimating the coverage and novelty of multiple documents with respect to a sub-query, and a selective approach for automatically predicting how much to diversify the documents retrieved for each individual query. In addition, we perform the first empirical analysis of the role of novelty as a diversification strategy for web search. As demonstrated throughout this thesis, the principles underlying the xQuAD framework are general, sound, and effective. In particular, to validate the contributions of this thesis, we thoroughly assess the effectiveness of xQuAD under the standard experimentation paradigm provided by the diversity task of the TREC 2009, 2010, and 2011 Web tracks. The results of this investigation demonstrate the effectiveness of our proposed framework. Indeed, xQuAD attains consistent and significant improvements in comparison to the most effective diversification approaches in the literature, and across a range of experimental conditions, comprising multiple input rankings, multiple sub-query generation and coverage estimation mechanisms, as well as queries with multiple levels of ambiguity. Altogether, these results corroborate the state-of-the-art diversification performance of xQuAD
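The dimensions listed in the abstract (relevance, sub-query importance, coverage, novelty, and a relevance/diversity trade-off) can be combined in a greedy re-ranker along the following lines. The abstract gives no formula, so the scoring used here, (1 - lam) * P(d|q) + lam * sum over sub-queries of P(s|q) * P(d|s) * product over selected d_j of (1 - P(d_j|s)), is the commonly cited xQuAD form and should be read as an assumption. The function name, argument layout and toy probabilities are hypothetical.

```python
def xquad_rerank(relevance, importance, coverage, lam=0.5, k=10):
    """Greedy diversification sketch.

    relevance:  dict doc -> P(d|q)
    importance: dict sub-query -> P(s|q)
    coverage:   dict (doc, sub-query) -> P(d|s); missing pairs count as 0
    lam:        trade-off between relevance (0.0) and diversity (1.0)
    """
    selected = []
    # For each sub-query, the probability that no selected document covers it.
    uncovered = {s: 1.0 for s in importance}
    candidates = sorted(relevance)  # sorted for deterministic tie-breaking

    def score(doc):
        diversity = sum(
            importance[s] * coverage.get((doc, s), 0.0) * uncovered[s]
            for s in importance
        )
        return (1 - lam) * relevance[doc] + lam * diversity

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
        for s in importance:
            uncovered[s] *= 1.0 - coverage.get((best, s), 0.0)
    return selected

# Hypothetical toy query with two equally important sub-queries.
relevance = {"a": 0.9, "b": 0.8, "c": 0.5}
importance = {"s1": 0.5, "s2": 0.5}
coverage = {("a", "s1"): 0.9, ("b", "s1"): 0.9, ("c", "s2"): 0.9}
diversified = xquad_rerank(relevance, importance, coverage, lam=0.5, k=3)
relevance_only = xquad_rerank(relevance, importance, coverage, lam=0.0, k=3)
```

In the toy query, document "c" overtakes "b" once "a" has been selected, because "a" already covers the sub-query that "b" would serve; with lam set to 0.0 the ranking falls back to pure relevance order.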

    How Can Innovation in Urban Agriculture Contribute to Sustainability? A Characterization and Evaluation Study from Five Western European Cities

    Compared to rural agriculture, urban agriculture (UA) has some distinct features (e.g., the limited land access, alternative growing media, unique legal environments or the non-production-related missions) that encourage the development of new practices, i.e., “novelties” or “innovations”. This paper aims to (1) identify the “triggers” for novelty production in UA; (2) characterize the different kinds of novelties applied in UA; (3) evaluate the “innovativeness” of those social, environmental and economic novelties; and (4) estimate the links between novelties and sustainability. The study was based on the evaluation of 11 case studies in four Western European countries (Italy, Germany, France and Spain). The results show that the trigger and origin of new activities can often be traced back to specific problems that initiators intended to address or solve. In total, we found 147 novelties produced in the 11 case studies. More novelties are produced in the environmental and social dimensions of sustainability than in the economic. In most cases, external stakeholders played an important role in supporting the projects. The analysis further suggests that innovativeness enhances the overall sustainability in urban agriculture projects.

    Novelty grammar swarms

    Master's thesis, Informatics Engineering (Information Systems), Universidade de Lisboa, Faculdade de Ciências, 2015.
Particle Swarm Optimization (PSO) is one of the best-known population-based optimization methods. It is normally applied to optimize fitness functions, which indicate how close the algorithm is to reaching the search objective, so that the search focuses on areas of higher fitness. In problems with many local optima, the search regularly becomes stuck in places with high fitness that are not the true objective. To solve this problem in certain domains, this thesis introduces Novelty-driven Particle Swarm Optimization (NdPSO). This algorithm is inspired by novelty search, a relatively recent method that guides the search towards instances significantly different from previous ones. In this way, NdPSO completely ignores the objective and pursues only novelty, which makes it less susceptible to being deceived in problems with many local optima. Since novelty search has shown potential for solving tasks in genetic programming, in particular in grammatical evolution, in this project NdPSO is used as an extension of the Grammatical Swarm method, which is a combination of PSO with genetic programming. The NdPSO implementation is tested in three different domains, representative of those in which this algorithm may be more advantageous than objective-driven algorithms, that is, deceptive domains in which it is relatively intuitive to describe a behavior. In each of the tested domains, NdPSO outperforms the standard PSO algorithm, one of its best-known variants (Barebones PSO), and random search, showing itself to be a promising tool for solving deceptive problems.
Since this is the first application of novelty search outside the evolutionary paradigm, this project also carries out a comparative study of the new algorithm against the most common way of using novelty search (in the form of an evolutionary algorithm).
Particle Swarm Optimization (PSO) is a well-known population-based optimization algorithm. Most often it is applied to optimize fitness functions that specify the goal of reaching a desired objective or behavior. As a result, search focuses on higher-fitness areas. In problems with many local optima, search often becomes stuck, and thus can fail to find the intended objective. To remedy this problem in certain kinds of domains, this thesis introduces Novelty-driven Particle Swarm Optimization (NdPSO). Taking motivation from the novelty search algorithm in evolutionary computation, in this method search is driven only towards finding instances significantly different from those found before. In this way, NdPSO completely ignores the objective in its pursuit of novelty, making it less susceptible to deception and local optima. Because novelty search has previously shown potential for solving tasks in Genetic Programming, particularly in Grammatical Evolution, this paper implements NdPSO as an extension of the Grammatical Swarm method, which in effect is a combination of PSO and Genetic Programming.
The resulting NdPSO implementation was tested in three different domains representative of those in which it might provide advantage over objective-driven PSO, in particular, those which are deceptive and in which a meaningful high-level description of novel behavior is easy to derive. In each of the tested domains NdPSO outperforms both objective-based PSO and random search, demonstrating its promise as a tool for solving deceptive problems.
Since this is the first application of the search for novelty outside the evolutionary paradigm, an empirical comparative study of the new algorithm to a standard novelty search Evolutionary Algorithm is performed.
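For readers unfamiliar with novelty search, the quantity that replaces fitness is usually a sparseness measure in behavior space. The sketch below uses the common definition, the mean distance to the k nearest previously seen behaviors; the abstract does not say how NdPSO scores novelty, so this definition, the Euclidean behavior space, and the name `novelty` are assumptions for illustration only.

```python
import math

def novelty(behavior, others, k=3):
    """Mean distance from `behavior` to its k nearest neighbours in `others`.

    `others` stands for previously seen behaviors (for example an archive
    plus the current swarm); an empty collection yields a score of 0.0.
    """
    nearest = sorted(math.dist(behavior, o) for o in others)[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0

# Hypothetical behaviors described as 2-D points.
archive = [(3.0, 4.0), (0.0, 1.0), (0.0, 2.0)]
score = novelty((0.0, 0.0), archive, k=2)
```

A search driven by this score rewards candidates that land in sparsely visited regions of behavior space, regardless of how close they are to the objective.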