
    Upper Bound Approximations for BlockMaxWand

    BlockMaxWand (BMW) is a recent advance on the Wand dynamic pruning technique, which allows efficient retrieval without any effectiveness degradation to rank K. However, while BMW uses docid-sorted indices, it relies on recording the upper bound of the term weighting model scores for each block of postings in the inverted index. Such a requirement can be disadvantageous in situations such as when an index must be updated. In this work, we examine the appropriateness of upper-bound approximations – which have previously been shown suitable for Wand – in providing efficient retrieval for BMW. Experiments on the ClueWeb12 category B13 corpus using 5000 queries from a real search engine's query log demonstrate that BMW still provides benefits w.r.t. Wand when approximate upper bounds are used, and that, if the approximations of the upper bounds are tight, BMW with approximate upper bounds can provide efficiency gains w.r.t. Wand with exact upper bounds, in particular for queries of short to medium length.
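As an illustrative aside, the core idea of the abstract can be sketched in a few lines: BMW skips a block of postings whenever the summed per-term block upper bounds cannot beat the current top-K threshold, and an *approximate* upper bound stays safe as long as it never underestimates the true block maximum. The function names and the rounding scheme below are illustrative assumptions, not the paper's actual method.

```python
import math

def approx_upper_bound(exact_max, step=0.5):
    # Approximate a block's upper bound by rounding UP to a coarser grid
    # (step is an assumed quantisation granularity). Rounding up keeps the
    # bound safe: it may be loose, but it never underestimates the maximum.
    return math.ceil(exact_max / step) * step

def can_skip_block(bounds, threshold):
    # The block-max check at the heart of BMW: if the summed (possibly
    # approximate) term upper bounds cannot exceed the current top-K entry
    # threshold, no document in the block can enter the top K, so the
    # whole block is skipped without scoring any of its postings.
    return sum(bounds) <= threshold
```

A tighter `step` yields bounds closer to the exact maxima, so more blocks pass the skip test, which is why the abstract reports that tight approximations preserve BMW's gains over Wand.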

    Climate Change Attribution Using Empirical Decomposition of Climatic Data

    The climate change attribution problem is addressed using empirical decomposition. Cycles in solar motion and activity of 60 and 20 years were used to develop an empirical model of Earth temperature variations. The model was fit to the Hadley global temperature data up to 1950 (the time period before anthropogenic emissions became the dominant forcing mechanism), and then extrapolated from 1951 to 2009. After subtraction of the model, the residuals showed an approximately linear upward trend after 1942. Herein we assume that the residual upward warming observed during the second half of the 20th century has been mostly induced by a worldwide rapid increase of anthropogenic emissions, urbanization and land use change. The warming observed before 1942 is relatively small and it is assumed to have been mostly naturally induced by a climatic recovery since the Little Ice Age of the 17th century and the Dalton Minimum at the beginning of the 19th century. The resulting full natural plus anthropogenic model fits the entire 160-year record very well. Residual analysis does not provide any evidence for a substantial cooling effect due to sulfate aerosols from 1940 to 1970. The cooling observed during that period may be due to a natural 60-year cycle, which is visible in the global temperature since 1850 and has been observed also in numerous multisecular climatic records. New solar activity proxy models are developed that suggest a mechanism for both the 60-year climate cycle and a portion of the long-term warming trend. Our results suggest that because current models underestimate the strength of natural multidecadal cycles in the temperature records, the anthropogenic contribution to climate change since 1970 should be around half of that previously claimed by the IPCC [2007]. A 21st century forecast suggests that climate may warm less than 1 °C by 2100.
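The decomposition step described here – fitting fixed-period harmonics plus a trend and inspecting the residuals – reduces to ordinary least squares on sine/cosine regressors. The sketch below applies that generic technique to a synthetic series; the amplitudes, noise level, and periods are assumptions for illustration only, not the paper's data or fit.

```python
import numpy as np

# Illustrative harmonic decomposition (synthetic data, NOT the paper's fit):
# recover a 60-year cycle and a linear trend from a noisy series by ordinary
# least squares on intercept, trend, and sin/cos pairs per assumed period.
t = np.arange(1850, 2010)                        # years, 160 samples
rng = np.random.default_rng(0)
truth = 0.3 * np.cos(2 * np.pi * t / 60.0) + 0.004 * (t - 1850)
y = truth + 0.05 * rng.standard_normal(t.size)   # noisy "temperature anomaly"

# Design matrix: intercept, linear trend, and cos/sin pairs for each cycle.
cols = [np.ones(t.size), t - 1850.0]
for period in (60.0, 20.0):
    cols += [np.cos(2 * np.pi * t / period), np.sin(2 * np.pi * t / period)]
X = np.column_stack(cols)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
amp60 = np.hypot(coef[2], coef[3])   # amplitude of the fitted 60-year cycle
resid = y - X @ coef                 # residuals, as analysed in the abstract
```

With the cycle terms absorbed by the regressors, any forcing not in the model (here, none beyond noise) is left in `resid`, which is the abstract's basis for attributing the post-1942 residual trend.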

    Efficient & Effective Selective Query Rewriting with Efficiency Predictions

    To enhance effectiveness, a user's query can be rewritten internally by the search engine in many ways, for example by applying proximity, or by expanding the query with related terms. However, approaches that benefit effectiveness often have a negative impact on efficiency, which harms user satisfaction if the query is excessively slow. In this paper, we propose a novel framework for using the predicted execution time of various query rewritings to select between alternatives on a per-query basis, in a manner that ensures both effectiveness and efficiency. In particular, we propose the prediction of the execution time of ephemeral (e.g., proximity) posting lists generated from uni-gram inverted index posting lists, which are used in establishing the permissible query rewriting alternatives that may execute in the allowed time. Experiments examining both the effectiveness and efficiency of the proposed approach demonstrate that a 49% decrease in mean response time (and a 62% decrease in 95th-percentile response time) can be attained without significantly hindering the effectiveness of the search engine.
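The per-query selection logic the abstract describes can be sketched as follows: given candidate rewritings with predicted effectiveness gains and predicted execution times, keep only those whose predicted time fits the budget, then take the most effective. The candidate names, gain values, and times below are hypothetical placeholders, not numbers from the paper.

```python
# Hypothetical sketch of selective query rewriting with efficiency
# predictions. Each candidate is (name, predicted_gain, predicted_time_ms);
# the predictors themselves are out of scope here and simply assumed.

def select_rewriting(candidates, budget_ms):
    """Pick the most effective rewriting whose PREDICTED execution time
    fits the per-query budget; fall back to the plain query otherwise."""
    feasible = [c for c in candidates if c[2] <= budget_ms]
    if not feasible:
        return "original"
    return max(feasible, key=lambda c: c[1])[0]

candidates = [
    ("original", 0.00, 40.0),
    ("proximity", 0.03, 120.0),   # ephemeral proximity lists cost extra time
    ("expansion", 0.05, 300.0),   # query expansion costs even more
]
print(select_rewriting(candidates, budget_ms=150.0))  # proximity
```

Under a 150 ms budget the expansion rewriting is predicted to overrun and is excluded, so the proximity rewriting wins; with a 350 ms budget, expansion would be chosen instead.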

    What’s new in e-assessment? From computer-marking to innovative item types


    Queuing theory-based latency/power tradeoff models for replicated search engines

    Large-scale search engines are built upon huge infrastructures involving thousands of computers in order to achieve fast response times. However, the energy consumed (and hence the financial cost) is also high, with consequent environmental impact. This paper proposes new approaches to increase energy and financial savings in large-scale search engines, while maintaining good query response times. We aim to improve current state-of-the-art models used for balancing power and latency, by integrating new advanced features. On one hand, we propose to improve the power savings by completely powering down the query servers that are not necessary when the load of the system is low. In addition, we incorporate energy rates into the model formulation. On the other hand, we focus on how to accurately estimate the latency of the whole system by means of Queueing Theory. Experiments using actual query logs attest to the high energy (and financial) savings with respect to current baselines. To the best of our knowledge, this is the first paper to successfully apply stationary Queueing Theory models to estimate the latency in a large-scale search engine.
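To make the latency/power trade-off concrete, a textbook stationary model such as M/M/c (the paper's model is richer than this) already captures the key decision: estimate mean response time for c active replicated servers, then find the fewest servers that still meet a latency target so the remainder can be powered down. All rates and targets below are assumed example values.

```python
import math

def mmc_response_time(lam, mu, c):
    """Mean response time of an M/M/c queue: arrival rate lam, per-server
    service rate mu, c servers. Returns inf when the system is unstable."""
    rho = lam / (c * mu)
    if rho >= 1.0:
        return math.inf
    a = lam / mu  # offered load in Erlangs
    tail = a**c / (math.factorial(c) * (1 - rho))
    denom = sum(a**k / math.factorial(k) for k in range(c)) + tail
    erlang_c = tail / denom          # probability an arriving query waits
    return erlang_c / (c * mu - lam) + 1.0 / mu   # waiting + service time

def min_servers(lam, mu, target, c_max):
    # Fewest active servers meeting the latency target; the other
    # c_max - c servers could be powered down at low load.
    for c in range(1, c_max + 1):
        if mmc_response_time(lam, mu, c) <= target:
            return c
    return c_max
```

For example, at 80 queries/s with servers handling 10 queries/s each and a 150 ms mean-response-time target, nine servers are the bare stability minimum but still overrun the target, while ten suffice, so six of sixteen could be powered down.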

    Designing Frameworks to Deliver Unknown Information to Support MBIs

    This paper reports on a Catchment Modelling Framework (CMF) designed to support an Australian pilot of an auction for multiple environmental outcomes (EcoTender). The CMF is used to estimate multiple environmental outcomes including carbon, terrestrial biodiversity, aquatic function (water quality and quantity) and saline land area. This information was previously unavailable for application to environmental markets. This is the first time a market-based policy has been fully integrated from desk to field with a Catchment Modelling Framework for the purchase of multiple outcomes. This framework solves the unknown information problem of linking paddock-scale land use and management to catchment-scale environmental outcomes. The framework provides the Victorian government with a replicable, transparent, evidence-based approach to the procurement of environmental outcomes.

    Efficient query processing for scalable web search

    Search engines are exceptionally important tools for accessing information in today’s world. In satisfying the information needs of millions of users, the effectiveness (the quality of the search results) and the efficiency (the speed at which the results are returned to the users) of a search engine are two goals that form a natural trade-off, as techniques that improve the effectiveness of the search engine can also make it less efficient. Meanwhile, search engines continue to rapidly evolve, with larger indexes, more complex retrieval strategies and growing query volumes. Hence, there is a need for the development of efficient query processing infrastructures that make appropriate sacrifices in effectiveness in order to make gains in efficiency. This survey comprehensively reviews the foundations of search engines, from index layouts to basic term-at-a-time (TAAT) and document-at-a-time (DAAT) query processing strategies, while also providing the latest trends in the literature on efficient query processing, including coherent and systematic reviews of techniques such as dynamic pruning and impact-sorted posting lists as well as their variants and optimisations. Our explanations of query processing strategies, for instance the WAND and BMW dynamic pruning algorithms, are presented with illustrative figures showing how the processing state changes as the algorithms progress. Moreover, acknowledging the recent trends in applying a cascading infrastructure within search systems, this survey describes techniques for efficiently integrating effective learned models, such as those obtained from learning-to-rank techniques. The survey also covers the selective application of query processing techniques, often achieved by predicting the response times of the search engine (known as query efficiency prediction), and making per-query tradeoffs between efficiency and effectiveness to ensure that the required retrieval speed targets can be met.
Finally, the survey concludes with a summary of open directions in efficient search infrastructures, namely the use of signatures, real-time search, energy efficiency, and modern hardware and software architectures.
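To illustrate the DAAT strategy the survey builds on, the sketch below scores one document at a time: all docid-sorted posting lists are advanced together, each document is fully scored before the next, and a min-heap keeps the current top K. The posting-list representation is an illustrative assumption (dicts of docid to term score), and no pruning is applied.

```python
import heapq

# Minimal exhaustive DAAT sketch (no dynamic pruning): advance all
# docid-sorted posting lists together, fully scoring one document at a
# time, and keep the K best totals in a min-heap.

def daat_topk(postings, k):
    iters = [sorted(p.items()) for p in postings]   # docid-sorted postings
    pos = [0] * len(iters)                          # one cursor per list
    heap = []                                       # min-heap of (score, docid)
    while True:
        # Next document to score: smallest docid under any cursor.
        cur = min((it[i][0] for it, i in zip(iters, pos) if i < len(it)),
                  default=None)
        if cur is None:
            break
        score = 0.0
        for j, it in enumerate(iters):              # sum term contributions
            if pos[j] < len(it) and it[pos[j]][0] == cur:
                score += it[pos[j]][1]
                pos[j] += 1                         # advance matching cursors
        heapq.heappush(heap, (score, cur))
        if len(heap) > k:
            heapq.heappop(heap)                     # evict the current minimum
    return sorted(heap, key=lambda sd: -sd[0])      # best first

postings = [{1: 3.0, 3: 1.0}, {1: 2.0, 2: 4.0}]
print(daat_topk(postings, 2))  # [(5.0, 1), (4.0, 2)]
```

Dynamic pruning techniques such as WAND and BMW, reviewed in the survey, improve on this exhaustive traversal by using term upper bounds to skip documents (or whole blocks) that cannot enter the top K.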