Static index pruning in web search engines

Author: Altıngövde İsmail Sengör
Ozcan Rifat
Ulusoy Özgür
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/02/2012

Static index pruning techniques permanently remove a presumably redundant part of an inverted file in order to reduce the file size and query processing time. These techniques differ in how they decide which parts of an index can be removed safely, that is, without changing the top-ranked query results. As defined in the literature, the query view of a document is the set of query terms that access this particular document, that is, that retrieve the document among their top results. In this paper, we first propose using query views to improve the quality of the top results compared against the original results. We incorporate query views into a number of static pruning strategies, namely the term-centric, document-centric, term popularity based and document access popularity based approaches, and show that the new strategies considerably outperform their counterparts, especially at higher levels of pruning and for both disjunctive and conjunctive query processing. Additionally, we combine the notions of term and document access popularity to form new pruning strategies, and further extend these strategies with query views. The new strategies improve result quality especially for conjunctive query processing, which is the default and most common search mode of a search engine.

OpenMETU (Middle East Technical University)

Evolution of web search results within years

Author: Altıngövde İsmail Sengör
Ozcan Rifat
Ulusoy Özgür
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2011

We provide a first large-scale analysis of the evolution of query results obtained from a real search engine at two distant points in time, namely 2007 and 2010, for a set of 630,000 real queries.

Crossref

OpenMETU (Middle East Technical University)

HIV/AIDS, demography and development: individual choices versus public policies in SSA

Author: A. Young
Akira Yakita
Alwyn Young
Anatole Romaniuk
Andrew N Phillips
D. Fassin
Daniel T. Halperin
David de la Croix
Douglas Gollin
Edward C. Green
Jacob Levi
John Bongaarts
Jonathan Guryan
Joydeep Bhattacharya
Klaus Prettner
Luciano Fanti
Markus Haacker
Massimo Livi-Bacci
N P Simelela
Oded Galor
Paul Collier
Peter Lorentzen
Raph L Hamers
Rifat Atun
Sebnem Kalemli-Ozcan
Shankha Chakraborty
Stephen Resch
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Despite the increasing rate of diffusion of effective therapies, the battle against HIV/AIDS in Sub-Saharan Africa (SSA) is far from over. Three main challenges are the risk that the epidemics might paralyse or reverse the fertility transition, the expansion of the resources needed to finance the fight against HIV, and the emerging resistance to anti-retroviral treatments. This research proposes a UGT-like model showing the complexity of the interplay amongst the (macro)economy, the epidemics, their endogenous feedback on mortality and fertility, and the central role of policy actions aimed at fighting HIV. The disease-induced increase in adult mortality can hamper economic development through its upward pressure on the precautionary demand for children and its downward pressure on education. This can dramatically reduce physical and human capital accumulation.

Crossref

Archivio della ricerca - Università degli studi di Napoli Federico II

Archivio della Ricerca - Università di Pisa

Concept-based Information Retrieval Using Ontologies and Latent Semantic Analysis

Author: Rifat Ozcan
Y. Alp Aslandogan

CiteSeerX

A Cost-Aware Strategy for Query Result Caching in Web Search Engines

Author: Altıngövde İsmail Sengör
Ozcan Rifat
Ulusoy Özgür
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Search engines and large-scale IR systems need to cache query results for efficiency and scalability. In this study, we propose to explicitly incorporate query costs into the static caching policy. To this end, a query’s cost is represented by its execution time, which involves the CPU time needed to decompress the postings and compute the query-document similarities to obtain the final top-N answers. Simulation results using large Web crawl data and a real query log reveal that the proposed strategy improves overall system performance in terms of total query execution time.
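The idea of folding query cost into a static caching policy can be illustrated with a minimal sketch. The abstract does not say how frequency and cost are combined, so the frequency-times-cost score below is an assumption, as are the names `select_static_cache` and `total_execution_time` and the simplification that a cache hit costs nothing.

```python
from typing import Dict, Set, Tuple

# query -> (frequency in the training log, execution time in ms)
QueryStats = Dict[str, Tuple[int, float]]


def select_static_cache(stats: QueryStats, capacity: int) -> Set[str]:
    """Pick `capacity` queries for a static cache.

    Assumption: a query's value is frequency * execution time, i.e. the
    total time saved if every occurrence becomes a cache hit.
    """
    ranked = sorted(stats, key=lambda q: stats[q][0] * stats[q][1], reverse=True)
    return set(ranked[:capacity])


def total_execution_time(stats: QueryStats, cached: Set[str]) -> float:
    """Total time spent on queries that miss the cache (hits assumed free)."""
    return sum(freq * cost for q, (freq, cost) in stats.items() if q not in cached)


stats = {"a": (100, 1.0), "b": (10, 50.0), "c": (50, 4.0)}
cost_aware = select_static_cache(stats, capacity=1)
frequency_only = {max(stats, key=lambda q: stats[q][0])}
print(cost_aware, total_execution_time(stats, cost_aware))
print(frequency_only, total_execution_time(stats, frequency_only))
```

In this toy log, a frequency-only policy would cache the cheap but popular query `a`, whereas the cost-aware score prefers the rarer but expensive query `b` and leaves less total execution time uncached.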

OpenMETU (Middle East Technical University)

Static query result caching revisited

Author: Ismail Sengor Altingovde
Rifat Ozcan
Özgür Ulusoy
Publication date: 15/12/2008

Query result caching is an important mechanism for search engine efficiency. In this study, we first review several query features that are used to determine the contents of a static result cache. Next, we introduce a new feature that more accurately represents the popularity of a query by measuring the stability of query frequency over a set of time intervals. Experimental results show that this new feature achieves better hit ratios than the previously proposed features.

CiteSeerX

OpenMETU (Middle East Technical University)

Space efficient caching of query results in search engines

Author: Altıngövde İsmail Sengör
Ozcan Rifat
Ulusoy Özgür
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

Web search engines serve millions of query requests per day. Caching query results is one of the most crucial mechanisms for coping with such a demanding load. In this paper, we propose an efficient storage model to cache document identifiers of query results. Essentially, we first cluster queries that have common result documents. Next, for each cluster, we attempt to store those common document identifiers in a more compact manner. Experimental results reveal that the proposed storage model achieves space reduction of up to 4%. The proposed model is envisioned to improve the cache hit rate and system throughput, as it allows storing more query results within a particular cache space in return for a negligible increase in the cost of preparing the final query result page.
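A rough sketch of the second step, storing the common document identifiers of one cluster only once, might look as follows. It assumes the cluster has already been formed and ignores result ranking order; the function names and the toy data are hypothetical and this is not the storage model of the paper.

```python
from typing import Dict, List, Set, Tuple


def compact_cluster(results: Dict[str, List[int]]) -> Tuple[List[int], Dict[str, List[int]]]:
    """Split one query cluster into shared doc ids and per-query leftovers."""
    if not results:
        return [], {}
    # Document ids that appear in the result of every query in the cluster.
    common = set.intersection(*(set(ids) for ids in results.values()))
    rest = {q: [d for d in ids if d not in common] for q, ids in results.items()}
    return sorted(common), rest


def expand(common: List[int], rest: Dict[str, List[int]], query: str) -> Set[int]:
    """Rebuild the (unordered) result set of a query from the compact form."""
    return set(common) | set(rest[query])


cluster = {"cheap flights": [1, 2, 3, 4], "flights cheap": [2, 3, 4, 5]}
common, rest = compact_cluster(cluster)
stored_ids = len(common) + sum(len(r) for r in rest.values())
original_ids = sum(len(ids) for ids in cluster.values())
print(common, rest, stored_ids, original_ids)
```

Rebuilding a result requires merging the shared list with the per-query remainder, which corresponds to the small extra cost of preparing the result page mentioned in the abstract.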

Crossref

OpenMETU (Middle East Technical University)

A practitioner’s guide for static index pruning

Author: Ismail Sengor Altingovde
Rifat Ozcan
Özgür Ulusoy
Publication date: 01/01/2009

We compare the term- and document-centric static index pruning approaches as described in the literature and investigate their sensitivity to the scoring functions employed during the pruning and actual retrieval stages.

1 Static Inverted Index Pruning

Static index pruning permanently removes some information from the index in order to save disk space and improve query processing efficiency. Several approaches to static index pruning have been investigated in the literature. Among those methods, term-centric pruning (referred to as TCP hereafter), proposed in [3], is shown to be very successful at keeping the top-k (k≤30) answers almost unchanged for the queries while significantly reducing the index size. In a nutshell, TCP scores (using the Smart’s TFIDF function) and sorts the postings of each term in the collection and removes the tail of the list according to some decision criteria. In [1], BM25 is employed instead of the TFIDF function during the pruning and retrieval stages. That study shows that, by tuning the pruning algorithm according to the score function, it is possible to further boost the performance.
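The score, sort and cut-the-tail pattern described for TCP can be sketched roughly as below. The posting scores are assumed to be precomputed by some scoring function, and the fixed `keep_fraction` cutoff is a hypothetical stand-in for the decision criteria, which the abstract does not specify.

```python
from typing import Dict, List, Tuple

Posting = Tuple[int, float]  # (doc_id, precomputed term-document score)


def prune_term_centric(
    index: Dict[str, List[Posting]], keep_fraction: float = 0.5
) -> Dict[str, List[Posting]]:
    """Keep only the highest-scoring postings of each term's list.

    Assumption: the "tail" is everything beyond the top `keep_fraction`
    of a list, with at least one posting kept per non-empty list.
    """
    pruned: Dict[str, List[Posting]] = {}
    for term, postings in index.items():
        # Sort by score, best first, so that the tail holds the weakest postings.
        ranked = sorted(postings, key=lambda p: p[1], reverse=True)
        keep = max(1, int(len(ranked) * keep_fraction)) if ranked else 0
        # Restore doc_id order, as is usual for posting lists.
        pruned[term] = sorted(ranked[:keep], key=lambda p: p[0])
    return pruned


index = {
    "web": [(1, 0.9), (2, 0.1), (3, 0.5), (4, 0.3)],
    "rare": [(7, 0.2)],
}
print(prune_term_centric(index, keep_fraction=0.5))
```

Changing how the scores are computed (for example TFIDF versus BM25) changes which postings land in the tail, which is the sensitivity the abstract refers to.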

CiteSeerX

Crossref