Static index pruning in web search engines: Combining term and document popularities with query views

Author: Ozcan R.
Sengor Altingovde I.
Ulusoy O.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/02/2012

Static index pruning techniques permanently remove a presumably redundant part of an inverted file to reduce the file size and query processing time. These techniques differ in how they decide which parts of an index can be removed safely, that is, without changing the top-ranked query results. As defined in the literature, the query view of a document is the set of query terms that access this particular document, that is, that retrieve it among the top results. In this paper, we first propose using query views to improve the quality of the top results compared against the original results. We incorporate query views into a number of static pruning strategies, namely the term-centric, document-centric, term popularity based, and document access popularity based approaches, and show that the new strategies considerably outperform their counterparts, especially at higher levels of pruning and for both disjunctive and conjunctive query processing. Additionally, we combine the notions of term and document access popularity to form new pruning strategies, and further extend these strategies with query views. The new strategies improve the result quality especially for conjunctive query processing, which is the default and most common search mode of a search engine.
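
The abstract does not spell out how query views enter a pruning strategy, so the following is only a minimal sketch of one plausible reading: a term-centric pruner keeps the highest-scoring fraction of each posting list and additionally protects any posting whose term belongs to the document's query view. The function names, the `(doc, score)` posting layout, and the `keep_fraction` parameter are assumptions made for illustration, not the paper's implementation.

```python
from collections import defaultdict


def build_query_views(logged_results):
    """Map each document to the set of query terms that retrieved it.

    logged_results: iterable of (query_terms, top_docs) pairs taken from a
    hypothetical query log with the top results of each query.
    """
    views = defaultdict(set)
    for query_terms, top_docs in logged_results:
        for doc in top_docs:
            views[doc].update(query_terms)
    return views


def prune_term_centric(index, views, keep_fraction):
    """Keep the best postings per term, plus postings protected by a query view.

    index: term -> list of (doc, score) postings (assumed layout).
    """
    pruned = {}
    for term, postings in index.items():
        ranked = sorted(postings, key=lambda p: p[1], reverse=True)
        k = max(1, int(len(ranked) * keep_fraction))
        kept = ranked[:k]
        # Protect low-scoring postings whose term is in the document's query view.
        for doc, score in ranked[k:]:
            if term in views.get(doc, ()):
                kept.append((doc, score))
        pruned[term] = kept
    return pruned


index = {"web": [("d1", 3.0), ("d2", 2.0), ("d3", 0.5), ("d4", 0.1)]}
views = build_query_views([(("web", "search"), ["d4"])])
pruned = prune_term_centric(index, views, keep_fraction=0.5)
```

In this toy run, `d4` has a low score for "web" but survives pruning because a logged query containing "web" retrieved it, while `d3` is dropped.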

Bilkent University Institutional Repository

Cluster searching strategies for collaborative recommendation systems

Author: Sengor Altingovde I.
Subakan O. N.
Ulusoy O.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

In-memory nearest neighbor computation is a typical collaborative filtering approach for high recommendation accuracy. However, this approach is not scalable given the huge number of customers and items in typical commercial applications. Cluster-based collaborative filtering techniques can remedy the efficiency problem, but they usually provide relatively lower accuracy, since they may become over-generalized and produce less-personalized recommendations. Our research explores an individualistic strategy that initially clusters the users and then, during the recommendation generation stage, exploits the members within clusters rather than just the cluster representatives. We provide an efficient implementation of this strategy by adapting a specifically tailored cluster-skipping inverted index structure. Experimental results reveal that the individualistic strategy with the cluster-skipping index is a good compromise that yields high accuracy and reasonable scalability. © 2012 Elsevier Ltd. All rights reserved.
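
As a rough illustration of the individualistic idea described above, the sketch below first picks the cluster whose centroid is closest to the target user and then computes similarities only against the individual members of that cluster. It assumes the clusters and centroids are already given and uses plain dictionaries instead of the cluster-skipping inverted index; all names and parameters are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (item -> rating)."""
    dot = sum(r * v[i] for i, r in u.items() if i in v)
    nu = math.sqrt(sum(r * r for r in u.values()))
    nv = math.sqrt(sum(r * r for r in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def predict(target, item, ratings, clusters, centroids, n_clusters=1, k=2):
    """Predict a rating from the k most similar members of the nearest clusters."""
    user_vec = ratings[target]
    nearest = sorted(
        centroids, key=lambda c: cosine(user_vec, centroids[c]), reverse=True
    )[:n_clusters]
    candidates = []
    for c in nearest:
        # Use the individual members, not only the cluster representative.
        for member in clusters[c]:
            if member != target and item in ratings[member]:
                candidates.append((cosine(user_vec, ratings[member]), member))
    top = sorted(candidates, reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total <= 0:
        return None  # no usable neighbor rated this item
    return sum(sim * ratings[m][item] for sim, m in top) / total


ratings = {
    "u1": {"a": 5, "b": 4},
    "u2": {"a": 5, "b": 4, "c": 2},
    "u3": {"a": 4, "b": 5, "c": 4},
    "u4": {"x": 5, "c": 1},
}
clusters = {0: ["u1", "u2", "u3"], 1: ["u4"]}
centroids = {0: {"a": 4.7, "b": 4.3, "c": 3.0}, 1: {"x": 5.0, "c": 1.0}}
prediction = predict("u1", "c", ratings, clusters, centroids)
```

Restricting the neighbor search to one cluster is what saves work here; the members' own ratings, rather than the centroid's, are what keep the prediction personalized.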

Bilkent University Institutional Repository

OpenMETU (Middle East Technical University)

Cache-based query processing for search engines

Author: Cambazoglu B. B.
Ozcan R.
Sengor Altingovde I.
Ulusoy O.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2012

In practice, a search engine may fail to serve a query for various reasons, such as hardware or network failures, excessive query load, lack of matching documents, or service contract limitations (e.g., the query rate limits for third-party users of a search service). In such scenarios, where the backend search system is unable to generate answers to queries, approximate answers can be generated by exploiting the previously computed query results available in the result cache of the search engine. In this work, we propose two alternative strategies to implement this cache-based query processing idea. The first strategy aggregates the results of similar queries that are previously cached in order to create synthetic results for new queries. The second strategy forms an inverted index over the textual information (i.e., query terms and result snippets) present in the result cache and uses this index to answer new queries. Both approaches achieve reasonable result quality compared to processing queries with an inverted index built on the collection. © 2012 ACM.
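
The first strategy above (aggregating the cached results of similar queries) can be sketched as follows. The abstract does not state which similarity measure or aggregation rule is used, so Jaccard similarity over query terms and a similarity-weighted reciprocal-rank score are assumptions chosen for illustration, as are the function names and the `min_sim` threshold.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of terms."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def synthesize_results(query, cache, k=10, min_sim=0.3):
    """Build an approximate result list for `query` from cached results.

    cache: cached query string -> ranked list of document ids.
    """
    q_terms = query.split()
    scores = {}
    for cached_query, results in cache.items():
        sim = jaccard(q_terms, cached_query.split())
        if sim < min_sim:
            continue  # ignore cached queries that are too dissimilar
        for rank, doc in enumerate(results, start=1):
            # Documents ranked high by similar queries accumulate more score.
            scores[doc] = scores.get(doc, 0.0) + sim / rank
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


cache = {
    "static index pruning": ["a", "b"],
    "index pruning web": ["b", "c"],
    "cooking recipes": ["z"],
}
approx = synthesize_results("index pruning", cache)
```

Document `b` appears in the cached results of both similar queries, so it ends up ranked first, while the unrelated cached query contributes nothing.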

Bilkent University Institutional Repository

OpenMETU (Middle East Technical University)

A five-level static cache architecture for web search engines

Author: Barla Cambazoglu B.
Junqueira F.P.
Ozcan R.
Sengor Altingovde I.
Ulusoy Ö.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

Caching is a crucial performance component of large-scale web search engines, as it greatly helps reduce average query response times and query processing workloads on backend search clusters. In this paper, we describe a multi-level static cache architecture that stores five different item types: query results, precomputed scores, posting lists, precomputed intersections of posting lists, and documents. Moreover, we propose a greedy heuristic to prioritize items for caching, based on gains computed from the items' past access frequencies, estimated computational costs, and storage overheads. This heuristic takes into account the inter-dependency between individual items when making its caching decisions, i.e., after a particular item is cached, the gains of all items affected by this decision are updated. Our simulations under realistic assumptions reveal that the proposed heuristic performs better than dividing the entire cache space among particular item types at fixed proportions. © 2010 Elsevier Ltd. All rights reserved.
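
The greedy selection with gain updates described above might look roughly like the sketch below. The gain formula (frequency times cost divided by size) and the dependency model (caching an item absorbs some of the accesses to the items it covers) are simplifying assumptions for illustration; the paper's actual gain computation is not given in the abstract, and all names are hypothetical.

```python
def greedy_cache(items, covers, capacity):
    """Greedily fill a static cache, updating gains of dependent items.

    items:  name -> {"freq": accesses, "cost": compute cost, "size": storage}
    covers: name -> {other_name: accesses to `other_name` made unnecessary
                     once `name` is cached}  (assumed dependency model)
    """
    freq = {name: item["freq"] for name, item in items.items()}
    remaining = set(items)
    cached, used = [], 0

    def gain(name):
        return freq[name] * items[name]["cost"] / items[name]["size"]

    while remaining:
        fits = [n for n in remaining if used + items[n]["size"] <= capacity]
        if not fits:
            break
        best = max(fits, key=lambda n: (gain(n), n))
        if gain(best) <= 0:
            break
        cached.append(best)
        used += items[best]["size"]
        remaining.discard(best)
        # Update the gains of items affected by this caching decision.
        for other, absorbed in covers.get(best, {}).items():
            if other in remaining:
                freq[other] = max(0, freq[other] - absorbed)
    return cached


items = {
    "result:q1": {"freq": 10, "cost": 5, "size": 1},
    "list:t1": {"freq": 12, "cost": 2, "size": 2},
    "list:t2": {"freq": 4, "cost": 3, "size": 2},
}
covers = {"result:q1": {"list:t1": 10}}
chosen = greedy_cache(items, covers, capacity=3)
```

In this toy setup, caching the result of `q1` removes most accesses to the posting list of `t1`, so the remaining space goes to `t2` instead. Without the update step, `t1` would have been chosen.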

Bilkent University Institutional Repository

Ottoman archives explorer

Author: Allam M.
Altingovde I. S.
Beitzel S. M.
Ismail Sengor Altingovde
Ismet Zeki Yalniz
Kilic N.
Oard D.
Uğur Güdükbay
Özgür Ulusoy
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

The impact of solid state drive on search engine cache management

Author: Altingovde I. Sengor
Baeza-Yates Ricardo
Bernstein Philip A.
Chen Shimin
Debnath Biplob
Kawaguchi Atsuo
Kim Hyojun
Ma Ruyue
Park Stan
Ricardo
Saxena Mohit
Seo Euiseong
Wu Chin
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref