
    Document replication strategies for geographically distributed web search engines

    Large-scale web search engines are composed of multiple data centers that are geographically distant from one another. Typically, a user query is processed in a data center that is geographically close to the origin of the query, over a replica of the entire web index. Compared to a centralized, single-center search engine, this architecture offers lower query response times, as the network latencies between users and data centers are reduced. However, it does not scale well with increasing index sizes and query traffic volumes, because queries are evaluated on the entire web index, which has to be replicated and maintained in all data centers. As a remedy to this scalability problem, we propose a document replication framework in which documents are selectively replicated across data centers based on regional user interests. Within this framework, we propose three document replication strategies, each optimizing a different objective: reducing the potential search quality loss, the average query response time, or the total query workload of the search system. For all three strategies, we consider two alternative types of capacity constraints on the index sizes of data centers. Moreover, we investigate the performance impact of query forwarding and result caching. We evaluate our strategies via detailed simulations, using a large query log and a document collection obtained from the Yahoo! web search engine.
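    To make the selective-replication idea concrete, the toy sketch below greedily assigns documents to data centers by regional interest score, subject to a per-center index capacity. It is an illustrative simplification, not the paper's method; the document ids, center names, scores, and capacities are all made up.

        # Toy sketch (not the paper's algorithm): replicate each document to
        # the data centers whose regional users are most interested in it,
        # while respecting each center's index-size capacity.
        def replicate(centers, interest, capacity):
            placement = {c: set() for c in centers}
            # Visit (doc, center) pairs from highest regional interest down.
            for doc, center in sorted(interest, key=interest.get, reverse=True):
                if len(placement[center]) < capacity[center]:
                    placement[center].add(doc)
            return placement

        interest = {("d1", "us-east"): 9, ("d1", "eu-west"): 2,   # hypothetical
                    ("d2", "eu-west"): 7, ("d2", "us-east"): 1,   # regional
                    ("d3", "us-east"): 5, ("d3", "eu-west"): 4}   # scores
        capacity = {"us-east": 2, "eu-west": 2}                   # index caps
        print(replicate(["us-east", "eu-west"], interest, capacity))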

    ILP Modeling of Many-to-Many Replicated Multimedia Communication, Journal of Telecommunications and Information Technology, 2013, no. 3

    On-line communication services have evolved from simple text-based chats towards sophisticated video-presence appliances. The bandwidth consumption of these services is growing constantly, driven by technology development and rising user and business needs, which motivates building optimization mechanisms into multimedia communication scenarios. In this paper, the authors concentrate on many-to-many (m2m) communication, whose importance is mainly driven by the growing popularity of on-line conferences and telepresence applications. Two models are formulated: an overlay model, in which m2m flows are optimally established on top of a given set of network routes, and a joint model, in which the network routes and the m2m flows are optimized together. In both models, the traffic traverses replica servers, which are responsible for stream aggregation and compression. Variants with predefined replica locations and with optimized server placement are presented. Each model is accompanied by a comprehensive description and is based on real teleconferencing systems.
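    As a hedged illustration of the kind of overlay ILP the abstract points to, the sketch below assigns each m2m stream to one replica server so as to minimize a total bandwidth cost, subject to server capacities. It is a toy formulation under assumed data (stream names, costs, capacities), not the authors' model, and uses the open-source PuLP library.

        # Toy ILP sketch (not the authors' model): route each m2m stream
        # through exactly one replica server at minimum total assumed cost.
        from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

        streams = ["s1", "s2", "s3"]               # hypothetical m2m streams
        servers = ["r1", "r2"]                     # hypothetical replicas
        cost = {("s1", "r1"): 4, ("s1", "r2"): 6,  # made-up bandwidth costs
                ("s2", "r1"): 5, ("s2", "r2"): 3,
                ("s3", "r1"): 2, ("s3", "r2"): 7}
        cap = {"r1": 2, "r2": 2}                   # aggregation capacity

        prob = LpProblem("m2m_replica_assignment", LpMinimize)
        x = {(s, r): LpVariable(f"x_{s}_{r}", cat=LpBinary)
             for s in streams for r in servers}
        prob += lpSum(cost[s, r] * x[s, r] for s in streams for r in servers)
        for s in streams:          # every stream uses exactly one replica
            prob += lpSum(x[s, r] for r in servers) == 1
        for r in servers:          # respect each replica's capacity
            prob += lpSum(x[s, r] for s in streams) <= cap[r]
        prob.solve()
        print([(s, r) for (s, r) in x if x[s, r].value() == 1])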

    An Overview of Data Replication on the Internet

    The proliferation of the Internet is leading to high expectations of fast turnaround times. Clients who abandon their connections due to excessive download delays translate directly into lost profit. Hence, minimizing the latency perceived by end users has become the primary performance objective, ahead of more traditional concerns such as server utilization. Two promising techniques for improving Internet responsiveness are caching and replication. In this paper we present an overview of recent research in replication. We begin by arguing for the important role of replication in decreasing client-perceived response time and proceed by illustrating the main issues that affect its successful deployment on the Internet. We analyze and characterize existing research, providing taxonomies and classifications wherever possible. Our discussion reveals several open problems and research directions.
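    As a minimal sketch of the caching half of that picture, the class below keeps a time-to-live (TTL) cache of fetched pages, so repeated requests are served locally instead of refetching from the origin; the fetch callback and the 30-second TTL are illustrative placeholders, not details from the survey.

        # Minimal TTL-cache sketch: answer repeated requests from memory to
        # cut client-perceived latency. The fetch callback and 30 s TTL are
        # illustrative placeholders.
        import time

        class TTLCache:
            def __init__(self, ttl_seconds=30.0):
                self.ttl = ttl_seconds
                self.store = {}                  # url -> (expiry, body)

            def get(self, url, fetch_origin):
                expiry, body = self.store.get(url, (0.0, None))
                if time.monotonic() < expiry:    # fresh copy: cache hit
                    return body
                body = fetch_origin(url)         # miss or stale: go to origin
                self.store[url] = (time.monotonic() + self.ttl, body)
                return body

        cache = TTLCache()
        page = cache.get("http://example.com/", lambda u: f"<html>{u}</html>")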