73 research outputs found

    Document replication strategies for geographically distributed web search engines

    Large-scale web search engines are composed of multiple data centers that are geographically distant from each other. Typically, a user query is processed in a data center that is geographically close to the origin of the query, over a replica of the entire web index. Compared to a centralized, single-center search engine, this architecture offers lower query response times because the network latencies between the users and data centers are reduced. However, it does not scale well with increasing index sizes and query traffic volumes because queries are evaluated on the entire web index, which has to be replicated and maintained in all data centers. As a remedy to this scalability problem, we propose a document replication framework in which documents are selectively replicated on data centers based on regional user interests. Within this framework, we propose three different document replication strategies, each optimizing a different objective: reducing the potential search quality loss, the average query response time, or the total query workload of the search system. For all three strategies, we consider two alternative types of capacity constraints on index sizes of data centers. Moreover, we investigate the performance impact of query forwarding and result caching. We evaluate our strategies via detailed simulations, using a large query log and a document collection obtained from the Yahoo! web search engine. (C) 2012 Elsevier Ltd. All rights reserved.
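
The idea of replicating documents selectively under a per-data-center index capacity can be illustrated with a minimal sketch. It is not the method from the article above: the function name `select_replicas`, the hit-count input, and the "keep the most requested documents" rule are all assumptions made for illustration.

```python
def select_replicas(regional_hits, capacity):
    """Pick which documents each data center indexes locally.

    regional_hits: {data_center: {doc_id: hit_count}} (hypothetical input,
                   e.g. counts derived from a regional query log)
    capacity:      maximum number of documents per data center (assumed
                   to be the same for every data center)
    """
    placement = {}
    for data_center, hits in regional_hits.items():
        # Rank by descending regional interest; break ties by doc id
        # so the result is deterministic.
        ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
        placement[data_center] = [doc for doc, _ in ranked[:capacity]]
    return placement


if __name__ == "__main__":
    hits = {
        "eu": {"a": 5, "b": 1, "c": 3},
        "us": {"a": 1, "b": 9},
    }
    print(select_replicas(hits, capacity=2))
```

Queries whose matching documents are not held locally would need to be forwarded to another data center, which is where the query forwarding mentioned in the abstract comes in.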

    Document Archiving, Replication and Migration Container for Mobile Web Users

    With the increasing use of mobile workstations for a wide variety of tasks and associated information needs, and with many variations of available networks, access to data becomes a prime consideration. This paper discusses issues of workstation mobility and proposes a solution wherein the data structures are accessed in an encapsulated form, through the Portable File System (PFS) wrapper. The paper discusses an implementation of the Portable File System, highlighting the architecture and commenting on the performance of an experimental system. Although investigations have focused on mobile access to WWW documents, this technique could be applied to any mobile data access situation. Comment: 5 pages

    Adaptive Replicated Web Documents.

    Caching and replication techniques can improve latency of the Web, while reducing network traffic and balancing load among servers. However, no single strategy is optimal for replicating all documents. Depending on its access pattern, each document should use the policy that suits it best. This paper presents an architecture for adaptive replicated documents. Each adaptive document monitors its access pattern, and uses it to determine which strategy it should follow. When a change is detected in its access pattern, it re-evaluates its strategy to adapt to the new conditions. Adaptation comes at an acceptable cost considering the benefits of per-document replication strategies. (Vrije Universiteit, Faculty of Mathematics and Computer Science.) Introduction: Most Web users suffer from slow document transfers. The reasons for such high latencies include distance between the user and the document, and load of the intermediate network. One common solution is to maintain copies of ..
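
A document that monitors its own access pattern and re-evaluates its strategy could look roughly like the sketch below. The three strategy names, the per-operation costs in `COSTS`, and the class `AdaptiveDocument` are hypothetical; the abstract does not say which strategies or cost model the paper uses.

```python
from dataclasses import dataclass

# Hypothetical per-operation costs in arbitrary units (not from the paper).
COSTS = {
    "no_replication":   {"read": 10.0, "write": 1.0},
    "caching":          {"read": 3.0,  "write": 4.0},
    "full_replication": {"read": 1.0,  "write": 12.0},
}


@dataclass
class AdaptiveDocument:
    reads: int = 0
    writes: int = 0
    strategy: str = "no_replication"

    def record(self, operation):
        """Count one observed access ('read' or 'write')."""
        if operation == "read":
            self.reads += 1
        elif operation == "write":
            self.writes += 1
        else:
            raise ValueError(f"unknown operation: {operation!r}")

    def reevaluate(self):
        """Switch to the cheapest strategy for the observed accesses,
        then reset the counters for the next observation window."""
        def cost(name):
            c = COSTS[name]
            return self.reads * c["read"] + self.writes * c["write"]

        self.strategy = min(COSTS, key=cost)
        self.reads = self.writes = 0
        return self.strategy


if __name__ == "__main__":
    doc = AdaptiveDocument()
    for _ in range(100):
        doc.record("read")
    doc.record("write")
    print(doc.reevaluate())
```

Under these assumed costs, a read-heavy document moves toward replication and a write-heavy one moves away from it, which is the per-document behaviour the abstract argues for.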

    Object Distribution Networks for World-wide Document Circulation

    This paper presents an Object Distribution System (ODS), a distributed system inspired by the ultra-large-scale distribution models used in everyday life (e.g. food or newspaper distribution chains). Beyond traditional mechanisms for bringing information closer to readers (e.g. caching and mirroring), this system enables the publication and classification of, and subscription to, volumes of objects (e.g. documents, events). Authors submit their contents to publication agents. Classification authorities provide classification schemes to classify objects. Readers subscribe to topics or authors, and retrieve contents from their local delivery agent (like a kiosk or library, with local copies of objects). Object distribution is an independent process where objects circulate asynchronously among distribution agents. ODS is designed to perform especially well in an increasingly populated, widespread and complex Internet jungle, using weak consistency replication by object distribution, asynchronous replication, and local access to objects by clients. ODS is based on two independent virtual networks, one dedicated to the distribution (replication) of objects and the other to calculating optimised distribution chains to be applied by the first network.

    Trypanosoma brucei BRCA2 acts in a life cycle-specific genome stability process and dictates BRC repeat number-dependent RAD51 subnuclear dynamics

    Trypanosoma brucei survives in mammals through antigenic variation, which is driven by RAD51-directed homologous recombination of Variant Surface Glycoproteins (VSG) genes, most of which reside in a subtelomeric repository of >1000 silent genes. A key regulator of RAD51 is BRCA2, which in T. brucei contains a dramatic expansion of a motif that mediates interaction with RAD51, termed the BRC repeats. BRCA2 mutants were made in both tsetse fly-derived and mammal-derived T. brucei, and we show that BRCA2 loss has less impact on the health of the former. In addition, we find that genome instability, a hallmark of BRCA2 loss in other organisms, is only seen in mammal-derived T. brucei. By generating cells expressing BRCA2 variants with altered BRC repeat numbers, we show that the BRC repeat expansion is crucial for RAD51 subnuclear dynamics after DNA damage. Finally, we document surprisingly limited co-localization of BRCA2 and RAD51 in the T. brucei nucleus, and we show that BRCA2 mutants display aberrant cell division, revealing a function distinct from BRC-mediated RAD51 interaction. We propose that BRCA2 acts to maintain the huge VSG repository of T. brucei, and this function has necessitated the evolution of extensive RAD51 interaction via the BRC repeats, allowing re-localization of the recombinase to general genome damage when needed

    Evaluación del tiempo de respuesta de un geoservicio utilizando una base de datos híbrida y distribuida

    Web mapping services provide information directly, not only to users but also to other software programs that can consume and produce information. One of the main challenges this type of service presents is improving its performance. Therefore, in this research, a new geoservice integrated into GeoServer was developed, called GeoToroTur, with an OWS implementation of vector layers that consumes information from a hybrid and distributed database implemented with PostgreSQL and MongoDB, making use of ToroDB for document replication. This geoservice was evaluated by executing geographic and descriptive attribute filter queries. Based on the results, we can conclude that the response time for GeoToroTur is shorter than that for GeoServer.

    Document distribution algorithm for load balancing on an extensible Web server architecture

    Access latency and load balancing are the two main issues in the design of clustered Web server architecture for achieving high performance. We propose a novel document distribution algorithm for load balancing on a cluster of distributed Web servers. We group Web pages that are likely to be accessed during a request session into a migrating unit, which is used as the basic unit of document placement. A modified binning algorithm is developed to distribute the migrating units among the Web servers to achieve load balancing. We also present a redirection mechanism, which makes use of a migrating unit's property, to reduce the cost of request redirections. The distribution of Web documents is recomputed periodically to adapt to changes in client request patterns and system configuration. Simulation results show that our solution can reduce the amount of request redirection and document migration, and that it can distribute workload properly among Web servers.
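
The abstract does not describe its modified binning algorithm, so the sketch below substitutes a plain greedy heuristic (largest unit first, always onto the currently least-loaded server) to show what distributing migrating units for load balancing can look like. The function `distribute_units` and the per-unit load figures are assumptions made for illustration.

```python
import heapq


def distribute_units(unit_loads, num_servers):
    """Assign migrating units to servers with a greedy heuristic.

    unit_loads:  {unit_id: expected_load} (hypothetical load estimates)
    num_servers: number of Web servers in the cluster
    Returns {server_index: [unit_id, ...]}.
    """
    # Min-heap of (current_total_load, server_index).
    heap = [(0.0, server) for server in range(num_servers)]
    heapq.heapify(heap)
    assignment = {server: [] for server in range(num_servers)}

    # Place the heaviest units first; ties are broken by unit id.
    for unit, load in sorted(unit_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, server = heapq.heappop(heap)  # least-loaded server so far
        assignment[server].append(unit)
        heapq.heappush(heap, (total + load, server))
    return assignment


if __name__ == "__main__":
    loads = {"u1": 5, "u2": 4, "u3": 3, "u4": 3, "u5": 1}
    print(distribute_units(loads, num_servers=2))
```

Because whole units are placed rather than single pages, pages requested in the same session stay on one server, which is the property the abstract's redirection mechanism relies on.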

    Quality of Service Issues in Internet Web Services

    Editorial special section on "Quality of Service Issues in Internet Web Services"

    Object Replication Algorithms for World Wide Web

    Object replication is a well-known technique for improving the accessibility of Web sites. It generally reduces client latencies and increases a site's availability. However, applying replication techniques is not trivial, and a large number of heuristics have been proposed to decide the number of replicas of an object and their placement in a distributed web server system. This paper presents three object placement and replication algorithms. The first two heuristics are centralized in the sense that a central site determines the number of replicas and their placement. Due to the dynamic nature of Internet traffic and the rapid change in the access pattern of the World-Wide Web, we also propose a distributed algorithm in which each site relies on locally collected information to decide which objects should be replicated at that site. The performance of the proposed algorithms is evaluated through a simulation study and compared with that of three other well-known algorithms. The simulation results demonstrate the effectiveness and superiority of the proposed algorithms.