8 research outputs found

    Web Workload Generation According to the UniLoG Approach

    Generating synthetic loads which are sufficiently close to reality represents an important and challenging task in performance and quality-of-service (QoS) evaluations of computer networks and distributed systems. Here, the load to be generated represents sequences of requests at a well-defined service interface within a network node. The paper presents a tool (UniLoG.HTTP) which can be used in a flexible manner to generate realistic and representative server and network loads, in terms of access requests to Web servers as well as creation of typical Web traffic within a communication network. The paper describes the architecture of this load generator and the critical design decisions and solution approaches which allowed us to obtain the desired flexibility.
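
    As a rough illustration of the kind of interface-level load generation the abstract describes (not the UniLoG.HTTP tool itself), the following sketch issues HTTP requests with exponentially distributed inter-request times. The target URL, request rate, and request count are illustrative assumptions.

```python
# Minimal sketch of synthetic load generation at an HTTP service
# interface (illustrative only; not the UniLoG.HTTP tool).
# The target URL and request rate below are assumptions for the example.
import random
import time
import urllib.request

TARGET_URL = "http://example.com/"   # hypothetical Web server under test
REQUEST_RATE = 2.0                   # mean requests per second (assumed)
NUM_REQUESTS = 10

def generate_load():
    for i in range(NUM_REQUESTS):
        # Exponentially distributed inter-request times approximate a
        # Poisson arrival process, one common abstract load model.
        time.sleep(random.expovariate(REQUEST_RATE))
        start = time.monotonic()
        with urllib.request.urlopen(TARGET_URL) as resp:
            resp.read()
        print(f"request {i}: status={resp.status} "
              f"latency={time.monotonic() - start:.3f}s")

if __name__ == "__main__":
    generate_load()
```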

    Global evaluation of CDNs performance using PlanetLab

    Since their introduction to the market, Content Distribution Networks (CDNs) have grown in importance because of the "instantaneous" page loads that today's web users expect. Thanks to increased access speeds, especially in the last mile with technologies such as xDSL, HFC, and FTTH, loading times have been reduced. However, the "instantaneity" users want could not be achieved without techniques such as caching and content distribution via CDNs. These techniques aim to avoid fetching web objects from the origin web server, especially "heavy" objects such as multimedia files. A CDN provides not only a clever way of distributing content globally, but also a way of preventing problems such as "flash crowd events". Such situations can cause large monetary losses because they attack the bottleneck introduced by clustering servers to achieve scalability. The leading CDN provider is Akamai, and one of the most important decisions a CDN must make is which of the available servers is the best one from which a given user should fetch a specific web object. This best-server selection relies on a DNS-based technique whose objective is to map the request to the IP address of the best available server in terms of latency. This project presents a global performance evaluation of Akamai's server selection technique using tools such as PlanetLab and Httperf. Different tests were run to compare the results obtained from globally distributed users and to identify the areas where Akamai performs well. To determine this, the results obtained with Akamai were also compared against a web page served without a CDN. Finally, a linear correlation between the measured latencies and the number of hops was identified.
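
    The following sketch illustrates, under stated assumptions, the style of measurement described above: the CDN's DNS-based server selection is observed by resolving a hostname and timing a fetch, and a Pearson correlation between latencies and hop counts is computed. The hostname and the latency/hop samples are placeholders, not measured data; a real study would collect them with PlanetLab vantage points, Httperf, and traceroute.

```python
# Hedged sketch of the kind of measurement described above: resolve a
# CDN-served hostname (DNS-based server selection), time an HTTP fetch,
# and correlate latency with hop count. The hostname and the hop counts
# are illustrative assumptions, not measured data.
import socket
import time
import urllib.request

HOSTNAME = "www.example.com"   # hypothetical CDN-accelerated site

def resolve_and_time(hostname):
    # The DNS answer is where a CDN such as Akamai steers the client
    # toward the server it considers "best" for this resolver/location.
    _, _, addresses = socket.gethostbyname_ex(hostname)
    start = time.monotonic()
    with urllib.request.urlopen(f"http://{hostname}/") as resp:
        resp.read()
    return addresses, time.monotonic() - start

def pearson(xs, ys):
    # Plain Pearson correlation coefficient.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

if __name__ == "__main__":
    ips, latency = resolve_and_time(HOSTNAME)
    print(f"{HOSTNAME} -> {ips}, latency {latency:.3f}s")
    # Example correlation between per-vantage-point latencies (seconds)
    # and hop counts -- placeholder numbers standing in for real samples.
    latencies = [0.05, 0.12, 0.30, 0.45]
    hops = [6, 10, 15, 18]
    print("latency/hops correlation:", round(pearson(latencies, hops), 3))
```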

    Impact of Location on Content Delivery

    The increasing number of users as well as their demand for more and richer content has led to exponential growth of Internet traffic for more than 15 years. In addition, new applications and use cases have changed the type of traffic. For example, social networking enables users to publish their own content. This user-generated content is often published on popular sites such as YouTube, Twitter, and Facebook. Other examples are the interactive and multimedia offerings of content providers, e.g., Google Maps or IPTV services. With the introduction of peer-to-peer (P2P) protocols in 1998 an even more radical change emerged, because P2P protocols allow users to exchange large amounts of content directly: the peers transfer data without the need for an intermediary, often centralized, server. However, as shown by recent studies, Internet traffic is again dominated by HTTP, mostly at the expense of P2P. This traffic growth increases the demands on the infrastructure components that form the Internet, e.g., servers and routers. Moreover, most of the traffic is generated by a few very popular services. The enormous demand for such popular content cannot be satisfied by the traditional hosting model, in which content is located on a single server. Instead, content providers need to scale up their delivery infrastructure, e.g., by replicating it in large data centers or by buying service from content delivery infrastructures such as Akamai or Limelight. Moreover, content providers are not the only ones that have to cope with the demand: the network infrastructure also needs to be upgraded constantly to keep up with the growing demand for content. In this thesis we characterize the impact of content delivery on the network. We utilize data sets from both active and passive measurements, which allows us to cover a wide range of abstraction levels, from a detailed protocol-level view of several content delivery mechanisms to the high-level picture of identifying and mapping the content infrastructures that host the most popular content. We find that caching content is still hard and that the user's choice of DNS resolver has a profound impact on the server selection mechanism of content distribution infrastructures. We propose Web content cartography to infer how content distribution infrastructures are deployed and what the roles of different organizations in the Internet are. We conclude by putting our findings in the context of contemporary work and give recommendations on how to improve content delivery to all parties involved: users, Internet service providers, and content distribution infrastructures.
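
    One of the thesis's findings, that the choice of DNS resolver influences the server selection of content distribution infrastructures, can be illustrated with a small sketch that queries the same hostname through different public resolvers and compares the A records returned. This is not the thesis's measurement code; it assumes the third-party dnspython package, and the hostname and resolver addresses are examples.

```python
# Illustrative sketch: query the same CDN-hosted name through different
# public resolvers and compare the A records returned. Requires the
# third-party dnspython package; hostname and resolver IPs are assumptions.
import dns.resolver  # pip install dnspython

HOSTNAME = "www.example.com"          # hypothetical CDN-served site
RESOLVERS = {
    "Google": "8.8.8.8",
    "Cloudflare": "1.1.1.1",
}

def lookup(hostname, resolver_ip):
    res = dns.resolver.Resolver(configure=False)
    res.nameservers = [resolver_ip]
    answer = res.resolve(hostname, "A")
    return sorted(rr.address for rr in answer)

if __name__ == "__main__":
    for name, ip in RESOLVERS.items():
        # A CDN's DNS-based mapping typically tailors the answer to the
        # resolver's location, so the sets of IPs may differ per resolver.
        print(f"{name} ({ip}): {lookup(HOSTNAME, ip)}")
```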

    Content consistency for web-based information retrieval

    Master's thesis, Master of Science

    Optimization in Web Caching: Cache Management, Capacity Planning, and Content Naming

    Caching is fundamental to performance in distributed information retrieval systems such as the World Wide Web. This thesis introduces novel techniques for optimizing performance and cost-effectiveness in Web cache hierarchies. When requests are served by nearby caches rather than distant servers, server loads and network traffic decrease and transactions are faster. Cache system design and management, however, face extraordinary challenges in loosely-organized environments like the Web, where the many components involved in content creation, transport, and consumption are owned and administered by different entities. Such environments call for decentralized algorithms in which stakeholders act on local information and private preferences. In this thesis I consider problems of optimally designing new Web cache hierarchies and optimizing existing ones. The methods I introduce span the Web from point of content creation to point of consumption: I quantify the impact of content-naming practices on cache performance; present techniques for variable-quality-of-service cache management; describe how a decentralized algorithm can compute economically-optimal cache sizes in a branching two-level cache hierarchy; and introduce a new protocol extension that eliminates redundant data transfers and allows “dynamic” content to be cached consistently. To evaluate several of my new methods, I conducted trace-driven simulations on an unprecedented scale. This in turn required novel workload measurement methods and efficient new characterization and simulation techniques. The performance benefits of my proposed protocol extension are evaluated using two extraordinarily large and detailed workload traces collected in a traditional corporate network environment and an unconventional thin-client system. My empirical research follows a simple but powerful paradigm: measure on a large scale an important production environment’s exogenous workload; identify performance bounds inherent in the workload, independent of the system currently serving it; identify gaps between actual and potential performance in the environment under study; and finally devise ways to close these gaps through component modifications or through improved inter-component integration. This approach may be applicable to a wide range of Web services as they mature.
    Ph.D., Computer Science and Engineering, University of Michigan
    http://deepblue.lib.umich.edu/bitstream/2027.42/90029/1/kelly-optimization_web_caching.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/90029/2/kelly-optimization_web_caching.ps.bz
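
    A minimal sketch of a trace-driven cache simulation in the spirit of the evaluation described above (not the thesis's simulator): it replays a request trace through an LRU cache of fixed byte capacity and reports the hit rate. The trace and capacity are tiny illustrative assumptions rather than the multi-million-request workload traces used in the thesis.

```python
# Minimal trace-driven LRU cache simulation (illustrative sketch only).
# The request trace and cache capacity below are made-up assumptions.
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay a trace of (url, size) requests through an LRU cache of the
    given byte capacity and return the request hit rate."""
    cache = OrderedDict()           # url -> size, kept in LRU order
    used = 0
    hits = 0
    for url, size in trace:
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
            continue
        # Miss: evict least recently used objects until the object fits.
        while used + size > capacity and cache:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        if size <= capacity:
            cache[url] = size
            used += size
    return hits / len(trace)

if __name__ == "__main__":
    trace = [("/a", 100), ("/b", 200), ("/a", 100), ("/c", 300),
             ("/a", 100), ("/b", 200)]
    print("hit rate:", lru_hit_rate(trace, capacity=450))
```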

    Modeling and acceleration of content delivery in world wide web

    Ph.D., Doctor of Philosophy