Performance impact of web services on Internet servers
While traditional Internet servers mainly served static and later also dynamic content, the popularity of Web services is increasing rapidly. Web services incur additional overhead compared to traditional web interaction. This overhead increases the demand on Internet servers, which is of particular importance when the request rate to the server is high. We conduct experiments showing that the overhead imposed by Web services is non-negligible during server overload. In our experiments, the response time for Web services is more than 30% higher, and the server throughput more than 25% lower, compared to traditional web interaction using dynamically created HTML pages.
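The abstract attributes the extra cost of Web services to the additional request processing they require. A hedged micro-benchmark sketch (not the paper's setup; the payloads and handler names are ours) of where that overhead comes from: a SOAP request must be XML-parsed before dispatch, while a dynamic-HTML response is only string formatting.

```python
# Illustrative only: contrast the per-request work of a Web-service handler
# (XML envelope parsing) with a plain dynamic-HTML handler.
import timeit
import xml.etree.ElementTree as ET

SOAP = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><getQuote><symbol>IBM</symbol></getQuote></soap:Body>'
    '</soap:Envelope>'
)

def handle_soap():
    # A Web-service request must be parsed before it can be dispatched.
    root = ET.fromstring(SOAP)
    return root[0][0][0].text            # Envelope -> Body -> getQuote -> symbol

def handle_html():
    # A dynamic-HTML request only formats a response string.
    return "<html><body>Quote for %s</body></html>" % "IBM"

soap_t = timeit.timeit(handle_soap, number=10_000)
html_t = timeit.timeit(handle_html, number=10_000)
print(f"SOAP parse: {soap_t:.4f}s  plain HTML: {html_t:.4f}s")
```

Under overload this per-request parsing cost is paid on every request, which is consistent with the higher response times the abstract reports.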
Investigation of Web Server Latency Reduction (BADANIA REDUKCJI OPÓŹNIEŃ SERWERA WWW)
This paper investigates the characteristics of web server response delay in order to understand and analyze optimization techniques for reducing latency. The latency behaviour of the multi-process Apache HTTP server was analyzed for different thread counts and various workloads. It was shown that an insufficient number of threads used by the server to handle concurrent client requests is responsible for increased latency under various loads. The problem can be solved by a modified web server configuration that reduces the response time.
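The finding that too few worker threads inflates latency can be illustrated with a standard queueing model; this is a hedged sketch using an M/M/c approximation (our choice of model and parameters, not the paper's measurements).

```python
# M/M/c model: mean response time for Poisson arrivals at rate lam,
# per-thread service rate mu, and c worker threads (Erlang-C formula).
from math import factorial

def mm_c_response_time(lam, mu, c):
    """Mean response time (waiting + service) for an M/M/c queue."""
    rho = lam / (c * mu)
    assert rho < 1, "server would be overloaded"
    a = lam / mu
    p0 = 1.0 / (sum(a**k / factorial(k) for k in range(c))
                + a**c / (factorial(c) * (1 - rho)))
    p_wait = a**c / (factorial(c) * (1 - rho)) * p0     # Erlang-C: P(wait > 0)
    return p_wait / (c * mu - lam) + 1.0 / mu           # mean wait + service

# 90 req/s offered load, each thread serving 10 req/s:
# latency falls sharply as the thread count grows past the bare minimum.
for threads in (10, 12, 16, 32):
    print(threads, round(mm_c_response_time(90, 10, threads), 4))
```

With 10 threads the system runs at 90% utilization and queueing delay dominates; at 32 threads the response time approaches the bare 0.1 s service time, mirroring the paper's observation that raising the thread count reduces latency.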
Performance modeling of Web servers using Network Calculus theory (Modelagem de desempenho de servidores Web empregando a teoria Network Calculus)
Advisor: Cristina Duarte Murta. Master's dissertation, Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defended: Curitiba, 2005. Includes bibliography.
Abstract: Network Calculus is a theory that models the performance of queueing systems. It provides deterministic bounds on performance when the input flows obey certain restrictions. This work describes the application of this theory to the performance modeling of Web servers. The performance of Web servers is vital to the success of many organizations. Managing, evaluating and modeling the performance of Web servers are basic tasks in providing efficient and trustworthy access, especially in the case of popular and occasionally overloaded servers. Different techniques are available for the performance evaluation of a system; each presents its own possibilities, advantages and limitations, and is applicable in different contexts and at different costs. Applying a new theory to the performance modeling of a system such as Web servers is therefore an important task. To demonstrate the application of the theory and its results to Web servers, the functions that describe the results of Network Calculus, which are expressed in min-plus algebra, were implemented and tested with access traces of several Web servers. The main results of the theory, namely the delay bound, the backlog bound and the output flow bound, were obtained and are presented for the servers considered. Other aspects discussed are the comparison with operational analysis, the application of the theory to model admission control, and the computational complexity of the implemented functions.
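The three bounds the abstract names have simple closed forms for the textbook case of a token-bucket arrival curve α(t) = σ + ρt served under a rate-latency service curve β(t) = R·max(t − T, 0). A hedged numeric sketch (parameter names and values are ours, not the dissertation's):

```python
# Standard Network Calculus bounds for alpha(t) = sigma + rho*t and
# beta(t) = R * max(t - T, 0): delay bound (horizontal deviation),
# backlog bound (vertical deviation), and output envelope
# (min-plus deconvolution alpha (/) beta).

def nc_bounds(sigma, rho, R, T):
    assert rho <= R, "stability requires the service rate to cover the arrival rate"
    delay_bound = T + sigma / R        # max horizontal distance alpha -> beta
    backlog_bound = sigma + rho * T    # max vertical distance alpha - beta
    output_envelope = lambda t: sigma + rho * (t + T)   # (alpha (/) beta)(t)
    return delay_bound, backlog_bound, output_envelope

# e.g. burst 5 KB, sustained rate 100 KB/s, server rate 500 KB/s, latency 10 ms
d, b, out = nc_bounds(sigma=5.0, rho=100.0, R=500.0, T=0.01)
print(f"delay <= {d*1000:.1f} ms, backlog <= {b:.1f} KB, out(1s) <= {out(1.0):.1f} KB")
```

These are exactly the quantities the dissertation evaluates against real server access traces; the general theory computes them via min-plus convolution for arbitrary curves, of which this is the simplest closed-form instance.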
CheriOS: Designing an untrusted single-address-space capability operating system utilising capability hardware and a minimal hypervisor
This thesis presents the design, implementation, and evaluation of a novel capability operating system: CheriOS. The guiding motivation behind CheriOS is to provide strong security guarantees to programmers, even allowing them to continue to program in fast, but typically unsafe, languages such as C. Furthermore, it does this in the presence of an extremely strong adversarial model: in CheriOS, every compartment -- and even the operating system itself -- is considered actively malicious. Building on top of the architecturally enforced capabilities offered by the CHERI microprocessor, I show that only a few more capability types and enforcement checks are required to provide a strong compartmentalisation model that can facilitate mutual distrust. I implement these new primitives in software, in a new abstraction layer I dub the nanokernel. Among the new OS primitives I introduce are a Reservation, for integrity and confidentiality (which allows allocating private memory without trusting the allocator), and a Foundation, which provides attestation about the state of the system (a key to sign and protect capabilities based on a signature of the starting state of a program). I show that, using these new facilities, it is possible to design an operating system without having to trust that its implementation is correct.
CheriOS is fundamentally fail-safe: apart from the CHERI processor and the nanokernel, it makes no assumptions about the behaviour of the system that could be broken. Using CHERI and the new nanokernel primitives, programmers can expect full isolation at scopes ranging from a whole program to a single function, and not just with respect to other programs but to the system itself. Programs compiled for and run on CheriOS offer full memory safety, both spatial and temporal, enforced control-flow integrity between compartments, and protection against common vulnerabilities such as buffer overflows, code injection and Return-Oriented-Programming attacks. I achieve this by designing a new CHERI-based ABI (Application Binary Interface) which includes a novel stack structure that offers temporal safety. I evaluate how practical the new designs are by prototyping them and offering a detailed performance evaluation. I also contrast with existing offerings from both industry and academia.
CHERI capabilities can be used to restrict access to system resources, such as memory, with the required dynamic checks being performed by hardware in parallel with normal operation. Using the accelerating features of CHERI, I show that many of the security guarantees that CheriOS offers can come at little to no cost. I present a novel and secure IO/IPC layer that allows secure marshalling of multiple data streams through mutually distrusting compartments, with fine-grained authenticated access control for endpoints, and without either copying or encryption. For example, CheriOS can restrict its TCP stack from having access to packet contents, or restrict an open socket to ensure that data sent on it arrives at an endpoint signed as a TLS implementation. Even with added security requirements, CheriOS can perform well on real workloads. I showcase this by running a state-of-the-art webserver, NGINX, atop both CheriOS and FreeBSD and show improvements in performance ranging from 3x to 6x when running on a small-scale low-power FPGA implementation of CHERI-MIPS.
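The core idea -- a capability carries bounds and permissions, every access is checked, and derived capabilities can only shrink rights -- can be modelled in a few lines. This is a purely illustrative software model; the field names and layout are ours and do not reflect CHERI's actual capability encoding or its hardware checks.

```python
# Toy model of capability-mediated memory access: bounds and permission
# checks on every load, and monotonic derivation (rights can only shrink).
from dataclasses import dataclass

MEMORY = bytearray(256)          # stand-in for physical memory

@dataclass(frozen=True)
class Capability:
    base: int
    length: int
    perms: frozenset             # e.g. {"load", "store"}

    def restrict(self, base, length, perms):
        # Monotonicity: a derived capability must lie within this one
        # and may not gain permissions.
        assert self.base <= base and base + length <= self.base + self.length
        assert frozenset(perms) <= self.perms
        return Capability(base, length, frozenset(perms))

    def load(self, offset):
        # The checks CHERI performs in hardware, modelled in software.
        assert "load" in self.perms and 0 <= offset < self.length
        return MEMORY[self.base + offset]

root = Capability(0, len(MEMORY), frozenset({"load", "store"}))
packet = root.restrict(16, 64, {"load"})     # e.g. a read-only packet window
MEMORY[20] = 0x2A
print(packet.load(4))                        # -> 42
```

Out-of-bounds loads, or any attempt to derive a wider capability, fail the checks; this is the shape of the guarantee that lets CheriOS hand a compartment a view of a data stream without granting access to the rest of memory.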
TCP Connection Management Mechanisms for Improving Internet Server Performance
This thesis investigates TCP connection management mechanisms in order to understand the behaviour and improve the performance of Internet servers during overload conditions such as flash crowds. We study several alternatives for implementing TCP connection establishment, reviewing approaches taken by existing TCP stacks as well as proposing new mechanisms to improve server throughput and reduce client response times under overload. We implement some of these connection establishment mechanisms in the Linux TCP stack and evaluate their performance in a variety of environments. We also evaluate the cost of supporting half-closed connections at the server and assess the impact of an abortive release of connections by clients on the throughput of an overloaded server. Our evaluation demonstrates that connection establishment mechanisms that eliminate the TCP-level retransmission of connection attempts by clients increase server throughput by up to 40% and reduce client response times by two orders of magnitude. Connection termination mechanisms that preclude support for half-closed connections additionally improve server throughput by up to 18%.
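The two termination behaviours the thesis measures can be demonstrated with the standard sockets API. A hedged sketch (loopback sockets, not the thesis's kernel modifications): a half-close via `shutdown(SHUT_WR)` lets the client stop sending while still reading the response, and setting `SO_LINGER` with a zero timeout makes `close()` an abortive release that sends RST instead of a FIN handshake.

```python
# Half-closed connection vs. abortive release, over a loopback TCP pair.
import socket
import struct

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.create_connection(srv.getsockname())
conn, _ = srv.accept()

# Half-close: the client shuts its write side but can still read.
# Supporting this state is the cost the thesis measures on the server.
cli.shutdown(socket.SHUT_WR)
conn.sendall(b"bye")
reply = cli.recv(16)             # arrives despite the client's half-close

# Abortive release: SO_LINGER with l_onoff=1, l_linger=0 makes close()
# discard unsent data and send RST rather than performing a FIN exchange.
conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
conn.close()
cli.close()
srv.close()
print(reply)
```

An overloaded server receiving RSTs can reclaim connection state immediately, which is why client-side abortive release affects server throughput in the thesis's experiments.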
Performance Comparison of Uniprocessor and Multiprocessor Web Server Architectures
This thesis examines web-server architectures for static workloads on both uniprocessor and multiprocessor systems to determine the key factors affecting their performance. The architectures examined are event-driven (userver) and pipeline (WatPipe). As well, a thread-per-connection (Knot) architecture is examined for the uniprocessor system. Various workloads are tested to determine their effect on the performance of the servers. Significant effort is made to ensure a fair comparison among the servers. For example, all the servers are implemented in C or C++, and support sendfile and edge-triggered epoll.
The existing servers, Knot and userver, are extended as necessary, and the new pipeline-server, WatPipe, is implemented using userver as its initial code base. Each web server is also tuned to determine its best configuration for a specific workload, which is shown to be critical to achieve best server performance. Finally, the server experiments are verified to ensure each is performing within reasonable standards.
The performance of the various architectures is examined on a uniprocessor system. Three workloads are examined: no disk-I/O, moderate disk-I/O and heavy disk-I/O. These three workloads highlight the differences among the architectures. As expected, the experiments show the amount of disk I/O is the most significant factor in determining throughput, and once there is memory pressure, the memory footprint of the server is the crucial performance factor. The peak throughput differs by only 9-13% among the best servers of each architecture across the various workloads. Furthermore, the appropriate configuration parameters for best performance varied based on workload, and no single server performed the best for all workloads. The results show the event-driven and pipeline servers have equivalent throughput when there is moderate or no disk-I/O. The only difference is during the heavy disk-I/O experiments where WatPipe's smaller memory footprint for its blocking server gave it a performance advantage. The Knot server has 9% lower throughput for no disk-I/O and moderate disk-I/O and 13% lower for heavy disk-I/O, showing the extra overheads incurred by thread-per-connection servers, but still having performance close to the other server architectures.
An unexpected result is that blocking sockets with sendfile outperforms non-blocking sockets with sendfile when there is heavy disk-I/O because of more efficient disk access.
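The event-driven architecture the comparison centres on can be sketched minimally: one thread multiplexes all connections through a readiness notification mechanism. This is a hedch-free-of-claims illustration, not the userver code; Python's `selectors` module wraps epoll on Linux, the mechanism the thesis's servers use.

```python
# Minimal single-threaded event-driven server: one selector loop accepts
# connections and serves each request, in the style of userver.
import selectors
import socket
import threading

sel = selectors.DefaultSelector()      # epoll on Linux

def serve(lsock, n_requests):
    lsock.setblocking(False)
    sel.register(lsock, selectors.EVENT_READ, data=None)
    served = 0
    while served < n_requests:
        for key, _ in sel.select(timeout=1):
            if key.data is None:                     # listening socket: accept
                conn, _ = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data=b"")
            else:                                    # connection: reply, close
                key.fileobj.recv(1024)
                key.fileobj.sendall(b"HTTP/1.0 200 OK\r\n\r\nhi")
                sel.unregister(key.fileobj)
                key.fileobj.close()
                served += 1
    sel.unregister(lsock)

lsock = socket.socket()
lsock.bind(("127.0.0.1", 0))
lsock.listen(8)
t = threading.Thread(target=serve, args=(lsock, 1))
t.start()

cli = socket.create_connection(lsock.getsockname())
cli.sendall(b"GET / HTTP/1.0\r\n\r\n")
resp = cli.recv(1024)
cli.close()
t.join()
lsock.close()
print(resp)
```

A thread-per-connection server like Knot instead dedicates a (user-level) thread to each connection; the thesis attributes Knot's 9-13% throughput gap to the per-thread overheads this design incurs.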
Next, the performance of the various architectures is examined on a multiprocessor system. Knot is excluded from the experiments as its underlying thread library, Capriccio, only supports uniprocessor execution. For these experiments, it is shown that partitioning the system so that server processes, subnets and requests are handled by the same CPU is necessary to achieve high throughput. Both N-copy and new hybrid versions of the uniprocessor servers, extended to support partitioning, are tested. While the N-copy servers performed the best, new hybrid versions of the servers also performed well.
These hybrid servers have throughput within 2% of the N-copy servers but offer benefits over N-copy such as a smaller memory footprint and a shared address-space.
For multiprocessor systems, it is shown that once the system becomes disk bound, the throughput of the servers is drastically reduced. To maximize performance on a multiprocessor, high disk throughput and ample memory are essential.
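The partitioning described above -- keeping each server process, its subnet, and its requests on one CPU -- is implemented on Linux with CPU affinity. A hedged sketch of the mechanism only (the function name is ours; `os.sched_setaffinity` is Linux-specific):

```python
# Pin the calling process to a single CPU, as an N-copy or partitioned
# hybrid server would pin each worker to keep request handling core-local.
import os

def pin_to_cpu(cpu):
    """Restrict the calling process to one CPU and return its new mask."""
    os.sched_setaffinity(0, {cpu})          # 0 = calling process
    return os.sched_getaffinity(0)

original = os.sched_getaffinity(0)          # remember the full CPU mask
affinity = pin_to_cpu(min(original))        # pin to the lowest available CPU
os.sched_setaffinity(0, original)           # restore for the rest of the run
print(affinity)
```

In a real deployment each worker would also bind its listening socket to the subnet routed to its CPU, so that interrupt handling, protocol processing, and the server process all stay on the same core.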
Performance Issues in WWW Servers
This paper evaluates performance issues in WWW servers on UNIX-style platforms. While other work has focused on reducing the use of kernel primitives, we consider ways in which the operating system and the network protocol stack can improve support for high-performance WWW servers. We study techniques in three categories: new socket functions, per-byte optimizations, and per-connection optimizations. We examine two proposed socket functions, acceptex() and sendfile(), comparing sendfile()'s effectiveness with an mmap()/writev() combination. We show how sendfile() provides the necessary semantic support to eliminate copies and checksums in the kernel, and quantify the utility of the function's header and close options. We also present mechanisms to reduce the number of packets exchanged in an HTTP transaction, both increasing server performance and reducing network utilization, without compromising interoperability. We evaluate these issues with a high-performance WWW server, using IBM AIX …
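The two transmission paths the paper compares are both reachable from user space. A hedged sketch (our file and socket setup, not the paper's AIX server): with mmap() the file contents pass through the process before being written to the socket, while sendfile() asks the kernel to move file pages to the socket directly, avoiding the user-space copy.

```python
# Contrast mmap+send (data crosses user space) with os.sendfile
# (in-kernel transfer) over a local socket pair.
import mmap
import os
import socket
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 4096)
    path = f.name

a, b = socket.socketpair()
fd = os.open(path, os.O_RDONLY)

# Path 1: mmap + send -- the bytes are copied into user space and back.
with mmap.mmap(fd, 0, prot=mmap.PROT_READ) as m:
    a.sendall(m[:1024])

# Path 2: sendfile -- the kernel moves file data to the socket, the copy
# (and, with hardware support, checksum) elimination the paper quantifies.
os.sendfile(a.fileno(), fd, 0, 1024)

received = b.recv(2048, socket.MSG_WAITALL)
os.close(fd)
a.close(); b.close(); os.unlink(path)
print(len(received))
```

The paper's header and close options fold the HTTP response header transmission and connection teardown into the same call, saving further packets per transaction; plain os.sendfile does not expose those options on Linux.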