
    A Structured Cloud-Based Software Testing Model with a Case Study Implementation

    Cloud-based testing methodologies are gaining significant popularity and adoption in the software testing industry. Cloud-based testing offers several advantages, such as scalability, flexibility, cost-effectiveness, and access to a wide range of testing tools and environments without extensive infrastructure setup. However, because existing cloud testing methods each address specific purposes, they face challenges with testing priority, practical use cases, performance, lengthy test times, integration and streamlining, data security, and more. To address these challenges, a structured testing model designed for the cloud environment is needed. This article proposes a new structured cloud-based testing model for enhancing testing services in the cloud environment. The proposed model addresses the order and priority of testing, data security, and performance by using Smoke and Sanity testing methods.
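
    The idea of ordering test execution so that a fast Smoke phase gates a deeper Sanity phase can be shown with a minimal sketch. The Python example below is purely illustrative and not the paper's implementation; the health-check URL and test names are hypothetical.

```python
# Illustrative sketch only: ordering cloud test runs so that fast Smoke checks
# gate the slower Sanity checks. The service URL and tests are hypothetical.
import urllib.request

SERVICE_URL = "https://example.cloud-app.test/health"  # hypothetical endpoint

def smoke_service_reachable() -> bool:
    """Smoke test: is the deployed build reachable at all?"""
    try:
        with urllib.request.urlopen(SERVICE_URL, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False

def sanity_core_workflow() -> bool:
    """Sanity test: does a core workflow still behave after the latest change?"""
    # Placeholder for a narrow end-to-end check (login, create record, read it back).
    return True

def run_prioritized_suite() -> None:
    # Smoke runs first; if the build is broken, the costlier sanity phase is skipped.
    if not smoke_service_reachable():
        print("Smoke failed: rejecting build, sanity phase skipped")
        return
    print("Smoke passed")
    print("Sanity passed" if sanity_core_workflow() else "Sanity failed")

if __name__ == "__main__":
    run_prioritized_suite()
```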

    Cross-Layer Cloud Performance Monitoring, Analysis and Recovery

    The basic idea of Cloud computing is to offer software and hardware resources as services. These services are provided at different layers: Software (Software as a Service: SaaS), Platform (Platform as a Service: PaaS) and Infrastructure (Infrastructure as a Service: IaaS). In such a complex environment, performance issues are the norm rather than the exception and may occur at any layer. Thus, it is necessary to monitor all Cloud layers and analyze their performance parameters to detect and rectify related problems. This thesis presents a novel cross-layer reactive performance monitoring approach for Cloud computing environments, called CEP4Cloud, based on the methodology of Complex Event Processing (CEP). It analyzes monitored events to detect performance-related problems and performs actions to fix them. The proposal is based on (1) a novel multi-layer monitoring approach, (2) a new cross-layer analysis approach and (3) a novel recovery approach. The monitoring approach operates at all Cloud layers while collecting related parameters. It makes use of existing monitoring tools together with a new monitoring approach for Cloud services at the SaaS layer, called AOP4CSM, which is based on aspect-oriented programming and monitors quality-of-service parameters of the SaaS layer in a non-invasive manner: AOP4CSM modifies neither the server implementation nor the client implementation. The cross-layer analysis approach, called D-CEP4CMA, builds on CEP. Instead of manually specifying continuous queries on monitored event streams, CEP queries are derived by analyzing the correlations between monitored metrics across multiple Cloud layers. The results of the correlation analysis reduce the number of monitored parameters and enable a root cause analysis to identify the causes of performance-related problems. The derived analysis rules are implemented as queries in a CEP engine. D-CEP4CMA is designed to dynamically switch between different centralized and distributed CEP architectures depending on the load/memory of the CEP machine and the network traffic conditions in the observed Cloud environment. The recovery approach is based on a novel action manager framework that applies recovery actions at all Cloud layers: it assigns a set of repair actions to each performance-related problem and checks the success of the applied action. The results of several experiments illustrate the merits of the reactive performance monitoring approach and its main components (i.e., monitoring, analysis and recovery). First, experimental results show the efficiency of AOP4CSM (very low overhead). Second, the obtained results demonstrate the benefits of the analysis approach in terms of precision and recall compared to threshold-based methods, as well as its accuracy in identifying the causes of performance-related problems. Furthermore, experiments illustrate the efficiency of D-CEP4CMA and its precision and recall compared to purely centralized and distributed CEP architectures. Moreover, experimental results indicate that the time needed to fix a performance-related problem is reasonably short and that the CPU overhead of using CEP4Cloud is negligible. Finally, experimental results demonstrate the merits of CEP4Cloud in speeding up repairs and reducing the number of triggered alarms compared to baseline methods.
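
    To make the CEP-style cross-layer analysis concrete, here is a toy sliding-window correlation over a stream of monitored events. It is only a sketch in the spirit of the approach, not D-CEP4CMA itself; the layer names, metrics, and thresholds are hypothetical.

```python
# Illustrative sketch only: a toy continuous query over a cross-layer event
# stream. Layers, metrics, and thresholds are made up for illustration.
from collections import deque
from dataclasses import dataclass

@dataclass
class MetricEvent:
    layer: str    # "SaaS", "PaaS", or "IaaS"
    metric: str   # e.g. "response_time_ms", "cpu_util"
    value: float

WINDOW = deque(maxlen=50)  # sliding window of recent monitored events

def latest(layer: str, metric: str):
    for ev in reversed(WINDOW):
        if ev.layer == layer and ev.metric == metric:
            return ev.value
    return None

def diagnose() -> None:
    # Correlate layers: slow SaaS responses plus saturated IaaS CPU suggest the
    # root cause sits at the infrastructure layer, so a VM-level action is tried first.
    rt = latest("SaaS", "response_time_ms")
    cpu = latest("IaaS", "cpu_util")
    if rt is not None and cpu is not None and rt > 500 and cpu > 0.9:
        print("Root cause candidate: IaaS CPU saturation -> trigger recovery action")

def ingest(event: MetricEvent) -> None:
    WINDOW.append(event)
    diagnose()

ingest(MetricEvent("IaaS", "cpu_util", 0.95))
ingest(MetricEvent("SaaS", "response_time_ms", 820.0))
```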

    Chaos Engineering for Microservices

    Chaos engineering is a relatively new concept that is growing in popularity because it helps companies be more resilient in the face of unexpected networking or software failures. The idea behind chaos engineering is that by creating controlled failures, you can discover where your system is weak and fix those weaknesses before something happens in your production environment. This research focuses on microservices, which are small pieces of code that perform specific tasks on behalf of a larger application. Microservices are often hosted on different servers and run by different teams, which makes them more fragile than monolithic applications; they also tend to be written in different languages, which makes them more difficult to understand and test for bugs. The goal of this study was to determine whether microservices can be made more resilient through chaos engineering; specifically, whether it is possible to find out what kinds of failures occur most often and how long they take to resolve.
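
    A controlled chaos experiment of this kind typically verifies a steady-state hypothesis, injects a fault, and measures the time to recovery. The sketch below is a generic illustration, not the study's tooling; the probe URL and the fault-injection hook are hypothetical placeholders.

```python
# Illustrative sketch only: skeleton of a controlled chaos experiment on a
# microservice. The probe URL and fault injection hook are placeholders.
import time
import urllib.request

PROBE_URL = "http://orders.example.local/health"  # hypothetical microservice probe

def steady_state_ok() -> bool:
    """Hypothesis: the service answers its health probe within 2 seconds."""
    try:
        with urllib.request.urlopen(PROBE_URL, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

def inject_failure() -> None:
    """Placeholder: e.g. stop the service's container or add network latency."""
    pass

def run_experiment() -> None:
    assert steady_state_ok(), "steady state must hold before injecting faults"
    inject_failure()
    start = time.monotonic()
    while not steady_state_ok():          # wait for self-healing / failover
        time.sleep(1)
    print(f"Time to recover: {time.monotonic() - start:.1f}s")

if __name__ == "__main__":
    run_experiment()
```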

    Analyzing challenging aspects of IPv6 over IPv4

    The exponential expansion of the Internet has exhausted the IPv4 addresses provided by IANA. The new IP version, IPv6, was introduced by the IETF with new features such as a simplified packet header, a larger address space, a different addressing scheme, improved encryption, powerful segment routing, and stronger QoS. ISPs are slowly seeking to migrate from current IPv4 physical networks to new-generation IPv6 networks. The move from existing IPv4 networks to IPv6 is very sluggish, since billions of computers across the globe use IPv4 addresses. The configuration and behavior of the IPv4 and IPv6 protocols are distinct, and direct communication between IPv4 and IPv6 is not feasible. Because of these incompatibility problems, both protocols will have to co-exist during a transition period lasting several years. Compatibility, interoperability, and stability are key concerns between the IPv4 and IPv6 protocols. Migrating a network to IPv6 also raises several issues for ISPs; the key challenges they face are packet traversal, routing scalability, performance reliability, and security. In this study, we provide a detailed analysis of all of the aforementioned issues encountered when switching to an IPv6 network.
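
    During the coexistence period, hosts typically have to handle both address families. The sketch below is a generic dual-stack connection helper that prefers IPv6 and falls back to IPv4; it is an illustration of coexistence handling, not something taken from the study.

```python
# Illustrative sketch only: try IPv6 addresses first, then fall back to IPv4.
import socket

def connect_dual_stack(host: str, port: int) -> socket.socket:
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Prefer AF_INET6 results, then AF_INET ones.
    infos.sort(key=lambda info: 0 if info[0] == socket.AF_INET6 else 1)
    last_err = None
    for family, socktype, proto, _, addr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(3)
            sock.connect(addr)
            return sock
        except OSError as err:
            last_err = err
    raise OSError(f"could not reach {host}:{port}") from last_err

if __name__ == "__main__":
    conn = connect_dual_stack("www.example.com", 80)
    print("connected over", "IPv6" if conn.family == socket.AF_INET6 else "IPv4")
    conn.close()
```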

    Dependability Benchmarking of Network Function Virtualization

    Network Function Virtualization (NFV) is an emerging networking paradigm that aims to reduce costs and time-to-market, improve manageability, and foster competition and innovative services. NFV exploits virtualization and cloud computing technologies to turn physical network functions into Virtualized Network Functions (VNFs), which are implemented in software and run as Virtual Machines (VMs) on commodity hardware located in high-performance data centers, namely Network Function Virtualization Infrastructures (NFVIs). The NFV paradigm relies on cloud computing and virtualization technologies to provide carrier-grade services, i.e., services that are highly reliable and available, with fast and automatic failure recovery mechanisms. The availability of many virtualization solutions for NFV poses the question of which virtualization technology should be adopted in order to fulfill these requirements. Currently, there are limited solutions for analyzing, in quantitative terms, the performance and reliability trade-offs, which are important concerns for the adoption of NFV. This thesis deals with the assessment of the reliability and performance of NFV systems. It proposes a methodology, including context, measures, and faultloads, for conducting dependability benchmarks in NFV according to the general principles of dependability benchmarking. To this aim, a fault injection framework has been designed and implemented for the virtualization technologies used as case studies in this thesis. This framework is successfully used to conduct an extensive experimental campaign comparing two candidate virtualization technologies for NFV adoption: the commercial, hypervisor-based virtualization platform VMware vSphere, and the open-source, container-based virtualization platform Docker. These technologies are assessed in the context of a high-availability, NFV-oriented IP Multimedia Subsystem (IMS). The analysis of the experimental results reveals that i) fault management mechanisms are crucial in NFV in order to provide accurate failure detection and start the subsequent failover actions, and ii) fault injection proves to be a valuable way to introduce uncommon scenarios in the NFVI, which can be fundamental to providing a highly reliable service in production.
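
    As an illustration of the kind of container-level fault injection such a benchmark relies on, the sketch below kills a VNF container and measures how long the service stays unavailable. It is a generic example, not the thesis's framework; the container name and health URL are hypothetical, and it assumes the Docker CLI is available.

```python
# Illustrative sketch only: crash-fault injection against a containerized VNF.
import subprocess
import time
import urllib.request

VNF_CONTAINER = "ims-bono"                       # hypothetical VNF container
HEALTH_URL = "http://localhost:8080/health"      # hypothetical service probe

def service_up() -> bool:
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

def inject_crash_fault() -> float:
    """Kill the VNF container and measure how long the service stays degraded."""
    subprocess.run(["docker", "kill", VNF_CONTAINER], check=True)
    start = time.monotonic()
    while not service_up():   # failover / restart mechanisms should bring it back
        time.sleep(0.5)
    return time.monotonic() - start

if __name__ == "__main__":
    print(f"Service unavailable for {inject_crash_fault():.1f}s after crash fault")
```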

    A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud

    High Performance Computing (HPC) systems have been widely used by scientists and researchers in both industry and university laboratories to solve advanced computation problems. Most advanced computation problems are either data-intensive or computation-intensive and may take hours, days or even weeks to complete; for example, some traditional HPC computations run on 100,000 processors for weeks. Consequently, traditional HPC systems often require huge capital investments, and scientists and researchers sometimes have to wait in long queues to access shared, expensive HPC systems. Cloud computing, on the other hand, offers new computing paradigms, capacity, and flexible solutions for both business and HPC applications. Some of the computation-intensive applications that are usually executed on traditional HPC systems can now be executed in the cloud, and the cloud computing price model eliminates huge up-front capital investments. However, even for cloud-based HPC systems, fault tolerance is still an issue of growing concern. The large number of virtual machines and electronic components, as well as software complexity and overall system reliability, availability and serviceability (RAS), are factors with which HPC systems in the cloud must contend. The reactive fault tolerance approach of checkpoint/restart, which is commonly used in HPC systems, does not scale well in the cloud due to resource sharing and distributed system networks. Hence, the need for reliable, fault-tolerant HPC systems is even greater in a cloud environment. In this thesis we present a proactive fault tolerance approach for HPC systems in the cloud that reduces wall-clock execution time, as well as dollar cost, in the presence of hardware failures. We have developed a generic fault tolerance algorithm for HPC systems in the cloud, and we have further developed a cost model for executing computation-intensive applications on HPC systems in the cloud. Our experimental results, obtained from a real cloud execution environment, show that the wall-clock execution time and cost of running computation-intensive applications in the cloud can be considerably reduced compared to the checkpoint and redundancy techniques used in traditional HPC systems.
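
    The trade-off between reactive checkpoint/restart and a proactive migrate-before-failure policy can be sketched with a toy wall-clock and dollar-cost comparison. All of the parameters and formulas below are made up for illustration; this is not the thesis's cost model.

```python
# Illustrative sketch only: toy comparison of reactive checkpoint/restart
# versus a proactive migration policy for a cloud HPC run. Numbers are invented.

def checkpoint_restart_time(work_h, ckpt_interval_h, ckpt_cost_h, failures):
    # Time = useful work + checkpoint overhead + re-done work after each failure.
    checkpoints = work_h / ckpt_interval_h
    rework = failures * (ckpt_interval_h / 2)      # on average half an interval lost
    return work_h + checkpoints * ckpt_cost_h + rework

def proactive_time(work_h, migration_cost_h, predicted_failures):
    # Time = useful work + one VM migration per predicted hardware failure.
    return work_h + predicted_failures * migration_cost_h

if __name__ == "__main__":
    price_per_hour = 3.2   # hypothetical hourly price for the instance pool
    reactive = checkpoint_restart_time(work_h=100, ckpt_interval_h=4,
                                       ckpt_cost_h=0.25, failures=3)
    proactive = proactive_time(work_h=100, migration_cost_h=0.1,
                               predicted_failures=3)
    for name, hours in (("checkpoint/restart", reactive), ("proactive", proactive)):
        print(f"{name}: {hours:.1f} h, ${hours * price_per_hour:.2f}")
```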

    Achieving network resiliency using sound theoretical and practical methods

    Computer networks have revolutionized the life of every citizen in our modern interconnected society. The impact of networked systems spans every aspect of our lives, from financial transactions to healthcare and critical services, making these systems an attractive target for malicious entities that aim to make financial or political profit. Specifically, the past decade has witnessed an astounding increase in the number and complexity of sophisticated and targeted attacks, known as advanced persistent threats (APT). Those attacks led to a paradigm shift in the security and reliability communities' perspective on system design; researchers and government agencies accepted the inevitability of incidents and malicious attacks, and marshaled their efforts into the design of resilient systems. Rather than focusing solely on preventing failures and attacks, resilient systems are able to maintain an acceptable level of operation in the presence of such incidents, and then recover gracefully into normal operation. Alongside prevention, resilient system design focuses on incident detection as well as timely response. Unfortunately, the resiliency efforts of research and industry experts have been hindered by an apparent schism between theory and practice, which allows attackers to maintain the upper hand. This lack of compatibility between the theory and practice of system design is attributed to the following challenges. First, theoreticians often make impractical and unjustifiable assumptions that allow for mathematical tractability while sacrificing accuracy. Second, the security and reliability communities often lack clear definitions of success criteria when comparing different system models and designs. Third, system designers often make implicit or unstated assumptions to favor practicality and ease of design. Finally, resilient systems are tested in private and isolated environments where validation and reproducibility of the results are not publicly accessible. In this thesis, we set about showing that the proper synergy between theoretical analysis and practical design can enhance the resiliency of networked systems. We illustrate the benefits of this synergy by presenting resiliency approaches that target the inter- and intra-networking levels. At the inter-networking level, we present CPuzzle as a means to protect the Transmission Control Protocol (TCP) connection establishment channel from state-exhaustion distributed denial of service (DDoS) attacks. CPuzzle leverages client puzzles to limit the rate at which misbehaving users can establish TCP connections. We modeled the problem of determining the puzzle difficulty as a Stackelberg game and solved for the equilibrium strategy that balances the users' utilities against CPuzzle's resilience capabilities. Furthermore, to handle volumetric DDoS attacks, we extend CPuzzle and implement Midgard, a cooperative approach that involves end-users in the process of tolerating and neutralizing DDoS attacks. Midgard is a middlebox that resides at the edge of an Internet service provider's network and uses client puzzles at the IP level to allocate bandwidth to its users. At the intra-networking level, we present sShield, a game-theoretic network response engine that manipulates a network's connectivity in response to an attacker who is moving laterally to compromise a high-value asset. To implement such decision-making algorithms, we leverage recent advances in software-defined networking (SDN) to collect logs and security alerts about the network and implement response actions. However, the programmability offered by SDN comes with an increased chance of design-time bugs that can have drastic consequences on the reliability and security of a networked system. We therefore introduce BiFrost, an open-source tool that aims to verify safety and security properties of data-plane programs. BiFrost translates data-plane programs into functionally equivalent sequential circuits, and then uses well-established hardware reduction, abstraction, and verification techniques to establish correctness proofs about data-plane programs. By focusing on these four key efforts, CPuzzle, Midgard, sShield, and BiFrost, we believe that this work illustrates the benefits that the synergy between theory and practice can bring to the world of resilient system design. This thesis is an attempt to pave the way for further cooperation and coordination between theoreticians and practitioners, in the hope of designing resilient networked systems.
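
    The client-puzzle idea underlying CPuzzle can be illustrated with a generic hash-based puzzle: the protected endpoint hands out a nonce and a difficulty, and a client must find a value whose hash carries enough leading zero bits before its connection attempt is accepted. The sketch below shows only that generic building block; the Stackelberg-based difficulty selection and the TCP integration from the thesis are not modeled.

```python
# Illustrative sketch only: a generic hash-based client puzzle.
import hashlib
import os

def leading_zero_bits(digest: bytes) -> int:
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        return bits + (8 - byte.bit_length())
    return bits

def issue_puzzle(difficulty_bits: int):
    return os.urandom(16), difficulty_bits        # (nonce, difficulty)

def solve_puzzle(nonce: bytes, difficulty_bits: int) -> int:
    answer = 0
    while True:
        digest = hashlib.sha256(nonce + answer.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return answer
        answer += 1

def verify(nonce: bytes, difficulty_bits: int, answer: int) -> bool:
    digest = hashlib.sha256(nonce + answer.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits

if __name__ == "__main__":
    nonce, bits = issue_puzzle(difficulty_bits=16)
    solution = solve_puzzle(nonce, bits)
    print("puzzle solved:", verify(nonce, bits, solution))
```

    Raising the difficulty makes each connection attempt proportionally more expensive for the client while verification stays cheap, which is the lever a defender can tune against misbehaving users.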