SchemaAnalyst: Search-Based Test Data Generation for Relational Database Schemas
Data stored in relational databases plays a vital role
in many aspects of society. When this data is incorrect, the
services that depend on it may be compromised. The database
schema is the artefact responsible for maintaining the integrity
of stored data. Because of its critical function, the proper testing
of the database schema is a task of great importance. Employing
a search-based approach to generate high-quality test data for
database schemas, SchemaAnalyst is a tool that supports testing
this key software component. The presented tool is extensible
and includes both an evaluation framework for assessing the
quality of the generated tests and full-featured documentation.
In addition to describing the design and implementation of
SchemaAnalyst and overviewing its efficiency and effectiveness,
this paper coincides with the tool’s public release, thereby enhancing
practitioners’ ability to test relational database schemas.
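To make the idea of testing a schema's integrity constraints concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not SchemaAnalyst's search-based generator: the rows are written by hand, and the `account` table, its columns, and the `is_accepted` helper are hypothetical names chosen for illustration. Each test case is a sequence of rows that the schema should either accept or reject.

```python
import sqlite3

# Hypothetical schema under test; table and column names are illustrative only.
SCHEMA = """
CREATE TABLE account (
    id      INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE,
    balance INTEGER CHECK (balance >= 0)
)
"""

def is_accepted(rows):
    """Insert rows into a fresh in-memory database.

    Returns True if every row is accepted by the schema, and False as soon
    as one row violates an integrity constraint.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        for row in rows:
            conn.execute(
                "INSERT INTO account (id, email, balance) VALUES (?, ?, ?)", row
            )
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()

# Hand-written test data: one satisfying case and one violating case per constraint.
valid_case = [(1, "a@example.org", 10)]
duplicate_key_case = [(1, "a@example.org", 10), (1, "b@example.org", 5)]
duplicate_email_case = [(1, "a@example.org", 10), (2, "a@example.org", 5)]
null_email_case = [(1, None, 10)]
negative_balance_case = [(1, "a@example.org", -1)]
```

A test suite built this way passes when `is_accepted` returns True for the satisfying case and False for each violating case. A generator such as the one the abstract describes would produce such rows automatically instead of relying on hand-picked values.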
Sandboxed, Online Debugging of Production Bugs for SOA Systems
Short time-to-bug localization is extremely important for any 24x7 service-oriented application. To this end, we introduce a new debugging paradigm called live debugging. There are two goals that any live debugging infrastructure must meet: Firstly, it must offer real-time insight for bug diagnosis and localization, which is paramount when errors happen in user-facing applications. Secondly, live debugging should not impact user-facing performance for normal events. In large distributed applications, bugs which impact only a small percentage of users are common. In such scenarios, debugging a small part of the application should not impact the entire system.
With the above-stated goals in mind, this thesis presents a framework called Parikshan, which leverages user-space containers (OpenVZ) to launch application instances for the express purpose of live debugging. Parikshan is driven by a live-cloning process, which generates a replica (called the debug container) of production services, cloned from a production container that continues to provide the real output to the user. The debug container provides a sandbox environment for the safe execution of monitoring and debugging by users, without any perturbation to the execution environment. As a part of this framework, we have designed customized network proxies, which replicate inputs from clients to both the production container and the debug container, and safely discard all outputs of the latter. Together, the network duplicator and the debug container ensure both compute and network isolation of the debugging environment. We believe that this work provides the first practical real-time debugging of large multi-tier and cloud applications that requires no application downtime and has minimal performance impact.
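The input-duplication idea can be sketched in a few lines of Python. This is an in-process stand-in for a network proxy, not Parikshan's implementation: the `Duplicator` class, its method names, and the handler callables are all hypothetical. Each request is answered by the production handler, while a copy is queued for the debug replica, whose output and errors are dropped.

```python
import queue
import threading

class Duplicator:
    """Forward each request to production; replay a copy on a debug replica.

    Only the production response is returned. Whatever the debug handler
    returns or raises is discarded, so debugging cannot affect the client.
    """

    def __init__(self, production, debug):
        self._production = production
        self._debug = debug
        self._pending = queue.Queue()
        # The debug replica is fed from a background thread so that slow
        # debugging work does not delay the production response.
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def handle(self, request):
        self._pending.put(request)        # copy of the input for the replica
        return self._production(request)  # the real output seen by the user

    def _drain(self):
        while True:
            request = self._pending.get()
            try:
                self._debug(request)      # result intentionally ignored
            except Exception:
                pass                      # replica failures stay in the sandbox
            finally:
                self._pending.task_done()

    def wait_idle(self):
        """Block until the replica has processed every queued request."""
        self._pending.join()
```

A short usage example: `Duplicator(serve, instrumented_serve).handle(req)` returns what `serve(req)` returns, even if `instrumented_serve` crashes. The unbounded queue is a simplification; a real proxy would need a policy for a replica that falls behind.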
An Approach for Testing the Extract-Transform-Load Process in Data Warehouse Systems
2018 Spring. Includes bibliographical references. Enterprises use data warehouses to accumulate data from multiple sources for data analysis and research. Since organizational decisions are often made based on the data stored in a data warehouse, all its components must be rigorously tested. In this thesis, we first present a comprehensive survey of data warehouse testing approaches, and then develop and evaluate an automated testing approach for validating the Extract-Transform-Load (ETL) process, which is a common activity in data warehousing. In the survey we present a classification framework that categorizes the testing and evaluation activities applied to the different components of data warehouses. These approaches include dynamic analysis as well as static evaluation and manual inspections. The classification framework uses information related to what is tested in terms of the data warehouse component that is validated, and how it is tested in terms of various types of testing and evaluation approaches. We discuss the specific challenges and open problems for each component and propose research directions. The ETL process involves extracting data from source databases, transforming it into a form suitable for research and analysis, and loading it into a data warehouse. ETL processes can use complex one-to-one, many-to-one, and many-to-many transformations involving sources and targets that use different schemas, databases, and technologies. Since faulty implementations in any of the ETL steps can result in incorrect information in the target data warehouse, ETL processes must be thoroughly validated. In this thesis, we propose automated balancing tests that check for discrepancies between the data in the source databases and that in the target warehouse. Balancing tests ensure that the data obtained from the source databases is not lost or incorrectly modified by the ETL process.
First, we categorize and define a set of properties to be checked in balancing tests. We identify various types of discrepancies that may exist between the source and the target data, and formalize three categories of properties, namely, completeness, consistency, and syntactic validity, that must be checked during testing. Next, we automatically identify source-to-target mappings from ETL transformation rules provided in the specifications. We identify one-to-one, many-to-one, and many-to-many mappings for tables, records, and attributes involved in the ETL transformations. We automatically generate test assertions to verify the properties for balancing tests. We use the source-to-target mappings to automatically generate assertions corresponding to each property. The assertions compare the data in the target data warehouse with the corresponding data in the sources to verify the properties. We evaluate our approach on a health data warehouse that uses data sources with different data models running on different platforms. We demonstrate that our approach can find previously undetected real faults in the ETL implementation. We also provide an automatic mutation testing approach to evaluate the fault-finding ability of our balancing tests. Using mutation analysis, we demonstrate that our auto-generated assertions can detect faults in the data inside the target data warehouse when faulty ETL scripts execute on mock source data.
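As an illustration of what a balancing assertion might look like for the simplest case, a one-to-one table mapping, the following sketch checks two of the three property categories with Python's sqlite3 module. The table names, the `build_demo` helper, and the simplified property definitions are assumptions made for this example; the thesis derives its assertions automatically from mapping specifications and may define the properties differently.

```python
import sqlite3

def completeness_holds(conn, source, target):
    """Completeness (simplified): no source record is lost, judged by row count."""
    src = conn.execute(f"SELECT COUNT(*) FROM {source}").fetchone()[0]
    tgt = conn.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    return src == tgt

def consistency_holds(conn, source, target, key, attr):
    """Consistency (simplified): records with the same key carry the same value.

    SQLite's IS NOT is a NULL-safe inequality, so NULL versus NULL is a match.
    Identifiers are interpolated directly and must come from a trusted mapping.
    """
    mismatches = conn.execute(
        f"SELECT COUNT(*) FROM {source} s JOIN {target} t "
        f"ON s.{key} = t.{key} WHERE s.{attr} IS NOT t.{attr}"
    ).fetchone()[0]
    return mismatches == 0

def build_demo(fault=None):
    """Create mock source and target tables; optionally simulate a faulty ETL run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_patient (id INTEGER PRIMARY KEY, birth_year INTEGER)")
    conn.execute("CREATE TABLE dw_patient (id INTEGER PRIMARY KEY, birth_year INTEGER)")
    rows = [(1, 1980), (2, 1975), (3, None)]
    conn.executemany("INSERT INTO src_patient VALUES (?, ?)", rows)
    loaded = list(rows)
    if fault == "drop":        # the ETL silently loses a record
        loaded = loaded[:-1]
    elif fault == "corrupt":   # the ETL alters an attribute value
        loaded[1] = (2, 1957)
    conn.executemany("INSERT INTO dw_patient VALUES (?, ?)", loaded)
    return conn
```

Running both checks on `build_demo("drop")` flags only the completeness property, and on `build_demo("corrupt")` only the consistency property, which mirrors the way seeded faults on mock source data are used to judge whether the assertions can detect a faulty ETL script.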
Autonomous Recovery in Componentized Internet Applications
In this paper we show how to reduce downtime of J2EE applications by rapidly and automatically recovering from transient and intermittent software failures, without requiring application modifications. Our prototype combines three application-agnostic techniques: macroanalysis for fault detection and localization, microrebooting for rapid recovery, and external management of recovery actions. The individual techniques are autonomous and work across a wide range of componentized Internet applications, making them well-suited to the rapidly changing software of Internet services. The proposed framework has been integrated with JBoss, an open-source J2EE application server. Our prototype provides an execution platform that can automatically recover J2EE applications within seconds of the manifestation of a fault. Our system can provide a subset of a system's active end users with the illusion of continuous uptime, in spite of failures occurring behind the scenes, even when there is no functional redundancy in the system
PerfCE: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis
Debugging performance anomalies in real-world databases is challenging.
Causal inference techniques enable qualitative and quantitative root cause
analysis of performance downgrade. Nevertheless, causality analysis is
practically challenging, particularly due to limited observability. Recently,
chaos engineering has been applied to test complex real-world software systems.
Chaos frameworks like Chaos Mesh mutate a set of chaos variables to inject
catastrophic events (e.g., network slowdowns) to "stress" software systems. The
systems under chaos stress are then tested using methods like differential
testing to check if they retain their normal functionality (e.g., SQL query
output is always correct under stress). Despite its ubiquity in the industry,
chaos engineering is now employed mostly to aid software testing rather than for
performance debugging.
This paper identifies a novel use of chaos engineering in helping developers
diagnose performance anomalies in databases. Our presented framework, PERFCE,
comprises an offline phase and an online phase. The offline phase learns the
statistical models of the target database system, whilst the online phase
diagnoses the root cause of monitored performance anomalies on the fly. During
the offline phase, PERFCE leverages both passive observations and proactive
chaos experiments to construct accurate causal graphs and structural equation
models (SEMs). When observing performance anomalies during the online phase,
causal graphs enable qualitative root cause identification (e.g., high CPU
usage) and SEMs enable quantitative counterfactual analysis (e.g., determining
"when CPU usage is reduced to 45%, performance returns to normal"). PERFCE
notably outperforms prior works on common synthetic datasets, and our
evaluation on real-world databases, MySQL and TiDB, shows that PERFCE is highly
accurate and moderately expensive.
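The counterfactual question quoted in the abstract can be illustrated with the simplest possible structural equation, a single linear relation latency = a * cpu + b + u, where u is an unobserved noise term. This is a toy sketch, not PERFCE's model: the variable names, the data, and the one-cause linear form are assumptions made for the example. The counterfactual keeps the noise inferred from the observation fixed and changes only the cause.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def counterfactual_latency(slope, intercept, observed_cpu, observed_latency, new_cpu):
    """Answer: what would latency have been if CPU usage had been new_cpu?

    Step 1 (abduction): recover the noise term from the observation.
    Step 2 (action): set the cause to its counterfactual value.
    Step 3 (prediction): re-evaluate the equation with the same noise.
    """
    noise = observed_latency - (slope * observed_cpu + intercept)
    return slope * new_cpu + intercept + noise

# Hypothetical offline-phase measurements: CPU usage (%) and query latency (ms).
cpu_samples = [10, 20, 30, 40]
latency_samples = [25, 45, 65, 85]
slope, intercept = fit_linear(cpu_samples, latency_samples)

# Hypothetical anomaly seen online: 90% CPU with 190 ms latency.
predicted = counterfactual_latency(slope, intercept, 90, 190, 45)
```

With these made-up numbers the fitted equation is latency = 2 * cpu + 5, and the counterfactual latency at 45% CPU is 100 ms. Whether that counts as "normal" would depend on a threshold that the monitoring setup defines.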
Performance Rubrics for Robustness Evaluation of Web Mutation Operators
Web applications are the predominant medium through which both business enterprises and the service sector establish and maintain their online presence. Robustness is therefore essential for seamless interaction with customers and for a sustainable business. Intruders and unethical hackers keep trying to gain unauthorized access to web applications, so it is all the more necessary for a web application to resist such attacks. The strength of a web application indirectly builds customer confidence, which leads to repeat business and attracts new customers, supporting profitability over the long run. Once a web application gains credibility, it is bound to run successfully. In the current work, an attempt has been made to assess the robustness of mutation operators used to test web applications. A few rubrics are proposed to ascertain the strength of the projected mutation operators, which are verified on some sample open-source web applications. The functional attributes of a web application are the functionalities it offers. The non-functional attributes of a typical web application are security, performance, and availability. Here, web applications are challenged against the aforementioned non-functional attributes using rubrics such as uniformity, uniqueness, reliability, unpredictability, and entropy. A comprehensive analysis of the robustness of the projected web mutation operators against the designed and formulated rubrics has been made.
Leveraging Distributed Tracing and Container Cloning for Replay Debugging of Microservices
Microservice architectures have gained prominence in recent years for building large-scale industrial distributed systems. However, microservice architectures make replay debugging, a powerful technique for finding root causes of faults, very challenging because of polyglot services (written in several languages), the large accumulated state of services, and the tight latency limits imposed by long hop-chains. This work attempts to provide a framework for enabling replay debugging in production microservice applications. We study 25 real-world faults in microservice systems collected from diverse sources, categorize these faults by fault symptoms, and create 15 application-agnostic mutation operators for microservices. We then propose a language-agnostic replay debugging framework for microservice applications that uses a distributed tracing system to record network requests and enables replay of those requests on cloned service containers running in a debug environment. A key component of this framework is an anomaly detector that uses span-level and container-level monitoring to detect fault symptoms found in our study and localizes faults to the trace level so that faulty traces can be easily replayed to find the root cause. An open-source microservices application injected successively with the mutation operators is used for an evaluation, which shows that our framework is up to an order of magnitude lighter-weight than language-specific recording tools such as Chrome DevTools or VisualVM and can help in finding root causes of 9 out of 15 mutations at a line or function level.