
    An Approach for Testing the Extract-Transform-Load Process in Data Warehouse Systems

    2018 Spring. Includes bibliographical references. Enterprises use data warehouses to accumulate data from multiple sources for data analysis and research. Since organizational decisions are often made based on the data stored in a data warehouse, all its components must be rigorously tested. In this thesis, we first present a comprehensive survey of data warehouse testing approaches, and then develop and evaluate an automated testing approach for validating the Extract-Transform-Load (ETL) process, which is a common activity in data warehousing. In the survey we present a classification framework that categorizes the testing and evaluation activities applied to the different components of data warehouses. These approaches include dynamic analysis as well as static evaluation and manual inspections. The classification framework uses information about what is tested, in terms of the data warehouse component that is validated, and how it is tested, in terms of the various types of testing and evaluation approaches. We discuss the specific challenges and open problems for each component and propose research directions. The ETL process involves extracting data from source databases, transforming it into a form suitable for research and analysis, and loading it into a data warehouse. ETL processes can use complex one-to-one, many-to-one, and many-to-many transformations involving sources and targets that use different schemas, databases, and technologies. Since a faulty implementation of any of the ETL steps can result in incorrect information in the target data warehouse, ETL processes must be thoroughly validated. In this thesis, we propose automated balancing tests that check for discrepancies between the data in the source databases and that in the target warehouse. Balancing tests ensure that the data obtained from the source databases is not lost or incorrectly modified by the ETL process.
First, we categorize and define a set of properties to be checked in balancing tests. We identify various types of discrepancies that may exist between the source and the target data, and formalize three categories of properties, namely completeness, consistency, and syntactic validity, that must be checked during testing. Next, we automatically identify source-to-target mappings from the ETL transformation rules provided in the specifications. We identify one-to-one, many-to-one, and many-to-many mappings for the tables, records, and attributes involved in the ETL transformations. We then automatically generate test assertions to verify the properties for balancing tests: using the source-to-target mappings, we generate assertions corresponding to each property, and these assertions compare the data in the target data warehouse with the corresponding data in the sources. We evaluate our approach on a health data warehouse that uses data sources with different data models running on different platforms. We demonstrate that our approach can find previously undetected real faults in the ETL implementation. We also provide an automatic mutation testing approach to evaluate the fault-finding ability of our balancing tests. Using mutation analysis, we demonstrate that our auto-generated assertions can detect faults in the data inside the target data warehouse when faulty ETL scripts execute on mock source data.
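The three balancing-test properties described above can be sketched as follows. This is an illustrative outline only, not the thesis's actual implementation; the table contents, key, and column names are invented for the example:

```python
# Illustrative sketch of "balancing test" checks between a source table and
# the target warehouse. All data, keys, and column names are invented.

def check_completeness(source_rows, target_rows, key):
    """Completeness: every source record appears in the target (no data lost).
    Returns the set of source keys missing from the target."""
    source_keys = {r[key] for r in source_rows}
    target_keys = {r[key] for r in target_rows}
    return source_keys - target_keys

def check_consistency(source_rows, target_rows, key, attr, transform):
    """Consistency: target attribute values agree with the source values
    after applying the ETL transformation rule. Returns offending keys."""
    target_by_key = {r[key]: r for r in target_rows}
    return [r[key] for r in source_rows
            if transform(r[attr]) != target_by_key.get(r[key], {}).get(attr)]

def check_syntactic_validity(target_rows, attr, predicate):
    """Syntactic validity: target values conform to the expected format."""
    return [r for r in target_rows if not predicate(r[attr])]

# Mock source data and a target where one record was lost by the ETL step,
# with a hypothetical uppercasing transformation rule.
source = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
target = [{"id": 1, "name": "ALICE"}]

missing = check_completeness(source, target, "id")          # {2}
inconsistent = check_consistency(source, target, "id", "name", str.upper)
malformed = check_syntactic_validity(target, "name", str.isupper)
```

In this spirit, each source-to-target mapping would yield one generated assertion per property; a real harness would run such comparisons as queries against the source databases and the warehouse rather than on in-memory rows.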

    Optimization Of Fuzzy Logic Controllers With Genetic Algorithm For Two-Part-Type And Re-Entrant Production Systems

    Improving the performance of production control systems is an important problem to which many past studies have been dedicated. The applicability of fuzzy logic controllers (FLCs) in production control systems has been shown in the literature, and genetic algorithms (GAs) have been used to optimize FLC performance; this combination is referred to as a genetic fuzzy logic controller (GFLC). The GFLC methodology is used to develop two production control architectures, named “genetic distributed fuzzy” (GDF) and “genetic supervisory fuzzy” (GSF) controllers. These control architectures have previously been applied to single-part-type production systems. In their new application, the GDF and GSF controllers are developed to control multi-part-type and re-entrant production systems, in which the production control system determines both the priority of production and the production rate for each part type. A genetic algorithm is developed to tune the membership functions (MFs) of the input variables of the GDF and GSF controllers. The objective of the GSF controller is to minimize the overall production cost based on work-in-process (WIP) and backlog costs, while surplus minimization is the objective of the GDF controller. The GA module is programmed in MATLAB®, and the performance of each GDF or GSF controller in controlling the production system model is evaluated using Simulink®; the performance indices are used as chromosome-ranking criteria. The optimized GDF and GSF controllers can be used in real implementations. The GDF and GSF controllers are evaluated on two test cases, namely a “two-part-type production line” and a “re-entrant production system”, and the results are compared with two heuristic controllers, namely the “heuristic distributed fuzzy” (HDF) and “heuristic supervisory fuzzy” (HSF) controllers. The results show that the GDF and GSF controllers can improve the performance of the production system.
In the GSF control architecture, the WIP level is reduced by 30% compared to the HSF controller. Moreover, the overall production cost is reduced by 30% in most of the test cases. The GDF controllers show their ability to reduce the backlog level, but the production cost of the GDF controller is generally greater than that of the GSF controller.
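As an illustration of the GA-based tuning loop described above (not the thesis's MATLAB®/Simulink® implementation), a minimal genetic algorithm with truncation selection, one-point crossover, and gene-level mutation might look like this. The chromosome stands for a few membership-function parameters, and the fitness function is a stand-in: the real objective would be a simulated production cost (WIP plus backlog), which is not reproduced here:

```python
import random

def fitness(chromosome):
    # Stand-in objective: score closeness to an arbitrary parameter target.
    # The thesis instead ranks chromosomes by simulated production cost.
    target = [0.2, 0.5, 0.8]
    return -sum((g - t) ** 2 for g, t in zip(chromosome, target))

def evolve(pop_size=20, genes=3, generations=50, mutation_rate=0.1):
    # Random initial population of MF-parameter vectors in [0, 1).
    pop = [[random.random() for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]             # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genes)       # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mutation_rate:    # gene-level mutation
                child[random.randrange(genes)] = random.random()
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

random.seed(0)  # for reproducibility of the sketch
best = evolve()
```

Because the parents are carried into the next generation, the best chromosome never degrades; in the thesis, evaluating `fitness` would mean running the Simulink® production-system model for each candidate controller.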

    Cortical development: Cdk5 gets into sticky situations

    Cyclin-dependent kinase 5 (Cdk5) is much more than its name implies; it plays a role in neuronal migration, neurite outgrowth and degeneration. Recent evidence suggests that Cdk5 regulates neuronal adhesion and cytoskeletal dynamics.

    Microwave processing of meat


    Preparation and physicochemical characterization of meloxicam orally fast disintegration tablet using its solid dispersion

    Meloxicam (MLX) is a non-steroidal anti-inflammatory drug prescribed in the treatment of rheumatoid arthritis and osteoarthritis. MLX is practically insoluble in water and exhibits a slow onset of action. In this study, MLX solid dispersions (MLX SDs) were prepared to improve the water solubility of this poorly water-soluble drug, and orally disintegrating tablets (ODTs) of MLX were then developed from the SDs to shorten the drug's onset of action. MLX and crospovidone in different ratios were melted in poloxamer 188 as a hydrophilic carrier. The optimum SD, with the highest saturation solubility in water (13.09±0.34 µg/mL), consisted of MLX:poloxamer 188:crospovidone in the ratio of 1:2:0 and was used for the preparation of the MLX ODTs. The MLX ODTs were prepared by the direct compression method and optimized by a 2³ factorial design. The effects of the superdisintegrant concentration, the mannitol-Avicel ratio, and the level of compression force on the disintegration time, hardness, and percentage of MLX dissolved from the ODTs after 30 min were evaluated. DSC and XRD analyses confirmed the amorphous form of MLX in the SDs. The optimized ODT formulation, containing 10% superdisintegrant and mannitol and Avicel in the ratio of 4:1, was compressed using a high level of compression force. The optimized ODT showed a hardness of 34.37±2.1 N and a friability of 1.26±0.04%. This formulation showed rapid disintegration in 12.66±2.5 seconds, with 82.66±5.1% of the MLX released within 30 min. MLX ODTs prepared from MLX SD could be introduced as a suitable dosage form of MLX with improved solubility and a faster onset of action.
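The 2³ full factorial design mentioned above simply enumerates every combination of two levels for each of the three formulation factors, giving eight experimental runs. A sketch, with assumed, purely illustrative factor levels (the study's actual low/high settings are not all stated in the abstract):

```python
from itertools import product

# Three two-level factors of the 2^3 design; level values are assumptions
# for illustration, except the stated 10% superdisintegrant and 4:1 ratio.
factors = {
    "superdisintegrant_pct": [5, 10],
    "mannitol_avicel_ratio": ["1:1", "4:1"],
    "compression_force": ["low", "high"],
}

# Full factorial: the Cartesian product of the factor levels, 2*2*2 = 8 runs.
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]
```

Each run would then be prepared and measured for disintegration time, hardness, and percentage dissolved at 30 min, allowing main and interaction effects to be estimated.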