63 research outputs found

    Big Data Testing Techniques: Taxonomy, Challenges and Future Trends

    Big Data is reforming many industrial domains by providing decision support through analyzing large data volumes. Big Data testing aims to ensure that Big Data systems run smoothly and error-free while maintaining the performance and quality of data. However, because of the diversity and complexity of data, testing Big Data is challenging. Though numerous research efforts deal with Big Data testing, a comprehensive review of the testing techniques and challenges of Big Data is not yet available. Therefore, we have systematically reviewed the evidence on Big Data testing techniques published in the period 2010-2021. This paper discusses testing data processing by highlighting the techniques used in every processing phase. Furthermore, we discuss the challenges and future directions. Our findings show that diverse functional, non-functional and combined (functional and non-functional) testing techniques have been used to solve specific problems related to Big Data. At the same time, most of the testing challenges have been faced during the MapReduce validation phase. In addition, the combinatorial testing technique is one of the most applied techniques in combination with other techniques (i.e., random testing, mutation testing, input space partitioning and equivalence testing) to find various functional faults through Big Data testing. Comment: 32 pages

    Scaling Genetic Algorithms to Large Distributed Datasets

    Analysing large-scale data brings promises of new levels of scientific discovery and economic value. However, the fact that such a volume of data is by its nature distributed and the need for new computational methods to be effective in the face of significant changes in data complexity and size has led to the need to develop large-scale data analytics. Genetic algorithms (GAs) have proven their flexibility in many application areas, and substantial research has been dedicated to improving their performance through parallelisation. In contrast with most previous efforts, we reject approaches based on the centralisation of data in the main memory of a single node or requiring remote access to shared/distributed memory. We focus instead on scenarios where data is partitioned across machines. In this partitioned scenario, we explore two parallelisation models: PDMS, inspired by the traditional master-slave model, and PDMD, based on island models. We adopt the two models to distribute BioHEL, a popular large-scale single-node GA classifier, using the Spark distributed data processing platform. We investigate the effect of GA control parameters (population size and migration frequency). We study the accuracy, time performance and scalability of the proposed models. Our results show that our distributed genetic algorithm design provides a good tradeoff between accuracy and time. We then extend the two models using automatic termination and population sizing to enhance the ease of use of the distributed genetic algorithm. Moreover, after testing this strategy on both models, we show that the applied automation offers a promising enhancement to the performance of the initially designed GA models.

    PV System Design and Performance

    Photovoltaic solar energy technology (PV) has been developing rapidly in the past decades, leading to a multi-billion-dollar global market. It is of paramount importance that PV systems function properly, which requires the generation of expected energy both for small-scale systems that consist of a few solar modules and for very large-scale systems containing millions of modules. This book increases the understanding of the issues relevant to PV system design and correlated performance; moreover, it contains research from scholars across the globe in the fields of data analysis and data mapping for the optimal performance of PV systems, fault analysis, various causes of energy loss, and design and integration issues. The chapters in this book demonstrate the importance of designing and properly monitoring photovoltaic systems in the field in order to ensure continued good performance.

    Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland

    Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015

    On the Combination of Game-Theoretic Learning and Multi Model Adaptive Filters

    This paper casts coordination of a team of robots within the framework of game-theoretic learning algorithms. In particular, a novel variant of fictitious play is proposed, by considering multi-model adaptive filters as a method to estimate other players’ strategies. The proposed algorithm can be used as a coordination mechanism between players when they should take decisions under uncertainty. Each player chooses an action after taking into account the actions of the other players and also the uncertainty. Uncertainty can occur either in terms of noisy observations or various types of other players. In addition, in contrast to other game-theoretic and heuristic algorithms for distributed optimisation, it is not necessary to find the optimal parameters a priori. Various parameter values can be used initially as inputs to different models. Therefore, the resulting decisions will be aggregate results of all the parameter values. Simulations are used to test the performance of the proposed methodology against other game-theoretic learning algorithms.

    Advances in Evolutionary Algorithms

    With the recent trends towards massive data sets and significant computational power, combined with evolutionary algorithmic advances, evolutionary computation is becoming much more relevant to practice. The aim of the book is to present recent improvements, innovative ideas and concepts in a part of the huge EA field.

    Targeting C-terminal binding proteins (CtBPs) using genetic selection

    There are many protein-protein interactions that are vital for cellular processes such as signal transduction, structural organisation and apoptosis. In this study we decipher the role of the protein-protein interaction of C-terminal Binding Proteins (CtBPs). CtBPs function as transcriptional co-repressors in the nucleus, playing key roles in tumorigenesis and metastasis by regulating cellular processes critical to cell survival, cell migration and senescence. CtBP proteins also play a role in the cytoplasm in regulating mitotic Golgi membrane fission. Studies in which the expression or function of CtBPs has been inhibited have independently identified roles for CtBPs in both suppressing apoptosis and promoting cell cycle progression. Modulation of these interactions with small molecules is a potential therapeutic strategy with benefits over current methods. Our approach to studying protein-protein interactions and uncovering potential inhibitors involves constructing a bacterial Reverse Two Hybrid System (RTHS) linking the dimerisation of the target protein partners to the expression of reporter genes, whose regulation can be monitored via host survival. Subsequent screening of a cyclic peptide library for potential inhibitors was then carried out. The libraries were produced using Split Intein-mediated Circular Ligation Of Peptides and Proteins (SICLOPPS) technology, developed for intracellular synthesis of cyclic peptides. We have used this methodology to identify inhibitors of CtBP dimerisation and better understand the roles of this protein interaction in cell cycle regulation. Chapter 1 provides an introduction to the work carried out to study protein-protein interactions and find potential inhibitors. Since our investigations involved the extensive use of the RTHS and SICLOPPS systems, the background and work performed by others have been described in detail. A detailed review of CtBPs has also been carried out.
    Chapter 2 details our work investigating the homodimeric and heterodimeric protein-protein interaction of CtBPs using the RTHS. This work allowed us to optimise selection conditions and find cyclic peptide inhibitors of the homodimerisation of CtBP1 and CtBP2 using the SICLOPPS process. The synthesis of these inhibitors is described. Chapter 3 details our work carried out to develop ELISAs for in vitro analysis of the selected cyclic peptides. This involved the purification of His- and GST-tagged CtBP1 and CtBP2 proteins. The ELISA conditions were optimised to carry out CtBP homodimeric and heterodimeric analysis. This work showed that the peptides lead to a reduction in CtBP homodimerisation and heterodimerisation in vitro. Chapter 4 details the in vivo effects of the uncovered CtBP dimerisation inhibitors. Using these cyclic peptide inhibitors we have demonstrated that CtBP dimerisation is essential for the regulation of mitotic fidelity, and that inhibition of CtBP dimerisation by the cyclic peptides leads to aberrant segregation of chromosomes during mitosis. We have also shown that inhibition of CtBP dimerisation leads to a reduction in migration of MCF-7 breast cancer cells. Chapter 5 details the experimental procedures used in this work and presents spectroscopic and analytical data for the compounds prepared.

    Toxicological profile for tetrachloroethylene (PERC)

    cdc:26478. CAS#: 127-18-4. A Toxicological Profile for Tetrachloroethylene, Draft for Public Comment was released in October 2014. This edition supersedes any previously released draft or final profile. CS274127-A. tp18.pdf. 2019. 642