
    Big Data Testing Techniques: Taxonomy, Challenges and Future Trends

    Big Data is transforming many industrial domains by providing decision support through the analysis of large data volumes. Big Data testing aims to ensure that Big Data systems run smoothly and error-free while maintaining performance and data quality. However, because of the diversity and complexity of the data, testing Big Data systems is challenging. Though numerous research efforts deal with Big Data testing, no comprehensive review addressing Big Data testing techniques and challenges has been available to date. Therefore, we systematically reviewed the evidence on Big Data testing techniques published in the period 2010-2021. This paper discusses the testing of data processing by highlighting the techniques used in each processing phase. Furthermore, we discuss the challenges and future directions. Our findings show that diverse functional, non-functional and combined (functional and non-functional) testing techniques have been used to solve specific problems related to Big Data. At the same time, most testing challenges arise during the MapReduce validation phase. In addition, combinatorial testing is one of the most frequently applied techniques, often in combination with others (i.e., random testing, mutation testing, input space partitioning and equivalence testing), to find various functional faults through Big Data testing.
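The combinatorial testing the survey highlights can be illustrated with a minimal sketch: enumerate combinations of input parameters for a data-processing job and exercise the system under test with each one. The parameter names and the `process` function below are illustrative assumptions, not taken from any surveyed paper.

```python
from itertools import product

# Hypothetical configuration parameters for a data-processing job
# (names are illustrative, not from the surveyed papers).
formats = ["csv", "json"]
compressions = [None, "gzip"]
encodings = ["utf-8", "latin-1"]

def process(fmt, compression, encoding):
    # Stand-in for the system under test: returns a descriptor of the run.
    return f"{fmt}/{compression}/{encoding}"

# Exhaustive combinatorial coverage: run the job once per combination.
results = {combo: process(*combo) for combo in product(formats, compressions, encodings)}
assert len(results) == 8  # 2 * 2 * 2 configurations exercised
```

In practice, pairwise (2-way) covering arrays are used instead of the full Cartesian product to keep the number of test runs manageable as the parameter count grows.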

    Data Science and Ethical Issues: Between Knowledge Gain and Ethical Responsibility

    Despite the numerous possibilities and advantages of data science for solving complex problems and gaining new insights, the appropriate way of using and analyzing data, especially in today’s technologically dependent society, continues to face ethical questions and challenges. Although ethics in relation to computer science has been a topic of discussion since the 1950s, the topic has only recently joined the data science debate. Nonetheless, an overall consensus or a common conceptual framework for ethics in data science does not yet exist. In particular, privacy rights, data validity, and algorithm fairness in the areas of Big Data, Artificial Intelligence, and Machine Learning are the most important ethical challenges in need of a more thorough investigation. Thus, this chapter contributes to the overall discussion by providing an overview of current ethical challenges that are crucial not only for data science in general but also for the tourism industry in the future.

    Towards Ex Vivo Testing of MapReduce Applications

    2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), 25-29 July 2017, Prague (Czech Republic)
    Big Data programs are those that process large data volumes exceeding the capabilities of traditional technologies. Among newly proposed processing models, MapReduce stands out as it allows the analysis of schema-less data in large distributed environments with frequent infrastructure failures. Functional faults in MapReduce are hard to detect in a testing/preproduction environment due to its distributed characteristics. We propose an automatic test framework implementing a novel testing approach called Ex Vivo. The framework employs data from production but executes the tests in a laboratory to avoid side-effects on the application. Faults are detected automatically, without human intervention, by checking whether the same data would generate different outputs under different infrastructure configurations. The framework (MrExist) is validated with a real-world program. MrExist can identify a fault in a few seconds; the program can then be stopped, not only avoiding an incorrect output but also saving the money, time and energy of production resources.
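The differential idea behind Ex Vivo testing can be sketched in miniature: replay the same data through a simulated MapReduce job under different infrastructure configurations (here, partition counts) and flag any divergence in output. This is a simplified illustration of the concept, not MrExist's actual implementation; all function names are assumptions.

```python
from collections import defaultdict

def map_phase(records):
    # Word-count style mapper: emit (word, 1) pairs.
    for line in records:
        for word in line.split():
            yield word, 1

def run_job(records, n_partitions, reducer):
    # Partition intermediate pairs by key hash, then reduce each
    # partition independently, mimicking a distributed shuffle.
    partitions = [defaultdict(list) for _ in range(n_partitions)]
    for key, value in map_phase(records):
        partitions[hash(key) % n_partitions][key].append(value)
    output = {}
    for part in partitions:
        for key, values in part.items():
            output[key] = reducer(key, values)
    return output

def correct_reducer(key, values):
    return sum(values)

data = ["big data big tests", "data data tests"]

# Differential check in the spirit of Ex Vivo: the same (production) data
# is run under several configurations; any mismatch signals a functional
# fault, e.g. a reducer whose result depends on how keys were partitioned.
outputs = [run_job(data, n, correct_reducer) for n in (1, 2, 4)]
assert all(o == outputs[0] for o in outputs)
```

A reducer that is not commutative/associative over its input values would produce partition-dependent results, which this kind of cross-configuration comparison is designed to catch.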
