Big Data Testing Techniques: Taxonomy, Challenges and Future Trends
Big Data is transforming many industrial domains by providing decision support
through analyzing large data volumes. Big Data testing aims to ensure that Big
Data systems run smoothly and error-free while maintaining the performance and
quality of data. However, because of the diversity and complexity of data,
testing Big Data is challenging. Though numerous research efforts deal with Big
Data testing, a comprehensive review addressing the testing techniques and
challenges of Big Data is not yet available. Therefore, we have
systematically reviewed the evidence on Big Data testing techniques published
between 2010 and 2021. This paper discusses the testing of data processing by
highlighting the techniques used in every processing phase. Furthermore, we
discuss the challenges and future directions. Our findings show that diverse
functional, non-functional and combined (functional and non-functional) testing
techniques have been used to solve specific problems related to Big Data. At
the same time, most testing challenges arise during the MapReduce validation
phase. In addition, combinatorial testing is one of the most widely applied
techniques, often combined with other techniques (i.e.,
random testing, mutation testing, input space partitioning and equivalence
testing) to find various functional faults through Big Data testing.
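The combination of combinatorial testing with input space partitioning mentioned above can be illustrated with a minimal sketch. The input dimensions and equivalence classes below are hypothetical examples, not drawn from the survey: each dimension of a Big Data job's input is split into classes, and test cases are generated over all combinations.

```python
# Hedged sketch of combinatorial testing over partitioned input spaces.
# The dimensions and class labels are illustrative assumptions.
from itertools import product

# Hypothetical equivalence classes for three input dimensions of a data job.
partitions = {
    "record_size": ["empty", "typical", "oversized"],
    "encoding": ["utf-8", "latin-1"],
    "null_fields": ["none", "some", "all"],
}

def combinatorial_tests(parts):
    """Yield one test case per combination of equivalence classes."""
    names = list(parts)
    for combo in product(*(parts[n] for n in names)):
        yield dict(zip(names, combo))

cases = list(combinatorial_tests(partitions))
print(len(cases))  # 3 * 2 * 3 = 18 combinations
```

Full combination coverage grows multiplicatively with each dimension, which is why the surveyed work pairs it with techniques such as random testing or equivalence testing to prune the space.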
Data Science and Ethical Issues: Between Knowledge Gain and Ethical Responsibility
Despite the numerous possibilities and advantages of data science for solving complex problems and gaining new insights, the appropriate way of using and analyzing data, especially in today’s technologically dependent society, continues to face ethical questions and challenges. Although ethics in relation to computer science has been a topic of discussion since the 1950s, it has only recently joined the data science debate. Nonetheless, an overall consensus or a common conceptual framework for ethics in data science is still nonexistent. In particular, privacy rights, data validity, and algorithm fairness in the areas of Big Data, Artificial Intelligence, and Machine Learning are the most important ethical challenges in need of more thorough investigation. Thus, this chapter contributes to the overall discussion by providing an overview of current ethical challenges that are crucial not only for data science in general but also for the future of the tourism industry.
Towards Ex Vivo Testing of MapReduce Applications
2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), 25-29 July 2017, Prague (Czech Republic). Big Data programs are those that process data volumes exceeding the capabilities of traditional technologies. Among newly proposed processing models, MapReduce stands out because it allows the analysis of schema-less data in large distributed environments with frequent infrastructure failures. Functional faults in MapReduce applications are hard to detect in a testing/preproduction environment because of their distributed characteristics. We propose an automatic test framework implementing a novel testing approach called Ex Vivo. The framework employs data from production but executes the tests in a laboratory to avoid side effects on the application. Faults are detected automatically, without human intervention, by checking whether the same data would generate different outputs under different infrastructure configurations. The framework (MrExist) is validated with a real-world program. MrExist can identify a fault in a few seconds, after which the program can be stopped, not only avoiding an incorrect output but also saving the money, time and energy of production resources.
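The differential check at the heart of the Ex Vivo approach described above (run the same production data under different infrastructure configurations and flag a fault when outputs diverge) can be sketched as follows. The simplified in-memory MapReduce runner and all names (`run_job`, `word_count_map`, `differs`) are illustrative assumptions, not part of MrExist:

```python
# Hedged sketch of an Ex Vivo-style differential check: the same data is
# processed under two hypothetical infrastructure configurations (here,
# different numbers of reduce partitions) and outputs are compared.
from collections import defaultdict

def word_count_map(record):
    """Toy mapper: emit (word, 1) for each word in a record."""
    for word in record.split():
        yield word, 1

def word_count_reduce(key, values):
    """Toy reducer: sum the counts for one key."""
    return key, sum(values)

def run_job(records, mapper, reducer, num_partitions):
    """Minimal in-memory MapReduce: map, shuffle by key hash, reduce."""
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for record in records:
        for key, value in mapper(record):
            partitions[hash(key) % num_partitions][key].append(value)
    output = {}
    for partition in partitions:
        for key, values in partition.items():
            k, v = reducer(key, values)
            output[k] = v
    return output

def differs(records, mapper, reducer):
    """Ex Vivo-style check: same data, two configurations, compare outputs."""
    out_a = run_job(records, mapper, reducer, num_partitions=1)
    out_b = run_job(records, mapper, reducer, num_partitions=4)
    return out_a != out_b

data = ["big data testing", "testing big data systems"]
print(differs(data, word_count_map, word_count_reduce))  # False: correct job
```

For this correct word count the two configurations agree, so no fault is reported; a job whose reducer depended on partition layout or arrival order would produce diverging outputs and be flagged, which mirrors the automatic, oracle-free detection the paper describes.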