Relational into Non-Relational Database Migration with Multiple-Nested Schema Methods on Academic Data

Abstract

The rapid development of internet technology has increased the need of data storage and processing technology application. One application is to manage academic data records at educational institutions. Along with massive growth of information, decrement in the traditional database performance is inevitable. Hence, there are many companies choose to migrate to NoSQL, a technology that is able to overcome the traditional database shortcomings. However, the existing SQL to NoSQL migration tools have not been able to represent SQL data relations in NoSQL without limiting query performance. In this paper, a relational database transformation system transforming MySQL into non-relational database MongoDB was developed, using the Multiple Nested Schema method for academic databases. The development began with a transformation scheme design. The transformation scheme was then implemented in the migration process, using PDI/Kettle. The testing was carried out on three aspects, namely query response time, data integrity, and storage requirements. The test results showed that the developed system successfully represented the relationship of SQL data in NoSQL, provided complex query performance 13.32 times faster in the migration database, basic query performance involving SQL transaction tables 28.6 times faster on migration results, and basic performance Queries without involving SQL transaction tables were 3.91 times faster in the migration source. This shows that the theory of the Multiple Nested Schema method, aiming to overcome the poor performance of queries involving many JOIN operations, is proved. In addition, the system is also proven to be able to maintain data integrity in all tested queries. The space performance test results indicated that the migrated database transformed using the Multiple Nested Schema method showed a storage requirement of 10.53 times larger than the migration source database. This is due to the large amount of data redundancy resulting from the transformation process. However, at present, storage performance is not a top priority in data processing technology, so large storage requirements are a consequence of obtaining efficient query performance, which is still considered as the first priority in data processing technology

    Similar works