7 research outputs found
Evolution Management in NoSQL Document Databases
NoSQL databases are widely used in many applications as a data storage technology, and their usage and popularity continue to rise. The first aim of the thesis is to research the existing approaches and technologies for schema evolution in NoSQL databases. Next, we introduce an approach for schema evolution in multi-model databases with a unified interface for the most common data models. The proposed approach is easy to use and covers the common migration scenarios. We have also implemented a prototype, optimized its read/write operations, and demonstrated its properties on real-world data. Department of Software Engineering, Faculty of Mathematics and Physics
Data migration between different data models of NOSQL databases
Advisor: Marcos Didonet Del Fabro. Dissertation (master's), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 17/02/2017. Includes references: f. 76-79. Abstract: Since their origin, NoSQL databases have achieved widespread use. Due to the lack of development standards for this new technology, great challenges emerge. Among these challenges, data migration between the various solutions has proved particularly difficult. There are heterogeneous data models, access languages, and frameworks available, which makes data migration even more complex. Most of the solutions available today focus on providing an abstract and generic representation for all data models. These solutions focus on designing adapters to access the data homogeneously, but not on specifically implementing transformations between the models. These approaches often need a framework to access the data, which may prevent their use in some scenarios. This dissertation proposes a metamodel and a series of rules capable of assisting in the data migration task. The data can be converted to various desired formats through an intermediate state. To validate the solution, several tests were performed with different systems, using available real data. Keywords: NoSQL Databases. Metamodel. Data Migration.
Multi-Schema-Version Data Management
Modern agile software development methods allow software systems to evolve continuously by easily adding new features, fixing bugs, and adapting the software to changing requirements and conditions while it is in continuous use. A major obstacle to agile evolution is the underlying database that persists the software system's data from day one. Evolving the database schema requires evolving the existing data accordingly; at this point, the currently established solutions are very expensive, error-prone, and far from agile.
In this thesis, we present InVerDa, a multi-schema-version database system to facilitate agile database development. Multi-schema-version database systems provide multiple schema versions within the same database, where each schema version itself behaves like a regular single-schema database. Creating new schema versions is very simple, which provides the desired agility for database development. All created schema versions can co-exist, and write operations are immediately propagated between schema versions with a best-effort strategy. Developers do not have to implement the propagation logic of data accesses between schema versions by hand; InVerDa generates it automatically.
To facilitate multi-schema-version database systems, we equip developers with a relationally complete and bidirectional database evolution language (BiDEL) that makes it easy to evolve existing schema versions into new ones. BiDEL can express the evolution of both the schema and the data, both forwards and backwards, in intuitive and consistent operations. BiDEL evolution scripts are orders of magnitude shorter than implementations of the same behavior in standard SQL and are also less likely to be erroneous, since they describe a developer's intention of the evolution exclusively on the level of tables, without further technical details. Having the developers' intentions explicitly given in the BiDEL scripts further makes it possible to create a new schema version by merging existing ones.
Having multiple co-existing schema versions in one database raises the need for a sophisticated physical materialization. Multi-schema-version database systems provide full data independence, so the database administrator can choose any feasible materialization while the multi-schema-version database system internally ensures that no data is lost. The search space of possible materializations can grow exponentially with the number of schema versions. Therefore, we present an adviser that relieves the database administrator of having to dive into the complex performance characteristics of multi-schema-version database systems and proposes an optimized materialization for a given workload within seconds. Optimized materializations have been shown to improve the performance for a given workload by orders of magnitude.
We formally guarantee data independence for multi-schema-version database systems. To this end, we show that every single schema version behaves like a regular single-schema database independently of the chosen physical materialization. This important guarantee makes it easy to evolve and access the database in agile software development: all the important features of relational databases, such as transaction guarantees, are preserved. To the best of our knowledge, we are the first to realize such a multi-schema-version database system that allows agile evolution of production databases with full support for co-existing schema versions and formally guaranteed data independence.