    Self-configured Elastic Database with Deep Q-Learning Approach

    Elastic databases have grown in popularity over conventional databases in recent years because they can be allocated sufficient capacity for peak load. With the support of cloud platforms, which provide flexible resources at low cost, elastic databases in the cloud show excellent potential for scalability, flexibility, and accessibility. However, the interaction between the cloud layers of virtual machines (VMs) and databases further complicates the problem of adapting the cloud configuration to dynamic workloads. In this paper, I explore a framework for a self-configured elastic database that can optimize the cloud configuration and adaptively allocate resources under the constraints of the database's Service Level Agreement (SLA). At the core of the framework is a Deep Q-learning approach, which combines the advantages of Reinforcement Learning (RL) and Deep Learning (DL). The framework is built on the Amazon Web Services (AWS) cloud environment and uses the MySQL database for its high-availability replication mechanism. Experimental results on the TPC-W benchmark demonstrate that, with Deep Q-learning, the elastic database reduces SLA violations by more than 90% in response to steep workload changes.
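
    The learning rule behind such a framework can be illustrated with a minimal sketch. It is not the paper's implementation: it uses a tabular Q-function where Deep Q-learning would use a neural network, and the action set, the reward shape, and every constant (SLA_PENALTY, REPLICA_COST, MAX_REPLICAS) are assumptions made up for illustration.

```python
import random
from collections import defaultdict

# All names and constants below are hypothetical, chosen only for illustration.
ACTIONS = (-1, 0, 1)   # remove a replica, hold, add a replica
MAX_REPLICAS = 4
SLA_PENALTY = 10.0     # assumed cost of an SLA violation
REPLICA_COST = 1.0     # assumed cost per running replica


def reward(load, replicas):
    """Toy reward: each replica serves one unit of load; unmet load violates the SLA."""
    violation = SLA_PENALTY if load > replicas else 0.0
    return -(violation + REPLICA_COST * replicas)


def step(load, replicas, action):
    """Apply a scaling action and return (next_replicas, reward)."""
    next_replicas = min(MAX_REPLICAS, max(1, replicas + action))
    return next_replicas, reward(load, next_replicas)


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    Deep Q-learning replaces the table q with a neural network that is
    trained toward the same target value.
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])


def train(episodes=2000, horizon=20, epsilon=0.2, seed=0):
    """Epsilon-greedy training loop over a randomly changing toy workload."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        replicas = 1
        load = rng.randint(1, MAX_REPLICAS)
        for _ in range(horizon):
            state = (load, replicas)
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            replicas, r = step(load, replicas, action)
            next_load = rng.randint(1, MAX_REPLICAS)
            q_update(q, state, action, r, (next_load, replicas))
            load = next_load
    return q
```

    Calling train() returns the learned table; a real controller would observe the workload and SLA metrics from the running system rather than from this toy environment.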

    The End of Slow Networks: It's Time for a Redesign

    Next-generation high-performance RDMA-capable networks will require a fundamental rethinking of the design and architecture of modern distributed DBMSs. These systems are commonly designed and optimized under the assumption that the network is the bottleneck: the network is slow and "thin", and thus needs to be avoided as much as possible. Yet this assumption no longer holds true. With InfiniBand FDR 4x, the bandwidth available to transfer data across the network is in the same ballpark as the bandwidth of one memory channel, and it increases even further with the most recent EDR standard. Moreover, with the continuing advances in RDMA, latency is improving similarly fast. In this paper, we first argue that the "old" distributed database design is not capable of taking full advantage of the network. Second, we propose architectural redesigns for OLTP, OLAP and advanced analytical frameworks to take better advantage of the improved bandwidth, latency and RDMA capabilities. Finally, for each of the workload categories, we show that remarkable performance improvements can be achieved.

    Fast Distributed Transactions for Partitioned Database Systems

    Many distributed storage systems achieve high data access throughput via partitioning and replication, each system with its own advantages and tradeoffs. In order to achieve high scalability, however, today's systems generally reduce transactional support, disallowing single transactions from spanning multiple partitions. Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage, scales near-linearly on a cluster of commodity machines, and has no single point of failure. By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels, including Paxos-based strong consistency across geographically distant replicas, at no cost to transactional throughput.
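
    The idea of replicating transaction inputs rather than effects can be sketched as follows. This is a toy illustration, not Calvin's code: the Sequencer and Replica classes and the transfer transaction are hypothetical names, and partitioning, locking, and Paxos are left out entirely. The point is only that replicas which apply the same ordered log with deterministic logic end in the same state.

```python
def transfer(store, src, dst, amount):
    """Deterministic transaction logic: the outcome depends only on inputs and prior state."""
    if store.get(src, 0) >= amount:
        store[src] = store.get(src, 0) - amount
        store[dst] = store.get(dst, 0) + amount
    # An underfunded transfer is a no-op on every replica alike.


class Sequencer:
    """Hypothetical stand-in for the layer that fixes a global order of inputs."""

    def __init__(self):
        self.log = []

    def submit(self, src, dst, amount):
        self.log.append((src, dst, amount))
        return len(self.log) - 1  # position in the agreed order


class Replica:
    """Applies the shared input log in order; no effects are shipped between replicas."""

    def __init__(self, initial):
        self.store = dict(initial)
        self.applied = 0

    def catch_up(self, log):
        for args in log[self.applied:]:
            transfer(self.store, *args)
        self.applied = len(log)


sequencer = Sequencer()
initial = {"a": 10, "b": 0}
r1, r2 = Replica(initial), Replica(initial)

sequencer.submit("a", "b", 7)
r1.catch_up(sequencer.log)        # r1 applies early, r2 lags behind
sequencer.submit("b", "a", 3)
sequencer.submit("a", "b", 100)   # insufficient funds: deterministic no-op
r1.catch_up(sequencer.log)
r2.catch_up(sequencer.log)        # r2 replays the whole log at once
```

    Both replicas reach the same store even though they consumed the log at different times, because only the order of inputs was agreed on.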

    Evaluation of ACE properties of traditional SQL and NoSQL big data systems

    Traditional SQL and NoSQL big data systems are the backbone for managing data in cloud, fog and edge computing. This paper develops a new system and adopts the TPC-DS industry standard benchmark in order to evaluate three key properties of SQL and NoSQL systems: availability, consistency and efficiency (ACE). The contributions of this work are manifold. It evaluates and analyses the tradeoff between the ACE properties. It provides insight into NoSQL systems and how they can be improved to be sustainable for a wider range of applications. The evaluation shows that SQL provides stronger consistency, but at the expense of low efficiency and availability. NoSQL provides better efficiency and availability but lacks support for stronger consistency. In order for NoSQL systems to be more sustainable, they need to implement transactional schemes that enforce stronger consistency as well as better efficiency and availability.

    Testing of transactional services in NoSQL key-value databases

    Transactional services guarantee the consistency of shared data during the concurrent execution of multiple applications. They have been used in various domains, ranging from classical databases through service-oriented computing systems to NoSQL databases and the cloud. Though transactional services aim to ensure data consistency, NoSQL databases prioritize efficiency and availability over data consistency. In order to address these issues, various transaction models and protocols have been proposed in the literature. However, the testing of transactions in NoSQL databases has not been addressed. In this paper, we investigate the testing of transactional services in NoSQL databases in order to test and analyse data consistency, taking into account the characteristics of NoSQL databases such as efficiency and velocity. Accordingly, we develop a framework for testing transactional services in NoSQL databases. The novelty and contribution is a context-aware transactional model that takes into account the contextual requirements of NoSQL clients and the system-level settings in relation to data consistency. This can assist NoSQL application developers in choosing between transactional and non-transactional services based on the level of data consistency they require. The framework also provides ways to analyse the impact of big data requirements and characteristics (e.g., velocity, efficiency) on the data consistency of NoSQL databases. The evaluation and testing are carried out using a widely used NoSQL key-value database, Riak, and a real, open big data set from the Council of London on public transportation, covering the London bus services.

    Consistency in scalable systems

    Scalable data management in distributed information systems

    Segurança em serviços de banco de dados em nuvem: controles para acordos de níveis de serviços (Security in cloud database services: controls for service level agreements)

    Master's dissertation, Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2013. Cloud computing enables resource savings through the sharing of IT infrastructure in distributed systems. Databases are among the service delivery models offered in the cloud. However, the security of applications that use cloud databases is a concern in enterprise environments. Since 1997, research has sought to mitigate some of these security problems, especially those relating to confidentiality requirements. This master's thesis focuses on the security problems that can arise when entering into service level agreements (SLAs) and service contracts for cloud databases. To assess security and treat risks, we propose a conceptual framework built on a set of internal controls that guides clients and providers in establishing security levels. Using this framework, internal controls were implemented in laboratory environments for public and private clouds. Case studies and vulnerability analyses were carried out to verify security. Important results were achieved, such as: enabling the choice of service providers that hold the desired controls; creating metrics for monitoring services; adapting and using PCMONS (Private Cloud MONitoring System) for monitoring; and integrating controls into service contracts and overseeing service level agreements.

    Referential Integrity in Cloud NoSQL Databases

    Cloud computing delivers on-demand access to essential computing services, providing benefits such as reduced maintenance, lower costs, and global access. One of its important and prominent services is Database as a Service (DaaS), which includes cloud Database Management Systems (DBMSs). Cloud DBMSs commonly adopt the key-value data model and are called Not only SQL (NoSQL) DBMSs. These provide cloud-suitable features like scalability, flexibility and robustness, but in order to provide these, features such as referential integrity are often sacrificed. In such cases, referential integrity is left to the applications instead of being handled by the cloud DBMS. Thus, applications are required either to deal with inconsistency in the data (e.g. dangling references) or to incorporate the necessary logic to ensure that referential integrity is maintained. This thesis presents an Application Programming Interface (API) that serves as a middle layer between the applications and the cloud DBMS in order to maintain referential integrity. The API provides the necessary Create, Read, Update and Delete (CRUD) operations to be performed on the DBMS while ensuring that the referential integrity constraints are satisfied. These constraints are represented as metadata, and four different approaches are provided to store it. Furthermore, the performance of these approaches is measured with different referential integrity constraints and evaluated in a set of experiments on Apache Cassandra, a prominent cloud NoSQL DBMS. The results show significant differences between the approaches in terms of performance. However, which one is better depends on the application's demands, as each approach presents different trade-offs.
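
    A middle layer of this kind can be pictured with a minimal sketch. It is not the thesis's API: the IntegrityLayer class, its method names, and the constraint format are hypothetical, an in-memory dictionary stands in for Apache Cassandra, and the metadata is simply held in the layer itself, which does not correspond to any particular one of the four storage approaches the thesis evaluates. Foreign keys are treated as required, and deletes of referenced rows are rejected.

```python
class IntegrityError(Exception):
    """Raised when an operation would violate a referential integrity constraint."""


class IntegrityLayer:
    """Hypothetical CRUD layer that checks constraints before touching the store."""

    def __init__(self, constraints):
        # Constraint metadata: {(child_table, fk_field): parent_table}
        self.constraints = dict(constraints)
        self.tables = {}  # in-memory stand-in for the key-value DBMS

    def create(self, table, key, row):
        # Every foreign key of the new row must point at an existing parent row.
        for (child, field), parent in self.constraints.items():
            if child == table and row.get(field) not in self.tables.get(parent, {}):
                raise IntegrityError(f"{table}.{field} has no match in {parent}")
        self.tables.setdefault(table, {})[key] = dict(row)

    def read(self, table, key):
        row = self.tables.get(table, {}).get(key)
        return dict(row) if row is not None else None

    def update(self, table, key, row):
        if key not in self.tables.get(table, {}):
            raise KeyError(key)
        self.create(table, key, row)  # re-validate foreign keys, then overwrite

    def delete(self, table, key):
        # Reject the delete if any child row still references this key.
        for (child, field), parent in self.constraints.items():
            if parent == table:
                for child_row in self.tables.get(child, {}).values():
                    if child_row.get(field) == key:
                        raise IntegrityError(f"{table}[{key}] is referenced by {child}")
        self.tables.get(table, {}).pop(key, None)


layer = IntegrityLayer({("orders", "customer_id"): "customers"})
layer.create("customers", "c1", {"name": "Ada"})
layer.create("orders", "o1", {"customer_id": "c1", "total": 30})
```

    With this setup, creating an order for an unknown customer or deleting customer c1 while order o1 exists raises IntegrityError, so dangling references never reach the store.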