3 research outputs found

    Workload-aware incremental repartitioning of shared-nothing distributed databases for scalable OLTP applications

    On-line Transaction Processing (OLTP) applications often rely on shared-nothing distributed databases that can sustain rapid growth in data volume. Distributed transactions (DTs) that involve data tuples from multiple geo-distributed servers can adversely impact the performance of such databases, especially when the transactions are short-lived and require immediate responses. k-way min-cut graph clustering based database repartitioning algorithms can be used to reduce the number of DTs with an acceptable level of load balancing. In Web applications, where the DT profile changes over time due to dynamically varying workload patterns, frequent database repartitioning is needed to keep up with the change. This paper addresses this emerging challenge by introducing incremental repartitioning. In each repartitioning cycle, the DT profile is learnt online and a k-way min-cut clustering algorithm is applied to a special sub-graph representing all DTs as well as those non-DTs that have at least one tuple in a DT. The latter ensures that the min-cut algorithm minimally reintroduces new DTs from the non-DTs while maximally transforming existing DTs into non-DTs in the new partitioning. The potential risk of load imbalance is mitigated by applying the graph clustering algorithm to the finer logical partitions instead of the servers, and by relying on a random one-to-one cluster-to-partition mapping that naturally balances out loads. Inter-server data migration due to repartitioning is kept in check with two special mappings favouring the current partition of the majority of tuples in a cluster: the many-to-one version minimises data migrations alone, while the one-to-one version reduces data migration without affecting load balancing. A distributed data lookup process, inspired by the roaming protocol in mobile networks, is introduced to handle data migration efficiently without affecting scalability. The effectiveness of the proposed framework is evaluated comprehensively on realistic TPC-C workloads using the graph, hypergraph, and compressed hypergraph representations used in the literature. To compare the performance of any incremental repartitioning framework without bias from the external min-cut algorithm due to graph size variations, a transaction generation model is developed that can maintain a target number of unique transactions in any arbitrary observation window, irrespective of the new transaction arrival rate. The overall impact of DTs at any instance is estimated from the exponential moving average of the recurrence period of unique transactions to avoid transient fluctuations. The effectiveness and adaptability of the proposed incremental repartitioning framework for transactional workloads are established with extensive simulations on both range-partitioned and consistent-hash-partitioned databases. © 2015 Elsevier B.V.
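    The exponential-moving-average estimate of DT impact mentioned above can be illustrated with a short sketch. The following Python snippet is an illustrative assumption rather than the paper's implementation: the class name RecurrenceTracker, the smoothing factor alpha, and the toy transaction stream are hypothetical, and only show how the recurrence periods of unique transactions could be smoothed to damp transient fluctuations.

```python
# Illustrative sketch (not the paper's code): smoothing the recurrence period
# of unique transactions with an exponential moving average (EMA), so that the
# estimated impact of a transaction ignores transient fluctuations.

class RecurrenceTracker:
    def __init__(self, alpha=0.2):
        self.alpha = alpha          # hypothetical smoothing factor
        self.last_seen = {}         # txn id -> time of previous occurrence
        self.ema_period = {}        # txn id -> smoothed recurrence period

    def observe(self, txn_id, now):
        """Record an occurrence of txn_id at time `now` and update its EMA."""
        if txn_id in self.last_seen:
            period = now - self.last_seen[txn_id]
            prev = self.ema_period.get(txn_id, period)
            # EMA: new estimate = alpha * latest period + (1 - alpha) * previous
            self.ema_period[txn_id] = self.alpha * period + (1 - self.alpha) * prev
        self.last_seen[txn_id] = now

    def impact(self, txn_id):
        """Shorter smoothed recurrence period -> more frequent -> higher impact."""
        period = self.ema_period.get(txn_id)
        return 0.0 if period is None else 1.0 / period


tracker = RecurrenceTracker(alpha=0.2)
for t, txn in enumerate(["T1", "T2", "T1", "T1", "T2"], start=1):
    tracker.observe(txn, t)
print(tracker.impact("T1"), tracker.impact("T2"))
```

    In this toy reading, a shorter smoothed recurrence period marks a transaction as more frequently recurring and hence as contributing more to the current DT impact.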

    CAP Theorem: Revision of its related consistency models

    The CAP theorem states that only two of the following three properties can be simultaneously guaranteed in a distributed service: (i) consistency, (ii) availability, and (iii) network partition tolerance. This theorem was stated and proved assuming that "consistency" refers to atomic consistency. However, multiple consistency models exist, and atomic consistency sits at the strongest end of that spectrum. Many distributed services deployed in cloud platforms should be highly available and scalable. Network partitions may arise in those deployments and should be tolerated. One way of dealing with CAP constraints consists in relaxing consistency. Therefore, it is interesting to explore the set of consistency models not supported in an available and partition-tolerant service (CAP-constrained models). Other, weaker consistency models could be maintained when scalable services are deployed in partitionable systems (CAP-free models). Three contributions arise: (1) multiple other CAP-constrained models are identified, (2) a borderline between CAP-constrained and CAP-free models is set, and (3) a hierarchy of consistency models depending on their strength and convergence is built.
    Muñoz-Escoí, F. D.; Juan Marín, R. D.; García Escrivá, J. R.; González de Mendívil Moreno, J. R.; Bernabéu Aubán, J. M. (2019). CAP Theorem: Revision of its related consistency models. The Computer Journal, 62(6), 943-960. https://doi.org/10.1093/comjnl/bxy142
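    As a rough illustration of why the strongest models are CAP-constrained, the following Python sketch (an illustrative assumption, not taken from the paper) simulates two replicas that keep accepting writes during a network partition: they remain available but expose divergent values, which violates atomic consistency, while a weaker convergent model (here a simple last-writer-wins merge) can still be restored once the partition heals.

```python
# Illustrative sketch: staying available during a partition forces divergence,
# which is incompatible with atomic (linearizable) consistency; a weaker,
# convergent model (last-writer-wins) can still be offered.

class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None
        self.timestamp = 0   # logical timestamp used for the LWW merge

    def write(self, value, timestamp):
        """Accept a write locally (the replica stays available)."""
        self.value, self.timestamp = value, timestamp

    def read(self):
        return self.value

    def merge(self, other):
        """Last-writer-wins convergence once the partition heals."""
        if other.timestamp > self.timestamp:
            self.value, self.timestamp = other.value, other.timestamp


r1, r2 = Replica("r1"), Replica("r2")

# Network partition: each side keeps serving writes independently.
r1.write("A", timestamp=1)
r2.write("B", timestamp=2)

# Reads now disagree, so atomic consistency is lost while staying available.
assert r1.read() != r2.read()

# After the partition heals, both replicas converge under LWW.
r1.merge(r2)
r2.merge(r1)
assert r1.read() == r2.read() == "B"
```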

    Proactive & time-optimized data synopsis management at the edge

    The Internet of Things (IoT) offers the infrastructure for the smooth functioning of autonomous, context-aware devices connected to the Cloud. Edge Computing (EC) lies between the IoT and the Cloud and provides significant advantages. One advantage is local data processing (limited latency, bandwidth preservation) with real-time communication among IoT devices, while multiple nodes become hosts of the data reported by IoT devices. In this work, we provide a mechanism for the exchange of data synopses (summaries of extracted knowledge) among EC nodes, which are necessary to convey knowledge of the data present in EC environments. The overarching aim is to intelligently decide when nodes should exchange data synopses so that tasks can be executed efficiently. We support this decision with a stochastic optimization model based on the Theory of Optimal Stopping. We provide the fundamentals of our model and the relevant formulations of the optimal time to disseminate data synopses to network edge nodes. We report a comprehensive experimental evaluation and comparative assessment of the optimality achieved by our model and its positive effects on EC.
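    A minimal sketch of the kind of stopping rule the abstract describes, under assumed names and a simplified myopic (one-step look-ahead) criterion that does not come from the paper: a node tracks how far its current synopsis has drifted from the last one it disseminated, and it "stops" (sends an update) as soon as the expected cost of staying stale for one more step outweighs the amortised cost of sending.

```python
import random

# Illustrative sketch (not the paper's model): a myopic one-step look-ahead
# stopping rule for synopsis dissemination. Waiting one more step costs an
# expected staleness penalty proportional to the accumulated drift, while
# sending costs a fixed amount amortised over the steps since the last send.

def should_stop(drift, staleness_cost, send_cost, steps_since_send):
    expected_wait_cost = drift * staleness_cost            # one more stale step
    amortised_send_cost = send_cost / max(steps_since_send, 1)
    return expected_wait_cost >= amortised_send_cost

def run_node(steps=60, staleness_cost=0.05, send_cost=1.0, seed=7):
    rng = random.Random(seed)
    drift, since_send, sent_at = 0.0, 0, []
    for t in range(steps):
        drift += rng.uniform(0.0, 0.2)      # new local change this step
        since_send += 1
        if should_stop(drift, staleness_cost, send_cost, since_send):
            sent_at.append(t)               # "stop": disseminate the synopsis
            drift, since_send = 0.0, 0      # peers are up to date again
    return sent_at

print("synopsis disseminated at steps:", run_node())
```

    The fixed costs, the uniform drift process, and the helper names are hypothetical; the point is only to show the shape of a stopping-time decision: keep observing while waiting is cheap, send as soon as it is not.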