15 research outputs found

    Data consistency: toward a terminological clarification

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-21413-9_15
    Consistency and inconsistency are ubiquitous terms in data engineering. Their relevance to quality is obvious, since consistency is a commonplace dimension of data quality. However, their connotations are vague or ambiguous. In this paper, we address semantic consistency, transaction consistency, replication consistency, eventual consistency and the new notion of partial consistency in databases. We characterize their distinguishing properties, and also address their differences, interactions and interdependencies. Partial consistency is an entry door to living with inconsistency, which is an ineludible necessity in the age of big data.
    Decker and F.D. Muñoz were supported by the Spanish MINECO grant TIN 2012-37719-C03-01.
    Decker, H.; Muñoz Escoí, FD.; Misra, S. (2015). Data consistency: toward a terminological clarification. In Computational Science and Its Applications -- ICCSA 2015: 15th International Conference, Banff, AB, Canada, June 22-25, 2015, Proceedings, Part V. Springer International Publishing. 206-220. https://doi.org/10.1007/978-3-319-21413-9_15

    Global Sequence Protocol: A Robust Abstraction for Replicated Shared State

    In the age of cloud-connected mobile devices, users want responsive apps that read and write shared data everywhere, at all times, even if network connections are slow or unavailable. The solution is to replicate data and propagate updates asynchronously. Unfortunately, such mechanisms are notoriously difficult to understand, explain, and implement. To address these challenges, we present GSP (global sequence protocol), an operational model for replicated shared data. GSP is simple and abstract enough to serve as a mental reference model, and offers fine control over the asynchronous update propagation (update transactions, strong synchronization). It abstracts the data model and thus applies both to simple key-value stores and to complex structured data. We then show how to implement GSP robustly on a client-server architecture (masking silent client crashes, server crash-recovery failures, and arbitrary network failures) and efficiently (transmitting and storing minimal information by reducing update sequences).
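
The idea of replicating data and propagating updates asynchronously through one global sequence can be illustrated with a small sketch. It is not the protocol from the paper. It assumes a single in-memory server that orders updates, and clients that keep a confirmed prefix of that order plus their own unconfirmed updates. The names `Server`, `Client`, `put`, `send` and `receive` are hypothetical, and crash handling and update-sequence reduction are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Assumption: an update is a pure function from one state to the next.
Update = Callable[[dict], dict]


def put(key, value) -> Update:
    """Hypothetical helper: build an update that writes a single key."""
    return lambda state: {**state, key: value}


@dataclass
class Server:
    """Arranges all submitted updates into one global sequence."""
    log: List[Tuple[str, Update]] = field(default_factory=list)

    def submit(self, client_id: str, update: Update) -> None:
        self.log.append((client_id, update))


@dataclass
class Client:
    client_id: str
    known: List[Update] = field(default_factory=list)    # confirmed prefix of the global sequence
    pending: List[Update] = field(default_factory=list)  # own updates not yet confirmed
    outbox: List[Update] = field(default_factory=list)   # own updates not yet sent
    cursor: int = 0                                       # position reached in the server log

    def update(self, u: Update) -> None:
        # Local updates take effect immediately, even while offline.
        self.pending.append(u)
        self.outbox.append(u)

    def read(self) -> dict:
        # Visible state = confirmed updates followed by own pending updates.
        state: dict = {}
        for u in self.known + self.pending:
            state = u(state)
        return state

    def send(self, server: Server) -> None:
        # Push unsent updates when a connection is available.
        for u in self.outbox:
            server.submit(self.client_id, u)
        self.outbox.clear()

    def receive(self, server: Server) -> None:
        # Append newly ordered updates; drop own updates once they are confirmed.
        for sender, u in server.log[self.cursor:]:
            self.known.append(u)
            if sender == self.client_id:
                self.pending.pop(0)
        self.cursor = len(server.log)
```

In this sketch a client reads its own writes before any network round trip, and two clients that have received the same prefix of the server log and have no pending updates read the same state.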

    Eventual Consistent Databases: State of the Art

    One of the challenges of cloud programming is to achieve the right balance between availability and consistency in a distributed database. Cloud computing environments, particularly cloud databases, are rapidly increasing in importance, acceptance and usage in major applications, which need partition tolerance and availability for scalability purposes but sacrifice consistency (CAP theorem). In these environments, the data accessed by users is stored in a highly available storage system, so paradigms such as eventual consistency have become more widespread. In this paper, we review the state-of-the-art database systems using eventual consistency from both industry and research. Based on this review, we discuss the advantages and disadvantages of eventual consistency, and identify future research challenges for databases using eventual consistency.

    Eventual Consistency: Origin and Support

    Eventual consistency is demanded nowadays in geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service should choose between being strongly consistent or being highly available. Since scalable services should be available, a relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state convergence condition to be added to a relaxed consistency model. There are still several aspects of eventual consistency that have not been analysed in depth in previous works: 1. which are the oldest replication proposals providing eventual consistency, 2. which replica consistency models provide the best basis for building eventually consistent services, 3. which mechanisms should be considered for implementing an eventually consistent service, and 4. which are the best combinations of those mechanisms for achieving different concrete goals. This paper provides some notes on these important topics
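
The reading of eventual consistency as a state convergence condition can be made concrete with a small sketch. It uses a state-based grow-only counter, a technique chosen here for illustration and not taken from the abstract. The class name `GCounter` and its methods are hypothetical. Replicas accept updates independently and may disagree for a while, and once they have exchanged state their values coincide.

```python
class GCounter:
    """Grow-only counter: each replica only ever increases its own slot."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: dict = {}  # replica id -> increments observed from that replica

    def increment(self, n: int = 1) -> None:
        # Local update, accepted without coordinating with other replicas.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum: commutative, associative and idempotent,
        # so the order and repetition of merges do not matter.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
diverged = (a.value(), b.value())   # replicas disagree before exchanging state
a.merge(b)
b.merge(a)
converged = (a.value(), b.value())  # replicas agree after exchanging state
```

The merge function carries the convergence guarantee in this sketch. The ordering guarantees that hold while replicas diverge would come from whichever relaxed consistency model is combined with it.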

    Algorithmique distribuée asynchrone avec une majorité de pannes

    In distributed computing, the asynchronous message-passing model with crashes is well known and considered in many articles, because it is realistic, simple enough to be used, and complex enough to represent many real problems. In this model, n processes communicate by exchanging messages, but without any bound on communication delays, i.e. a message may take an arbitrarily long time to reach its destination. Moreover, up to f among the n processes may crash, and thus definitively stop working. Those crashes are undetectable because of the system's asynchrony, and they restrict the results achievable in this model. In many cases, known results in those systems hold only under a strict minority of crashes. For example, this applies to the implementation of atomic registers and to solving the renaming problem. This barrier of a majority of crashes, explained by the CAP theorem, restricts numerous problems, and the asynchronous message-passing model with a majority of crashes is thus little studied and rather unknown. Hence, studying what can be done in this case of a majority of crashes is interesting. This thesis tries to analyse this model through two main problems. The first part studies the implementation of shared objects, similar to usual registers, by defining x-colored register banks and α-registers. The second part extends the renaming problem into k-redundant renaming, for both one-shot and long-lived versions, and similarly extends the shared objects called splitters into k-splitters.
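
The strict-minority barrier can be illustrated with the standard quorum argument, which the abstract itself does not spell out. With up to f undetectable crashes, a process can safely wait for replies from at most n - f processes. Quorum-based register implementations rely on any two such reply sets sharing a process. The sketch below checks that property by brute force. The function name `quorums_always_intersect` is hypothetical.

```python
from itertools import combinations


def quorums_always_intersect(n: int, f: int) -> bool:
    """Return True if every two sets of n - f processes share at least one member."""
    size = n - f
    quorums = [set(q) for q in combinations(range(n), size)]
    return all(q1 & q2 for q1 in quorums for q2 in quorums)


# Compare the brute-force answer with the closed-form condition 2 * f < n.
table = {
    (n, f): quorums_always_intersect(n, f)
    for n in range(1, 7)
    for f in range(0, n + 1)
}
```

For small n the brute-force result coincides with the condition 2f < n. Once half or more of the processes may crash, two reply sets can be disjoint, so a reader may miss the latest write.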