78,954 research outputs found

    Flexible cooperation in non-standard application environments

    Get PDF
    The integration of preexisting systems into a single, heterogeneous, distributed non-standard application system in domains like office automation or computer-integrated manufacturing are regarded as cooperating systems. They are characterized through teamwork, distribution and the handling of complex data structures (e.g. multimedia data). Object-oriented database systems, providing for complex object management, represent one approach in support of such applications. They concentrate, however, on data modeling aspects and use more or less conventional transaction concepts, based on a global execution control. Hence, they only partially fulfill application requirements as they do not adequately cope with the autonomy that is often inherent to the system's components. As a consequence, we suggest S-transactions as an appropriate means for describing the cooperation of system components in terms of transactions and beyond. In this paper we outline the modeling of conventional transactions (flat or nested as well as distributed and design transactions) in terms of STDL, the S-transaction definition language. Beyond that we point out how to specify SAGAs and similar concepts. Finally we discuss the specification of non-linear but maybe acyclic or even cyclic cooperation structuresPrepared for: Naval Ocean Systems Center and funded by the Naval Postgraduate School.http://archive.org/details/flexiblecooperat00holtO&MN, Direct FundingNAApproved for public release; distribution is unlimited

    Schema architecture and their relationships to transaction processing in distributed database systems

    Get PDF
    We discuss the different types of schema architectures which could be supported by distributed database systems, making a clear distinction between logical, physical, and federated distribution. We elaborate on the additional mapping information required in architecture based on logical distribution in order to support retrieval as well as update operations. We illustrate the problems in schema integration and data integration in multidatabase systems and discuss their impact on query processing. Finally, we discuss different issues relevant to the cooperation (or noncooperation) of local database systems in a heterogeneous multidatabase system and their relationship to the schema architecture and transaction processing

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

    Towards a Novel Cooperative Logistics Information System Framework

    Get PDF
    Supply Chains and Logistics have a growing importance in global economy. Supply Chain Information Systems over the world are heterogeneous and each one can both produce and receive massive amounts of structured and unstructured data in real-time, which are usually generated by information systems, connected objects or manually by humans. This heterogeneity is due to Logistics Information Systems components and processes that are developed by different modelling methods and running on many platforms; hence, decision making process is difficult in such multi-actor environment. In this paper we identify some current challenges and integration issues between separately designed Logistics Information Systems (LIS), and we propose a Distributed Cooperative Logistics Platform (DCLP) framework based on NoSQL, which facilitates real-time cooperation between stakeholders and improves decision making process in a multi-actor environment. We included also a case study of Hospital Supply Chain (HSC), and a brief discussion on perspectives and future scope of work

    Protocols for Integrity Constraint Checking in Federated Databases

    Get PDF
    A federated database is comprised of multiple interconnected database systems that primarily operate independently but cooperate to a certain extent. Global integrity constraints can be very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. This paper presents a threefold contribution to integrity constraint checking in federated databases: (1) The problem of constraint checking in a federated database environment is clearly formulated. (2) A family of protocols for constraint checking is presented. (3) The differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed by the protocols, and processing and communication costs. Thus, our work yields a suite of options from which a protocol can be chosen to suit the system capabilities and integrity requirements of a particular federated database environment

    Middleware-based Database Replication: The Gaps between Theory and Practice

    Get PDF
    The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 200

    Observations on Factors Affecting Performance of MapReduce based Apriori on Hadoop Cluster

    Full text link
    Designing fast and scalable algorithm for mining frequent itemsets is always being a most eminent and promising problem of data mining. Apriori is one of the most broadly used and popular algorithm of frequent itemset mining. Designing efficient algorithms on MapReduce framework to process and analyze big datasets is contemporary research nowadays. In this paper, we have focused on the performance of MapReduce based Apriori on homogeneous as well as on heterogeneous Hadoop cluster. We have investigated a number of factors that significantly affects the execution time of MapReduce based Apriori running on homogeneous and heterogeneous Hadoop Cluster. Factors are specific to both algorithmic and non-algorithmic improvements. Considered factors specific to algorithmic improvements are filtered transactions and data structures. Experimental results show that how an appropriate data structure and filtered transactions technique drastically reduce the execution time. The non-algorithmic factors include speculative execution, nodes with poor performance, data locality & distribution of data blocks, and parallelism control with input split size. We have applied strategies against these factors and fine tuned the relevant parameters in our particular application. Experimental results show that if cluster specific parameters are taken care of then there is a significant reduction in execution time. Also we have discussed the issues regarding MapReduce implementation of Apriori which may significantly influence the performance.Comment: 8 pages, 8 figures, International Conference on Computing, Communication and Automation (ICCCA2016

    Integrity Constraint Checking in Federated Databases

    Get PDF
    A federated database is comprised of multiple interconnected databases that cooperate in an autonomous fashion. Global integrity constraints are very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. The paper presents a threefold contribution to integrity constraint checking in federated databases: (1) the problem of constraint checking in a federated database environment is clearly formulated; (2) a family of cooperative protocols for constraint checking is presented; (3) the differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed, and costs involved. Thus, we provide a suite of options with protocols for various environments with specific system capabilities and integrity requirement
    corecore