    Automatic Migration of Data to NoSQL Databases Using Service Oriented Architecture

    For the past few years there has been an exponential rise in the use of databases that are not true relational databases. There is no precise definition of such databases; they can only be described by a set of common characteristics, such as the absence of a fixed schema, inherent scalability, and high performance. These databases have come to be known as NoSQL databases. Various companies see the advantages of NoSQL and want to migrate to these databases, but they find it difficult to migrate their data because considerable study and analysis is required: each type of database has its own terminology and query language. We propose a novel automated migration model which utilizes the power of service oriented architecture to help these companies easily migrate to NoSQL databases of their choice. We utilize web services that encapsulate several of the most popular NoSQL databases, such as MongoDB, Neo4j, and Cassandra, so that the inner details of these databases are hidden while still providing efficient migration of data with little or no knowledge of their inner workings. As a proof of concept, relational data was migrated successfully from the Apache Derby database to MongoDB, Cassandra, Neo4j, and DynamoDB, each vendor representing a different type of NoSQL database.
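    The abstract names the architecture but not its interfaces, so the following is a minimal Python sketch of the idea under stated assumptions: every NoSQL target sits behind the same migration facade, so a client migrates data without touching any vendor-specific query language. All class and method names are invented, and the MongoDB and Cassandra calls assume the pymongo and cassandra-driver client libraries.

    ```python
    from abc import ABC, abstractmethod

    class MigrationService(ABC):
        """Common facade: the client never sees a vendor-specific query language."""
        @abstractmethod
        def migrate_table(self, table_name, columns, rows):
            """Copy one relational table into the target NoSQL store."""

    class MongoMigrationService(MigrationService):
        """Document store: each relational row becomes one document."""
        def __init__(self, client, db_name):
            self.db = client[db_name]                 # a pymongo.MongoClient

        def migrate_table(self, table_name, columns, rows):
            docs = [dict(zip(columns, row)) for row in rows]
            self.db[table_name].insert_many(docs)

    class CassandraMigrationService(MigrationService):
        """Wide-column store: rows become CQL INSERTs against a mirrored table."""
        def __init__(self, session, keyspace):
            self.session = session                    # a cassandra-driver Session
            self.keyspace = keyspace

        def migrate_table(self, table_name, columns, rows):
            cols = ", ".join(columns)
            marks = ", ".join(["%s"] * len(columns))
            cql = f"INSERT INTO {self.keyspace}.{table_name} ({cols}) VALUES ({marks})"
            for row in rows:
                self.session.execute(cql, row)

    def migrate(service: MigrationService, table, columns, rows):
        """The caller picks a target service by configuration, not by API."""
        service.migrate_table(table, columns, rows)
    ```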

    Information Outlook, May 1997

    Volume 1, Issue 5

    Middleware support for locality-aware wide area replication

    Technical report. Coherent wide-area data caching can improve the scalability and responsiveness of distributed services such as wide-area file access, database and directory services, and content distribution. However, distributed services differ widely in the frequency of read/write sharing, the amount of contention between clients for the same data, and their ability to make tradeoffs between consistency and availability. Aggressive replication enhances the scalability and availability of services with read-mostly data or data that need not be kept strongly consistent. However, for applications that require strong consistency of write-shared data, replication must be throttled to achieve reasonable performance. We have developed a middleware data store called Swarm designed to support the wide-area data sharing needs of distributed services. To support the needs of diverse distributed services, Swarm provides: (i) a failure-resilient, proximity-aware data replication mechanism that adjusts the replication hierarchy based on observed network characteristics and node availability, (ii) a customizable consistency mechanism that allows applications to specify allowable consistency-availability tradeoffs, and (iii) a contention-aware caching mechanism that monitors contention between replicas and adjusts its replication policies accordingly. On a 240-node P2P file sharing system, Swarm's proximity-aware caching and replica hierarchy maintenance mechanisms improve latency by 80%, reduce WAN bandwidth consumed by 80%, and limit the impact of high node churn (5 node deaths/sec) to roughly one-fifth that of random replication. In addition, Swarm's contention-aware caching mechanism outperforms RPCs and static caching mechanisms at all levels of contention on an enterprise service workload.
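    As a rough illustration of point (ii), application-specified consistency-availability tradeoffs, here is a hypothetical Python sketch in which a replica serves relaxed reads from its local copy but pulls from upstream when the cached copy is too stale or strong consistency was requested. The names and the two-knob policy are ours, not Swarm's.

    ```python
    import time
    from dataclasses import dataclass

    @dataclass
    class ConsistencyPolicy:
        strong: bool = False           # route reads through the authoritative copy
        max_staleness_s: float = 30.0  # tolerated replica age for relaxed reads

    class Replica:
        def __init__(self, master):
            self.master = master   # authoritative copy: key -> value
            self.cache = {}        # local copy: key -> (value, fetch_time)

        def read(self, key, policy):
            entry = self.cache.get(key)
            fresh = entry is not None and time.time() - entry[1] <= policy.max_staleness_s
            if policy.strong or not fresh:
                value = self.master[key]              # pull from the upstream replica
                self.cache[key] = (value, time.time())
                return value
            return entry[0]                           # serve locally for low latency

    replica = Replica(master={"profile:42": "v1"})
    print(replica.read("profile:42", ConsistencyPolicy()))             # relaxed read
    print(replica.read("profile:42", ConsistencyPolicy(strong=True)))  # strong read
    ```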

    Khazana: a flexible wide area data store

    Technical report. Khazana is a peer-to-peer data service that supports efficient sharing and aggressive caching of mutable data across the wide area while giving clients significant control over replica divergence. Previous work on wide-area replicated services focused on at most two of the following three properties: aggressive replication, customizable consistency, and generality. In contrast, Khazana provides scalable support for large numbers of replicas while giving applications considerable flexibility in trading off consistency for availability and performance. Its flexibility enables applications to effectively exploit inherent data locality while meeting consistency needs. Khazana exports a file system-like interface with a small set of consistency controls which can be combined to yield a broad spectrum of consistency flavors, ranging from strong consistency to best-effort eventual consistency. Khazana servers form failure-resilient dynamic replica hierarchies to manage replicas across variable-quality network links. In this report, we outline Khazana's design and show how its flexibility enables three diverse network services built on top of it to meet their individual consistency and performance needs: (i) a wide-area replicated file system that supports serializable writes as well as traditional file sharing across the wide area, (ii) an enterprise data service that exploits locality by caching enterprise data closer to end-users while ensuring strong consistency for data integrity, and (iii) a replicated database that reaps order-of-magnitude gains in throughput by relaxing consistency.
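    The report's actual control set is not reproduced above, so the sketch below only illustrates the compositional idea: a handful of hypothetical consistency controls expressed as combinable Python flags, from which stronger or weaker flavors are assembled.

    ```python
    from enum import Flag, auto

    class Control(Flag):
        NONE          = 0
        WRITE_SERIAL  = auto()   # order writes through a primary replica
        READ_LATEST   = auto()   # a read must observe the newest committed write
        PUSH_UPDATES  = auto()   # propagate updates to replicas eagerly
        ALLOW_DIVERGE = auto()   # tolerate temporary replica divergence

    # Two points on the spectrum such controls could yield:
    STRONG   = Control.WRITE_SERIAL | Control.READ_LATEST
    EVENTUAL = Control.ALLOW_DIVERGE | Control.PUSH_UPDATES  # best-effort convergence

    def kh_open(path, controls):
        """File-system-like open that records the consistency the caller needs."""
        return {"path": path, "controls": controls}

    session = kh_open("/khazana/enterprise/accounts", STRONG)
    print(session)
    ```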

    Inductive Verification of Data Model Invariants for Web Applications

    Modern software applications store their data in remote cloud servers. Users interact with these applications using web browsers or thin clients running on mobile devices. A key issue in the dependability of these applications is the correctness of the actions that update the data store, which are triggered by user requests. In this paper, we present techniques for automatically checking whether the actions of an application preserve the data model invariants. Our approach first automatically extracts an abstract data store from a given application using instrumented execution. The abstract data store identifies the sets of objects and relations (associations) used by the application, and the actions that update the data store by deleting or creating objects or by changing the relations among the objects. We show that checking invariants of an abstract data store corresponds to inductive invariant verification, and can be done by mapping to First Order Logic (FOL) and using a FOL theorem prover. We implemented this approach for the Rails framework and applied it to three open source applications. We found four previously unknown bugs and reported them to the developers, who confirmed and immediately fixed two of them.
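    As a toy instance of the FOL mapping described above, the following Python sketch uses the Z3 solver (z3-solver package) to check whether a delete-user action preserves the invariant that every post has an owner; the sorts, predicates, and action are invented for illustration. The solver searches for a transition that satisfies the invariant before the action but violates it afterwards, so a sat answer exhibits exactly the kind of bug the paper hunts for.

    ```python
    from z3 import (And, BoolSort, Consts, DeclareSort, Exists, ForAll,
                    Function, Implies, Not, Solver, sat)

    Obj = DeclareSort("Obj")
    user  = Function("user",  Obj, BoolSort())          # pre-state: o is a User
    post  = Function("post",  Obj, BoolSort())          # pre-state: o is a Post
    owns  = Function("owns",  Obj, Obj, BoolSort())     # owns(u, p) association
    user2 = Function("user2", Obj, BoolSort())          # post-state user set

    u, p, o, d = Consts("u p o d", Obj)

    # Invariant: every Post is owned by some User (stated over pre and post state).
    inv_pre  = ForAll([p], Implies(post(p), Exists([u], And(user(u),  owns(u, p)))))
    inv_post = ForAll([p], Implies(post(p), Exists([u], And(user2(u), owns(u, p)))))

    # Action delete_user(d): remove d from the user set, change nothing else.
    action = ForAll([o], user2(o) == And(user(o), o != d))

    s = Solver()
    s.add(inv_pre, action, Not(inv_post))  # search for an invariant-breaking step
    print(s.check() == sat)  # True: deleting a post's owner violates the invariant
    ```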

    Radical Agent-based Approach for Intelligence Analysis

    This paper presents a novel agent-based framework as a decision aid tool for intelligence analysis. This technology extends net-centric information processing and abstraction as well as fusion and multi-source integration strategies. Our information agents traverse and mediate disparate ontologies in different formats, providing a foundation for semantic interoperability. The presented system provides knowledge discovery by accessing a large number of information sources in a particular domain and organizing them into a network of information agents. Each agent provides expertise on a specific topic by drawing on relevant information from other information agents in related knowledge domains. Unique advantages include net-centric scalability and principled information assurance, as well as groundbreaking knowledge discovery in service of intelligence analysis.
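    The paper gives no implementation detail, but the "network of information agents" idea can be sketched in a few lines of Python: each agent answers queries on its own topic and forwards everything else to agents in related domains. Everything here, including the cycle guard, is our illustration rather than the authors' design.

    ```python
    class InformationAgent:
        """Answers queries on its own topic; otherwise asks related agents."""
        def __init__(self, topic):
            self.topic = topic
            self.facts = {}         # local knowledge: question -> answer
            self.neighbours = []    # agents in related knowledge domains

        def query(self, question, visited=None):
            visited = visited if visited is not None else set()
            if id(self) in visited:
                return None         # never revisit an agent: the graph may have cycles
            visited.add(id(self))
            if question in self.facts:
                return self.facts[question]
            for agent in self.neighbours:       # delegate to related domains
                answer = agent.query(question, visited)
                if answer is not None:
                    return answer
            return None

    geo = InformationAgent("geography")
    geo.facts["capital of France"] = "Paris"
    analysis = InformationAgent("intelligence-analysis")
    analysis.neighbours.append(geo)
    print(analysis.query("capital of France"))  # resolved via the geography agent
    ```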

    A Model for Managing Information Flow on the World Wide Web

    This thesis considers the nature of information management on the World Wide Web. The web has evolved into a global information system that is completely unregulated, permitting anyone to publish whatever information they wish. However, this information is almost entirely unmanaged, which, together with the enormous number of users who access it, places considerable strain on the web's architecture. This has led to the exposure of inherent flaws, which reduce its effectiveness as an information system. The thesis presents a thorough analysis of the state of this architecture and identifies three flaws that could render the web unusable: link rot; a shrinking namespace; and the inevitable increase of noise in the system. A critical examination of existing solutions to these flaws is provided, together with a discussion of why the solutions have not been deployed or adopted. The thesis determines that they have failed to take into account the nature of the information flow between information provider and consumer, or the open philosophy of the web. The overall aim of the research has therefore been to design a new solution to these flaws in the web, based on a greater understanding of the nature of the information that flows upon it. The realization of this objective has included the development of a new model for managing information flow on the web, which is used to develop a solution to the flaws. The solution comprises three new additions to the web's architecture: a temporal referencing scheme; an Oracle Server Network for more effective web browsing; and a Resource Locator Service, which provides automatic transparent resource migration. The thesis describes their design and operation, and presents the concept of the Request Router, which provides a new way of integrating such distributed systems into the web's existing architecture without breaking it. The design of the Resource Locator Service, including the development of new protocols for resource migration, is covered in great detail, and a prototype system that has been developed to prove the effectiveness of the design is presented. The design is further validated by comprehensive performance measurements of the prototype, which show that it will scale to manage a web whose size is orders of magnitude greater than it is today.
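    The thesis's protocols are only summarized above; as a hedged sketch of how a Request Router backed by a Resource Locator Service might resolve a link to a migrated resource, consider the following Python fragment. The lookup table, URLs, and redirect-based integration are illustrative assumptions, not the thesis's actual mechanism.

    ```python
    # Locator table: a resource's original URL -> its current home.
    locator = {
        "http://old.example.edu/thesis/ch3.html":
            "http://archive.example.org/thesis/ch3.html",
    }

    def request_router(url):
        """Resolve a possibly stale URL, emulating an HTTP 301 redirect decision."""
        current = locator.get(url, url)   # unknown URLs pass through untouched
        if current != url:
            return 301, current           # permanent redirect to the migrated copy
        return 200, url                   # the resource has not moved

    print(request_router("http://old.example.edu/thesis/ch3.html"))
    print(request_router("http://stable.example.net/index.html"))
    ```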

    Automatic Data Migration into the Cloud

    Relational databases have been used for decades to store data. To scale up, a relational database requires an ever-bigger server with more CPUs, more memory, and more disk storage to hold all the tables and support more concurrent users. However, big servers tend to be highly complex, proprietary, and disproportionately expensive, unlike low-cost commodity hardware. It therefore becomes important to store and compute with massive amounts of data efficiently, providing high scalability, high performance, and availability at low cost. This has led to the rise of cloud databases, for instance NoSQL databases. NoSQL databases have many advantages, such as reading and writing data quickly, supporting massive storage, and low cost. The scaling approach in cloud databases is scale-out, which adds more servers, and the storage data structure takes the form of key-value pairs. However, it can be a challenge for enterprises to migrate existing relational databases to highly scalable NoSQL databases on clouds. In this thesis, we propose an automatic data migration model which assists enterprises in migrating their relational databases efficiently and transparently to cloud databases. We propose four migration methods to migrate data in four different ways. Each migration method is independent of the others and stores the migrated relational database in a different format in the cloud database. We design a system to implement the automatic data migration model. As a proof of concept, we successfully migrated a relational database from Microsoft SQL Server to the cloud database Amazon SimpleDB using the four different migration methods. Furthermore, we have conducted extensive experiments on Amazon SimpleDB to evaluate the performance of our model in terms of computational time, storage cost, sharding, and redundancy. Based on these experiments and a detailed analysis of each migration method, our system allows enterprises to determine which method is suitable for their data migration. Our experimental evaluation shows that our solution is promising and can migrate data from relational databases to cloud databases.
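    The four migration methods are not spelled out in the abstract, so the sketch below shows one plausible shape for the simplest mapping: each relational row becomes a key-value item whose item name is derived from the table and primary key, matching SimpleDB's string-attribute model. The table, columns, and helper names are invented for illustration.

    ```python
    # Invented sample table: the thesis does not publish its schemas.
    columns = ["emp_id", "name", "dept"]
    rows = [(1, "Alice", "Sales"), (2, "Bob", "Engineering")]

    def rows_to_items(table, columns, rows, pk="emp_id"):
        """Map each relational row to one key-value item, keyed by primary key."""
        items = {}
        for row in rows:
            record = dict(zip(columns, row))
            item_name = f"{table}:{record[pk]}"      # globally unique item name
            # SimpleDB-style stores keep every attribute value as a string.
            items[item_name] = {c: str(v) for c, v in record.items()}
        return items

    for name, attrs in rows_to_items("employee", columns, rows).items():
        print(name, attrs)
    ```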