
    New Directions in Database-Systems Research and Development

    Prepared for: Chief of Naval Research, Arlington, VA 22217.
    In this paper, three new directions in database-systems research and development are identified. The first is the emergence of multi-lingual database systems, in which a single database system can execute transactions written in different data languages and support databases structured in the corresponding data models. A multi-lingual database system thus allows old transactions and existing databases to be migrated to the new system, lets the user exploit the strong features of various data languages and data models within the same system, focuses hardware upgrades on a single system rather than a heterogeneous collection of database systems, and enables database applications to cover a wider range of transactions and interactions in the same environment. The second direction is the emphasis on multi-backend database systems, in which the database system is configured with a number of microprocessor-based processing units and their disk subsystems, called database backends. The unique characteristics of the backends are that their number is variable, the system software in all backends is identical, and the performance and capacity of the system are proportional to the multiplicity of backends. Thus, for the first time, a multi-backend database system enables the user to relate the amount of hardware used (i.e., the number of backends) to the degree of performance gain and capacity growth of the system. The third direction is the possibility of multi-host database systems, in which a single database system can communicate with a variable number of heterogeneous mainframes in several different data languages and allow the mainframes to share common database storage and access. This paper attempts to articulate the background, benefits, requirements, and architectures of these new types of database systems, namely the multi-lingual, the multi-backend, and the multi-host database systems.
    This work was supported by the DoD STARS Program and the Office of Naval Research. Approved for public release; distribution is unlimited.
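
    The paper does not spell out how records are spread over the backends; as a minimal sketch of why capacity and throughput can scale with the number of identical backends, the following hypothetical example hash-partitions records across N backend units (all class and method names are illustrative, not from the paper):

```python
# Sketch: record placement in a hypothetical multi-backend store.
# Every backend runs identical software; adding backends adds capacity.
import hashlib

class Backend:
    """One microprocessor/disk unit; all backends run the same code."""
    def __init__(self, backend_id):
        self.backend_id = backend_id
        self.records = {}          # stand-in for the disk subsystem

    def put(self, key, value):
        self.records[key] = value

    def get(self, key):
        return self.records.get(key)

class MultiBackendStore:
    def __init__(self, num_backends):
        self.backends = [Backend(i) for i in range(num_backends)]

    def _route(self, key):
        # Hash the key so records spread evenly over the backends;
        # a broadcast query would instead fan out to all of them.
        digest = hashlib.md5(key.encode()).digest()
        return self.backends[digest[0] % len(self.backends)]

    def put(self, key, value):
        self._route(key).put(key, value)

    def get(self, key):
        return self._route(key).get(key)

store = MultiBackendStore(num_backends=4)   # capacity grows with this number
store.put("emp:42", {"name": "Ada", "dept": "R&D"})
print(store.get("emp:42"))
```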

    Design and performance analysis of a relational replicated database system

    The hardware organization and software structure of a new database system are presented. This system, the relational replicated database system (RRDS), is based on a set of replicated processors operating on a partitioned database. Performance improvements and capacity growth can be obtained by adding more processors to the configuration. Based on the design goals, a set of hardware and software design questions was developed. The system then evolved according to a five-phase process, based on simulation and analysis, which addressed and resolved the design questions. Strategies and algorithms were developed for data access, data placement, and directory management for the hardware organization. A predictive performance analysis was conducted to determine the extent to which the original design goals were satisfied. The predictive performance results, along with an analytical comparison with three other relational multi-backend systems, provided information about the strengths and weaknesses of our design as well as a basis for future research.
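
    The abstract names data placement and directory management strategies without detailing them; purely as an illustration of what such a directory might track (the names and the round-robin policy below are assumptions, not the RRDS design), one plausible structure maps each partition of a relation to the replicated processors holding it:

```python
# Sketch: directory management for a replicated, partitioned database.
# Each relation is split into partitions; each partition is stored on
# several processors so reads can scale and a failed node loses no data.
from collections import defaultdict
from itertools import cycle

class Directory:
    def __init__(self, processors, replication=2):
        self.processors = processors
        self.replication = replication
        self._ring = cycle(processors)
        self.placement = defaultdict(list)   # (relation, partition) -> processors

    def place(self, relation, partition):
        """Assign a partition to `replication` distinct processors, round-robin."""
        chosen = []
        while len(chosen) < self.replication:
            p = next(self._ring)
            if p not in chosen:
                chosen.append(p)
        self.placement[(relation, partition)] = chosen
        return chosen

    def lookup(self, relation, partition):
        return self.placement[(relation, partition)]

d = Directory(processors=["P0", "P1", "P2", "P3"], replication=2)
d.place("employees", 0)
d.place("employees", 1)
print(d.lookup("employees", 1))   # e.g. ['P2', 'P3']
```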

    Storage and Analysis of Big Data Tools for Sessionized Data

    The Oracle database currently used to mine data at PEGGY is approaching end-of-life, and a new infrastructure overhaul is required. A critical business requirement is the need to load and store very large historical data sets. These data sets contain raw electronic consumer events and interactions from a website, such as page views, clicks, downloads, return visits, length of time spent on pages, and how visitors arrived at the site. This project focuses on finding a tool to analyze and measure sessionized data, a unit of measurement in web analytics that captures either a user's actions within a particular time period, or the process of segmenting each user's activity into sessions, each representing a single visit to the site. Sessionized data can be used as the input for a variety of data mining tasks such as clustering, association rule mining, and sequence mining (Ansari, 2011). This sessionized data must be delivered in a reorganized and readable format quickly enough to support informed go-to-market decisions in line with current industry trends. It is also pertinent to understand any development work required and the burden on resources. Legacy on-premise data warehouse solutions are becoming more expensive, less efficient, less dynamic, and less scalable when compared to current cloud Infrastructure as a Service (IaaS) offerings that provide real-time, on-demand, pay-as-you-go solutions. Therefore, this study examines the total cost of ownership (TCO) by researching and analyzing the following factors against a system-wide upgrade of the current on-premise Oracle Real Application Cluster (RAC) system:
    - High performance: real-time (or as close as possible) query speed against sessionized data
    - SQL compliance
    - Cloud-based, or at least hybrid (on-premise paired with cloud) deployment
    - Security: encryption preferred
    - Cost structure: a cost-effective pay-as-you-go pricing model, and the resources required for migration and operations
    The technologies analyzed against the current Oracle database are Amazon Redshift, Google BigQuery, Hadoop, and Hadoop + Hive. The cost of building an on-premise data warehouse is substantial. The project will determine the performance capabilities and affordability of Amazon Redshift, compared to other highly ranked emerging solutions, for running standard e-commerce analytics queries on terabytes of sessionized data. Rather than redesigning, upgrading, or over-purchasing infrastructure at high cost for an on-premise data warehouse, this project considers data warehousing through cloud-based IaaS solutions. The proposed objective of this project is to determine the most cost-effective high performer among Amazon Redshift, Apache Hadoop, and Google BigQuery when running standard e-commerce analytics queries on terabytes of sessionized data.
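
    The abstract defines sessionization but not a concrete rule; a common web-analytics convention, assumed here rather than taken from the project, closes a session after 30 minutes of inactivity. A minimal sketch:

```python
# Sketch: sessionizing a stream of (user_id, timestamp, action) events.
# Assumption: a session ends after 30 minutes of inactivity, a common
# web-analytics convention; the project itself may use a different rule.
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60

def sessionize(events):
    """events: iterable of (user_id, unix_timestamp, action) tuples.
    Returns {user_id: [session, ...]}, each session a list of events."""
    by_user = defaultdict(list)
    for user_id, ts, action in sorted(events, key=lambda e: (e[0], e[1])):
        sessions = by_user[user_id]
        # Start a new session on the first event or after a long gap.
        if not sessions or ts - sessions[-1][-1][1] > SESSION_GAP_SECONDS:
            sessions.append([])
        sessions[-1].append((user_id, ts, action))
    return dict(by_user)

clicks = [
    ("u1", 1000, "page_view"),
    ("u1", 1300, "click"),
    ("u1", 5000, "page_view"),   # more than 30 min later: new session
    ("u2", 1100, "download"),
]
for user, sessions in sessionize(clicks).items():
    print(user, "has", len(sessions), "session(s)")
```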

    Use of locator/identifier separation to improve the future internet routing system

    The Internet evolved from its early days of being a small research network to become a critical infrastructure many organizations and individuals rely on. One dimension of this evolution is the continuous growth of the number of participants in the network, far beyond what the initial designers had in mind. While it does work today, it is widely believed that the current design of the global routing system cannot scale to accommodate future challenges. In 2006, an Internet Architecture Board (IAB) workshop was held to develop a shared understanding of the Internet routing system scalability issues faced by the large backbone operators. The participants documented in RFC 4984 their belief that "routing scalability is the most important problem facing the Internet today and must be solved." A potential solution to the routing scalability problem is ending the semantic overloading of Internet addresses by separating node location from identity. Several proposals exist to apply this idea to current Internet addressing, among which the Locator/Identifier Separation Protocol (LISP) is the only one already being shipped in production routers. Separating locators from identifiers results in another level of indirection, and introduces a new problem: how to determine location when the identity is known. The first part of our work analyzes existing proposals for systems that map identifiers to locators and proposes an alternative system within the LISP ecosystem. We created a large-scale Internet topology simulator and used it to compare the performance of three mapping systems: LISP-DHT, LISP+ALT, and the proposed LISP-TREE. We analyzed and contrasted their architectural properties as well. The monitoring projects that supplied Internet routing table growth data over a large timespan inspired us to create LISPmon, a monitoring platform aimed at collecting, storing, and presenting data gathered from the LISP pilot network, early in the deployment of the LISP protocol. The project web site and collected data are publicly available and will assist researchers in studying the evolution of the LISP mapping system. We also document how the newly introduced LISP network elements fit into the current Internet, the advantages and disadvantages of different deployment options, and how the proposed transition mechanism scenarios could affect the evolution of the global routing system. This work is currently available as an active Internet Engineering Task Force (IETF) Internet Draft. The second part looks at the problem of efficient one-to-many communications, assuming a routing system that implements the above-mentioned locator/identifier split paradigm. We propose a network layer protocol for efficient live streaming. It is incrementally deployable, with changes required only in the same border routers that would be upgraded to support locator/identifier separation. Our proof-of-concept Linux kernel implementation shows the feasibility of the protocol, and our comparison to popular peer-to-peer live streaming systems indicates important savings in inter-domain traffic. We believe LISP has considerable potential for adoption, and an important aspect of this work is how it might contribute towards a better mapping system design, by showing the weaknesses of current favorites and proposing alternatives. The presented results are an important step forward in addressing the routing scalability problem described in RFC 4984, and in improving the delivery of live streaming video over the Internet.
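
    LISP itself defines the actual mapping protocols and message formats; purely as an illustration of the indirection the abstract describes (the table contents and function names below are hypothetical, not LISP's wire format), a mapping system can be thought of as a longest-prefix lookup from an endpoint identifier (EID) to a set of routing locators (RLOCs):

```python
# Sketch: the locator/identifier indirection in miniature.
# An endpoint identifier (EID) names a host; routing locators (RLOCs)
# say where it is attached. All entries here are illustrative.
import ipaddress

mapping_system = {
    # EID prefix       -> [(RLOC, priority, weight), ...]
    "192.0.2.0/24":    [("203.0.113.1", 1, 50), ("203.0.113.9", 1, 50)],
    "198.51.100.0/24": [("203.0.113.77", 1, 100)],
}

def lookup(eid):
    """Longest-prefix match of an EID against the mapping table."""
    addr = ipaddress.ip_address(eid)
    best = None
    for prefix, rlocs in mapping_system.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, rlocs)
    return best[1] if best else None

# An ingress tunnel router would query the mapping system, then
# encapsulate packets toward one of the returned locators.
print(lookup("192.0.2.42"))
```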

    When to Utilize Software as a Service

    Cloud computing enables on-demand network access to shared resources (e.g., computation, networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort. Cloud computing refers to both the applications delivered as services over the Internet and the hardware and system software in the data centers. Software as a service (SaaS) is part of cloud computing; it is one of the cloud service models. SaaS is software deployed as a hosted service and accessed over the Internet. In SaaS, the consumer uses the provider's applications running in the cloud. SaaS separates the possession and ownership of software from its use. The applications can be accessed from any device through a thin client interface. A typical SaaS application is used with a web browser and billed monthly. In this thesis, the characteristics of cloud computing and SaaS are presented, and a few implementation platforms for SaaS are discussed. Then, four different SaaS implementation cases and one transformation case are examined. The pros and cons of SaaS are studied on the basis of literature references and an analysis of the SaaS implementations and the transformation case. The analysis is done both from the customer's and the service provider's points of view. In addition, the pros and cons of on-premises software are listed. The purpose of this thesis is to determine when SaaS should be utilized and when it is better to choose traditional on-premises software. The qualities of SaaS bring many benefits to both the customer and the provider. A customer should utilize SaaS when it provides cost savings, ease of use, and scalability over on-premises software. SaaS is reasonable when the customer does not need tailoring, but only needs a simple, general-purpose service, and the application supports the customer's core business. A provider should utilize SaaS when it offers cost savings, scalability, faster development, and a wider customer base over on-premises software. It is wise to choose SaaS when the application is cheap, aimed at the mass market, needs frequent updating, requires high-performance computing or the storage of large amounts of data, or when there is some other direct value from the cloud infrastructure.

    A livestock information system roadmap for Ethiopia

    Agriculture is one of the pillars of the Ethiopian economy, and the overall economic growth of the country is highly dependent on the success of the agricultural sector. Livestock is an integral part of the agricultural sector, and the contribution of live animals and their products to the agricultural economy is immense. Livestock production plays a substantial role in Ethiopia through the provision of food, income, employment, and many other contributions. "A Livestock Information System Roadmap for Ethiopia" is a guidance document that sets out a path for the development of a livestock information system for Ethiopia. It encompasses discrete steps covering system development, resource and capability requirements, and governance implementation that can be followed to produce a bespoke information system. This system is a key component of the Ethiopian digital strategy and will accelerate progress on the delivery of Ethiopia's strategic plan for agriculture (Ten-Year Strategic Development Plan, 2021). The Livestock Information System Roadmap is the outcome of a collaboration among the Livestock Improvement Corporation (LIC, New Zealand), the Ministry of Agriculture (MoA), the Alliance of Bioversity International and the International Center for Tropical Agriculture (CIAT), and the Bill & Melinda Gates Foundation (BMGF).

    OpenStack

    The purpose of this thesis is the presentation of OpenStack, an open-source software platform for managing telecommunications resources in a cloud environment. The thesis first describes the architecture of the cloud environment and the service models used. It then presents the Network Function Virtualization (NFV) architecture applied in telecommunications according to the standards set by the European Telecommunications Standards Institute (ETSI). The main topic of the thesis is the presentation of the OpenStack software used by the NFV architecture; these chapters describe, in as much detail and completeness as possible, the functions of OpenStack and the components it consists of. Finally, a techno-economic analysis compares the cost of deploying the NFV architecture with that of the architecture currently applied in telecommunication networks. The results of this work show the many possibilities for implementing and applying the new architecture, as well as its very low operating cost compared with the existing technology.

    Applications Know Best: Performance-Driven Memory Overcommit with Ginkgo

    Memory overcommitment enables cloud providers to host more virtual machines on a single physical server, exploiting spare CPU and I/O capacity when physical memory becomes the bottleneck for virtual machine deployment. However, overcommitting memory can also cause noticeable application performance degradation. We present Ginkgo, a policy framework for overcommitting memory in an informed and automated fashion. By directly correlating application-level performance to memory, Ginkgo automates the redistribution of scarce memory across all virtual machines, satisfying performance and capacity constraints. Ginkgo also achieves memory gains for traditionally fixed-size Java applications by coordinating the redistribution of available memory with the activities of the Java Virtual Machine heap. When compared to a non-overcommitted system, Ginkgo runs the DayTrader 2.0 and SPECweb2009 benchmarks with the same number of virtual machines while saving up to 73% (50% when omitting free space) of a physical server's memory and keeping application performance degradation within 7%.
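
    Ginkgo's actual performance models and solver are not given in the abstract; as a hedged illustration of performance-driven redistribution, the sketch below greedily hands a fixed memory pool, increment by increment, to whichever VM's (hypothetical, toy) performance curve gains the most. All names and the curve shape are assumptions for illustration only:

```python
# Sketch: performance-driven memory redistribution across VMs.
# Each VM has a profiled performance-vs-memory curve (here a toy
# diminishing-returns model); memory goes, step by step, to the VM
# with the largest marginal gain. Ginkgo's real models/solver differ.
import math

def perf(vm, mem_mb):
    """Hypothetical profiled curve: diminishing returns past a knee."""
    return vm["peak"] * (1 - math.exp(-mem_mb / vm["knee_mb"]))

def redistribute(vms, total_mb, step_mb=128, floor_mb=256):
    alloc = {vm["name"]: floor_mb for vm in vms}    # capacity floor per VM
    remaining = total_mb - floor_mb * len(vms)
    while remaining >= step_mb:
        # Give the next step to the VM with the largest marginal gain.
        best = max(vms, key=lambda vm: perf(vm, alloc[vm["name"]] + step_mb)
                                       - perf(vm, alloc[vm["name"]]))
        alloc[best["name"]] += step_mb
        remaining -= step_mb
    return alloc

vms = [
    {"name": "daytrader", "peak": 1000.0, "knee_mb": 2048},
    {"name": "specweb",   "peak":  800.0, "knee_mb": 1024},
]
print(redistribute(vms, total_mb=4096))
```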