15 research outputs found

    A comparative analysis of leading relational database management systems

    Full text link
    http://deepblue.lib.umich.edu/bitstream/2027.42/96903/1/MBA_JayaramanS_1996Final.pd

    High Availability and Scalability of Mainframe Environments using System z and z/OS as example

    Mainframe computers are the backbone of industrial and commercial computing, hosting the most relevant and critical data of businesses. One of the most important mainframe environments is IBM System z with the operating system z/OS. This book introduces the mainframe technology of System z and z/OS with respect to high availability and scalability. It highlights their presence at different levels of the hardware and software stack to satisfy the needs of large IT organizations.

    Database Principles and Technologies – Based on Huawei GaussDB

    This open access book contains eight chapters that deal with database technologies, including the development history of databases, database fundamentals, an introduction to SQL syntax, the classification of SQL syntax, database security fundamentals, the database development environment, database design fundamentals, and the application of Huawei’s cloud database product GaussDB. The book can be used as a textbook for database courses in colleges and universities, and is also suitable as a reference for the HCIA-GaussDB V1.5 certification examination. The Huawei GaussDB (for MySQL) used in the book is a Huawei cloud-based, high-performance, highly applicable relational database that fully supports the syntax and functionality of the open source database MySQL. All the experiments in this book can be run on this database platform. Huawei, the world’s leading provider of ICT (information and communication technology) infrastructure and smart terminals, offers products ranging from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence.

    Puredata Systems for Analytics: Concurrency and Workload Management

    PureData™ System for Analytics, also known as Netezza, is a data warehouse appliance for analytic operations, capable of providing throughput 1000 times greater than traditional database servers. It requires minimal system tuning, delivering high-end performance with a low total cost of ownership (TCO). Database performance is directly linked to the allocation of system resources on a database management system. The Field-Programmable Gate Array (FPGA), the heart of the Netezza appliance, plays a key role in boosting the overall performance of the server. I/O operations are a common bottleneck in any database server; in Netezza, the FPGA eliminates this bottleneck by filtering data at each snippet processing unit (SPU), so queries are processed faster and overall server performance improves greatly. This paper describes the problems companies face in a “big data” environment, including concurrency handling and query performance. Various factors affect a query's performance, including bad data distribution, stale statistics, server load and uneven system resources. Since this paper is restricted to system resources, an in-depth analysis of system resources and their components is presented. A database server’s performance is directly related to its underlying allocation of system resources. Workload Management (WLM) and each of its features are described, giving the reader a clear notion of how a query's performance can be altered through various mechanisms. The paper describes the performance problems that exist on traditional database servers and how the Workload Management components, together with the predefined system configurations, can be tuned so that a query runs faster on a Netezza machine.
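
    The abstract ties query performance to the resource shares that workload management assigns to concurrent query groups. As a rough, hypothetical illustration of that relationship (not Netezza's actual WLM configuration or syntax; the group names, shares and capacity figure below are invented), a small Python sketch:

        # Hypothetical model: how workload-management resource shares might
        # translate into per-query throughput as concurrency grows. Group names,
        # percentages and the appliance capacity are invented for illustration.

        resource_shares = {"adhoc": 0.20, "reporting": 0.30, "etl": 0.50}

        def per_query_throughput(base_qps: float, share: float, concurrency: int) -> float:
            """Queries per second each query in a group can sustain when the
            group's share of the appliance is split among concurrent queries."""
            return base_qps * share / max(concurrency, 1)

        appliance_base_qps = 100.0  # assumed aggregate capacity of the appliance
        for group, share in resource_shares.items():
            for concurrency in (1, 4, 16):
                qps = per_query_throughput(appliance_base_qps, share, concurrency)
                print(f"{group:9s} share={share:.0%} concurrency={concurrency:2d} -> {qps:6.2f} qps/query")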

    A shared-disk parallel cluster file system

    Dissertation presented to obtain the degree of Doctor in Informatics at the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia.
    Today, clusters are the de facto cost-effective platform both for high performance computing (HPC) and for IT environments. HPC and IT are quite different environments, and their differences include, among others, their choices of file systems and storage: HPC favours parallel file systems geared towards maximum I/O bandwidth, which are not fully POSIX-compliant and were devised to run on top of (fault-prone) partitioned storage; conversely, IT data centres favour both external disk arrays (to provide highly available storage) and POSIX-compliant file systems (either general purpose or shared-disk cluster file systems, CFSs). These specialised file systems perform very well in their target environments provided that applications do not require certain lateral features, e.g., file locking on parallel file systems, or high performance writes over cluster-wide shared files on CFSs. In brief, none of the above approaches solves the problem of providing high levels of reliability and performance to both worlds. Our pCFS proposal contributes to changing this situation: the rationale is to take advantage of the best of both – the reliability of cluster file systems and the high performance of parallel file systems. We don’t claim to provide the absolute best of each, but we aim at full POSIX compliance, a rich feature set, and levels of reliability and performance good enough for broad usage – e.g., traditional as well as HPC applications, support of clustered DBMS engines that may run over regular files, and video streaming. pCFS’ main ideas include:
    · Cooperative caching, a technique that has been used in file systems for distributed disks but, as far as we know, was never used either in SAN-based cluster file systems or in parallel file systems. As a result, pCFS may use all infrastructures (LAN and SAN) to move data.
    · Fine-grain locking, whereby processes running across distinct nodes may define non-overlapping byte-range regions in a file (instead of the whole file) and access them in parallel, reading and writing over those regions at the infrastructure’s full speed (provided that no major metadata changes are required).
    A prototype was built on top of GFS (a Red Hat shared-disk CFS): GFS’ kernel code was slightly modified, and two kernel modules and a user-level daemon were added. In the prototype, fine-grain locking is fully implemented and a cluster-wide coherent cache is maintained through data (page fragments) movement over the LAN. Our benchmarks for non-overlapping writers over a single file shared among processes running on different nodes show that pCFS’ bandwidth is 2 times greater than NFS’ while being comparable to that of the Parallel Virtual File System (PVFS), both requiring about 10 times more CPU. pCFS’ bandwidth also surpasses GFS’ (600 times for small record sizes, e.g., 4 KB, decreasing down to 2 times for large record sizes, e.g., 4 MB), at about the same CPU usage.
    Lusitania, Companhia de Seguros S.A.; Programa IBM Shared University Research (SUR)
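
    The fine-grain locking idea above (processes on distinct nodes locking non-overlapping byte ranges instead of whole files) is not presented with pCFS' own interface in the abstract, but the general concept can be sketched with standard POSIX byte-range locks. The Python sketch below uses fcntl.lockf on an ordinary local file; the file path, region size and writer layout are assumptions made purely for illustration.

        # Sketch: each writer locks only its own byte range of a shared file,
        # so non-overlapping writers never block each other. This uses plain
        # POSIX byte-range locks, not pCFS's API; path and sizes are invented.
        import fcntl
        import os

        PATH = "/tmp/shared.dat"   # stand-in for a cluster-wide shared file
        REGION = 4096              # bytes owned by each writer

        def write_region(writer_id: int, payload: bytes) -> None:
            offset = writer_id * REGION
            fd = os.open(PATH, os.O_RDWR | os.O_CREAT, 0o644)
            try:
                # Exclusive lock over this writer's region only, not the whole file.
                fcntl.lockf(fd, fcntl.LOCK_EX, REGION, offset, os.SEEK_SET)
                os.pwrite(fd, payload[:REGION], offset)
                fcntl.lockf(fd, fcntl.LOCK_UN, REGION, offset, os.SEEK_SET)
            finally:
                os.close(fd)

        if __name__ == "__main__":
            # Two writers touch disjoint regions; their locks do not conflict.
            write_region(0, b"writer-0 data".ljust(REGION, b"\0"))
            write_region(1, b"writer-1 data".ljust(REGION, b"\0"))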

    An Analysis for Evaluating the Cost/Profit Effectiveness of Parallel Systems

    A new domain of commercial applications demands the development of inexpensive parallel computing platforms to lower the cost of operations and increase business profit. The calculation of returns on an IT investment is now important to justify the decision of upgrading or replacing parallel systems. This thesis presents a framework of the performance and economic factors that are considered when evaluating a parallel system. We introduce a metric called the cost/profit effectiveness metric, which measures the effectiveness of a parallel system in terms of performance, cost and profit. This metric describes the profit obtained from the performance of three different domains for scaling: speed-up, throughput and/or scale-up. Cost is measured by the actual costs of a parallel system. We present two case studies to demonstrate the application of this metric and analyze the results to support the evaluation of the parallel system in each case.
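
    The abstract names the cost/profit effectiveness metric but does not reproduce its formula. One plausible reading, profit attributable to the parallel system divided by its cost, is sketched below in Python; the functional form and every figure are assumptions made for illustration, not the thesis's actual definition.

        # Assumed sketch of a cost/profit effectiveness calculation: profit
        # earned thanks to the parallel system divided by its total cost.
        # The formula and all numbers are illustrative assumptions only.

        def speedup(t_serial: float, t_parallel: float) -> float:
            return t_serial / t_parallel

        def cost_profit_effectiveness(profit_per_job: float,
                                      jobs_per_day_serial: float,
                                      speedup_factor: float,
                                      total_cost_per_day: float) -> float:
            """Profit per unit of cost; values above 1 suggest the system pays off."""
            jobs_per_day = jobs_per_day_serial * speedup_factor  # throughput assumed to scale with speed-up
            return profit_per_job * jobs_per_day / total_cost_per_day

        if __name__ == "__main__":
            s = speedup(t_serial=120.0, t_parallel=20.0)  # 6x speed-up
            e = cost_profit_effectiveness(profit_per_job=5.0,
                                          jobs_per_day_serial=200.0,
                                          speedup_factor=s,
                                          total_cost_per_day=4000.0)
            print(f"speed-up = {s:.1f}x, cost/profit effectiveness = {e:.2f}")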

    Fourth NASA Goddard Conference on Mass Storage Systems and Technologies

    This report contains copies of the technical papers received in time for publication just prior to the Fourth Goddard Conference on Mass Storage Systems and Technologies, held March 28-30, 1995, at the University of Maryland, University College Conference Center, in College Park, Maryland. This series of conferences continues to serve as a unique medium for the exchange of information on topics relating to the ingestion and management of substantial amounts of data and the attendant problems involved. This year's discussion topics include new storage technology, stability of recorded media, performance studies, storage system solutions, the National Information Infrastructure (Infobahn), the future of storage technology, and lessons learned from various projects. There will also be an update on the IEEE Mass Storage System Reference Model Version 5, on which the final vote was taken in July 1994.

    Augmenting data warehousing architectures with Hadoop

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management.
    As the volume of available data increases exponentially, traditional data warehouses struggle to transform this data into actionable knowledge. Data strategies that include the creation and maintenance of data warehouses have a lot to gain by incorporating technologies from the Big Data spectrum. Hadoop, as a transformation tool, can add a theoretically infinite dimension of data processing, feeding transformed information into traditional data warehouses that ultimately retain their value as central components in organizations’ decision support systems. This study explores the potential of Hadoop as a data transformation tool in the setting of a traditional data warehouse environment. Hadoop’s execution model, which is oriented towards distributed parallel processing, offers great capabilities when the amount of data to be processed requires the infrastructure to expand. Horizontal scalability, a key aspect of a Hadoop cluster, allows processing power to grow in proportion to the volume of data. Through the use of Hive on Tez in a Hadoop cluster, this study transforms television viewing events, extracted from Ericsson’s Mediaroom Internet Protocol Television infrastructure, into pertinent audience metrics, such as Rating, Reach and Share. These measurements are then made available in a traditional data warehouse, supported by a traditional Relational Database Management System, where they are presented through a set of reports. The main contribution of this research is a proposed augmented data warehouse architecture in which the traditional ETL layer is replaced by a Hadoop cluster, running Hive on Tez, with the purpose of performing the heaviest transformations that convert raw data into actionable information. Through a typification of the SQL statements responsible for the data transformation processes, we were able to understand that Hadoop, and its distributed processing model, delivers outstanding performance results in the analytical layer, namely in the aggregation of large data sets. Ultimately, we demonstrate, empirically, the performance gains that can be extracted from Hadoop, in comparison to an RDBMS, regarding speed, storage usage and scalability potential, and suggest how this can be used to evolve data warehouses into the age of Big Data.
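
    The Hive SQL used in the study is not reproduced in the abstract, so the audience metrics it mentions can only be illustrated here with textbook-style definitions. The Python sketch below computes Rating, Reach and Share from a handful of invented viewing events; the event schema, universe size and interval length are assumptions, and the real pipeline runs as Hive on Tez rather than in-memory Python.

        # Illustrative computation of Rating, Reach and Share from viewing events.
        # Schema, universe size, interval and sample data are invented; the study
        # itself performs these aggregations with Hive on Tez over raw event data.
        UNIVERSE = 1000          # assumed number of potential viewers
        INTERVAL_SECONDS = 3600  # assumed length of the analysed time slot

        # (viewer_id, channel, seconds watched within the interval)
        events = [
            ("v1", "ch1", 1800), ("v1", "ch2", 900),
            ("v2", "ch1", 3600), ("v3", "ch2", 600),
        ]

        def audience_metrics(events, channel):
            chan = [e for e in events if e[1] == channel]
            viewers = {viewer for viewer, _, _ in chan}
            chan_seconds = sum(sec for _, _, sec in chan)
            total_seconds = sum(sec for _, _, sec in events)
            reach = len(viewers) / UNIVERSE                               # distinct viewers / universe
            rating = chan_seconds / (UNIVERSE * INTERVAL_SECONDS)         # average audience / universe
            share = chan_seconds / total_seconds if total_seconds else 0  # channel's slice of all viewing
            return reach, rating, share

        for ch in ("ch1", "ch2"):
            reach, rating, share = audience_metrics(events, ch)
            print(f"{ch}: reach={reach:.2%} rating={rating:.2%} share={share:.2%}")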

    Space station data system analysis/architecture study. Task 2: Options development DR-5. Volume 1: Technology options

    The second task in the Space Station Data System (SSDS) Analysis/Architecture Study is the development of an information base that will support the conduct of trade studies and provide sufficient data to make key design/programmatic decisions. This volume identifies the preferred options in the technology category and characterizes these options with respect to performance attributes, constraints, cost, and risk. The technology category includes advanced materials, processes, and techniques that can be used to enhance the implementation of SSDS design structures. The specific areas discussed are mass storage, including space and ground on-line storage and off-line storage; man/machine interface; data processing hardware, including flight computers and advanced/fault-tolerant computer architectures; and software, including data compression algorithms, on-board high-level languages, and software tools. Also discussed are artificial intelligence applications and hard-wired communications.