
    Oracle Database 10g: a platform for BLAST search and Regular Expression pattern matching in life sciences

    As database management systems expand their array of analytical functionality, they become powerful research engines for biomedical data analysis and drug discovery. Databases can hold most of the data types commonly required in life sciences and can consequently be used as flexible platforms for the implementation of knowledge bases. Performing data analysis in the database simplifies data management by minimizing the movement of data from disk to memory, allowing pre-filtering and post-processing of datasets, and enabling data to remain in a secure, highly available environment. This article describes the Oracle Database 10g implementation of BLAST and regular expression searches and provides case studies of their usage in bioinformatics. http://www.oracle.com/technology/software/index.htm
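    A minimal sketch of what such an in-database search can look like from a client, using Oracle's REGEXP_LIKE operator (standard Oracle SQL since Database 10g) through the python-oracledb driver; the credentials, DSN, and the proteins table with its columns are hypothetical.

```python
# Sketch: an in-database regular-expression search from Python. The
# connection details and the proteins(accession, sequence) table are
# hypothetical; REGEXP_LIKE itself is standard Oracle SQL since 10g.
import oracledb

conn = oracledb.connect(user="bio", password="secret", dsn="localhost/XEPDB1")

# Find sequences containing the PROSITE N-glycosylation motif
# N-{P}-[ST]-{P}, evaluated as a regular expression inside the
# database so only matching rows ever leave the server.
query = """
    SELECT accession, sequence
      FROM proteins
     WHERE REGEXP_LIKE(sequence, 'N[^P][ST][^P]')
"""
with conn.cursor() as cur:
    for accession, sequence in cur.execute(query):
        print(accession)
```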

    Decision support method for the selection of OMSs

    With the increasing demand for highly complex, integrated, and application-domain-specific systems engineering environments (SEEs), more or less specialized components of SEEs are being developed. An important component is the database management system (DBMS). As conventional DBMSs cannot fulfill the requirements of highly complex, persistent data structures, specialized DBMSs, namely object management systems (OMSs), have been developed. An advantage of OMSs is that they further enhance the integration not only of data but also of processes. Several specialized OMSs with significantly different properties, such as data model, architecture, and performance, are currently available. As it is very difficult for an SEE developer to select the most appropriate OMS, we propose a decision support method which enables an SEE developer to identify their requirements and to compare the evaluation results of different OMSs. Additionally, we present a practical experiment in which we applied the decision support method to compare different OMSs. Experiences from the investigation are briefly presented.
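    The abstract does not spell out the method's mechanics, but a weighted scoring matrix is one common way to realize this kind of requirements-driven comparison; the sketch below is purely illustrative, with invented criteria, weights, and scores rather than the paper's actual procedure.

```python
# Illustrative weighted-scoring matrix: the SEE developer weights each
# requirement, each candidate OMS is scored per criterion, and the
# weighted totals yield a ranking. All names and numbers are invented.
weights = {"data_model": 0.40, "architecture": 0.25, "performance": 0.35}

candidates = {
    "OMS-A": {"data_model": 4, "architecture": 3, "performance": 5},
    "OMS-B": {"data_model": 5, "architecture": 4, "performance": 2},
}

def weighted_score(scores):
    return sum(weights[c] * s for c, s in scores.items())

ranking = sorted(candidates, key=lambda n: weighted_score(candidates[n]),
                 reverse=True)
for name in ranking:
    print(f"{name}: {weighted_score(candidates[name]):.2f}")
```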

    Scalable and Highly Available Database Systems in the Cloud

    Cloud computing allows users to tap into a massive pool of shared computing resources such as servers, storage, and network. These resources are provided as a service, allowing users to “plug into the cloud” much like a utility grid. The promise of the cloud is to free users from the tedious and often complex task of managing and provisioning computing resources to run applications. At the same time, the cloud brings several additional benefits, including a pay-as-you-go cost model, easier deployment of applications, elastic scalability, high availability, and a more robust and secure infrastructure. One important class of applications that users are increasingly deploying in the cloud is database management systems. Database management systems differ from other types of applications in that they manage large amounts of state that is frequently updated and that must be kept consistent at all scales and in the presence of failures. This makes it difficult to provide scalability and high availability for database systems in the cloud. In this thesis, we show how cloud technologies and relational database systems can be exploited to provide a highly available and scalable database service in the cloud. The first part of the thesis presents RemusDB, a reliable, cost-effective high availability solution implemented as a service provided by the virtualization platform. RemusDB can make any database system highly available with little or no code modification by exploiting the capabilities of virtualization. In the second part of the thesis, we present two systems that aim to provide elastic scalability for database systems in the cloud using two very different approaches. The three systems presented in this thesis bring us closer to the goal of building a scalable and reliable transactional database service in the cloud.

    Scalable transactions in the cloud: partitioning revisited

    Lecture Notes in Computer Science, 6427. Cloud computing is becoming one of the most widely used paradigms for deploying highly available and scalable systems. These systems usually demand the management of huge amounts of data, which cannot be handled by traditional or replicated database systems as we know them. Recent solutions store data in special key-value structures, an approach that commonly lacks the consistency provided by transactional guarantees, as consistency is traded for high scalability and availability. In order to ensure consistent access to the information, the use of transactions is required. However, it is well known that traditional replication protocols do not scale well in a cloud environment. Here we take a look at current proposals for deploying transactional systems in the cloud and propose a new system that aims to be a step forward in achieving this goal. We then focus on data partitioning and describe the key role it plays in achieving high scalability. This work has been partially supported by the Spanish Government under grant TIN2009-14460-C03-02, by the Spanish MEC under grant BES-2007-17362, and by project ReD Resilient Database Clusters (PDTC/EIA-EIA/109044/2008).
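    As a concrete illustration of why partitioning is so central here, the following sketch shows plain hash partitioning, under which a transaction confined to a single key can commit locally on one partition; the key space and partition count are invented, and this is not the paper's actual protocol.

```python
# Hash partitioning in miniature: every node maps a key to the same
# partition, so a transaction touching only that key commits locally
# without a cluster-wide replication protocol.
import hashlib

NUM_PARTITIONS = 4

def partition_for(key):
    """Stable hash so all nodes route a given key identically."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

for key in ("account:42", "account:87", "order:1001"):
    print(key, "-> partition", partition_for(key))
# Transactions spanning several partitions would instead require a
# costlier distributed commit, which is why partitioning the data
# well matters for scalability.
```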

    Personal digital assistants: Essential tools for preparing dietetics professionals to use new generation information technology

    Rapid integration of information technology into health care systems has included the use of highly portable systems, in particular personal digital assistants (PDAs). With their large built-in memories, fast processors, wireless connectivity, multimedia capacity, and large library of applications, PDAs have been widely adopted by physicians and nurses for patient tracking, disease management, and access to medical references and drug information, enhancing the quality of health care. Many health-related PDA applications are available to both dietetics professionals and clients. Dietetics professionals can effectively use PDAs for client tracking and support, for accessing hospital databases and information, and for providing better self-monitoring tools to clients. Internship programs for dietetics professionals should include training in the use of PDAs and their dietetics applications, so that new practitioners can stay abreast of this rapidly evolving technology. Several considerations to keep in mind when selecting a PDA and its applications are discussed.

    Evaluating data freshness in large scale replicated databases

    There is nowadays an increasing need for database replication, as the construction of high-performance, highly available, and large-scale applications depends on it to keep data synchronized across multiple servers. A particularly popular approach, used for instance by Facebook, is the MySQL open source database management system and its built-in asynchronous replication mechanism. The limitations imposed by MySQL on replication topologies mean that data has to go through a number of hops or each server has to handle a large number of slaves. This is particularly worrisome when updates are accepted by multiple replicas and in large systems. It is, however, difficult to accurately evaluate the impact of replication on data freshness, since one has to compare observations at multiple servers while running a realistic workload and without disturbing the system under test. In this paper we address this problem by introducing a tool that can accurately measure replication delays for any workload, and then apply it to the industry-standard TPC-C benchmark. This allows us to draw interesting conclusions about the scalability properties of MySQL replication.
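    One minimal way to observe such delays, shown below as a hedged sketch rather than the paper's actual tool, is to write a unique token at the master and poll a replica until it appears; the hosts, credentials, and probe table are hypothetical placeholders, and a production measurement must take care not to disturb the system under test.

```python
# Probe sketch: insert a unique token at the master, poll the replica
# until it shows up, and take the elapsed time as the replication
# delay for that write. All connection details are hypothetical.
import time
import mysql.connector

master = mysql.connector.connect(host="master", user="probe",
                                 password="secret", database="test")
replica = mysql.connector.connect(host="replica", user="probe",
                                  password="secret", database="test")

token = int(time.time() * 1_000_000)  # unique-enough probe value
cur = master.cursor()
cur.execute("INSERT INTO probe (token) VALUES (%s)", (token,))
master.commit()
start = time.monotonic()

rcur = replica.cursor(buffered=True)
while True:
    rcur.execute("SELECT 1 FROM probe WHERE token = %s", (token,))
    if rcur.fetchone():
        break
    time.sleep(0.001)

print(f"replication delay ~ {time.monotonic() - start:.3f}s")
```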

    LST-Bench: Benchmarking Log-Structured Tables in the Cloud

    Log-Structured Tables (LSTs), also commonly referred to as table formats, have recently emerged to bring consistency and isolation to object stores. With the separation of compute and storage, object stores have become the go-to for highly scalable and durable storage. However, this comes with its own set of challenges, such as the lack of recovery and concurrency management that traditional database management systems provide. This is where LSTs such as Delta Lake, Apache Iceberg, and Apache Hudi come into play, providing an automatic metadata layer that manages tables defined over object stores, effectively addressing these challenges. A paradigm shift in the design of these systems necessitates updating evaluation methodologies. In this paper, we examine the characteristics of LSTs and propose extensions to existing benchmarks, including workload patterns and metrics, to accurately capture their performance. We introduce our framework, LST-Bench, which enables users to execute benchmarks tailored for the evaluation of LSTs. Our evaluation demonstrates how these benchmarks can be utilized to evaluate the performance, efficiency, and stability of LSTs. The code for LST-Bench is open-sourced and available at https://github.com/microsoft/lst-bench/
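    The repository above defines LST-Bench's real workloads and configuration format; the sketch below only illustrates, in generic Python, the kind of longitudinal measurement such benchmarks rely on, with run_query and apply_updates as hypothetical stand-ins for engine-specific calls.

```python
# Generic longitudinal measurement: re-run the same query after each
# batch of table mutations and watch latency drift as the table
# format accumulates metadata and small files. run_query and
# apply_updates are hypothetical stand-ins, not LST-Bench's API.
import time

def measure(run_query, apply_updates, rounds=10):
    latencies = []
    for _ in range(rounds):
        apply_updates()             # e.g., a MERGE/UPDATE batch
        start = time.monotonic()
        run_query()                 # probe query under measurement
        latencies.append(time.monotonic() - start)
    return latencies

def degradation_rate(latencies):
    """Relative slowdown from the first round to the last."""
    return (latencies[-1] - latencies[0]) / latencies[0]
```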

    A generalized system performance model for object-oriented database applications

    Although relational database systems have met many needs in traditional business applications, such technology is inadequate for non-traditional applications such as computer-aided design, computer-aided software engineering, and knowledge bases. Object-oriented database systems (OODBs) enhance the data modeling power and performance of database management systems for these applications. Response time is an important issue facing OODBs; however, standard measures of on-line transaction processing are irrelevant for them. Benchmarks compare alternative implementations of OODB system software running a constant application workload, but few attempts have been made to characterize the performance implications of OODB application design given a fixed OODB and operating system platform. In this study, design features of the OO7 Benchmark database application (Carey, DeWitt, and Naughton, 1993) were varied to explore the impact on response time of performing database operations. Sensitivity to the degree of aggregation and to the degree of inheritance in the application was measured. Variability in response times was also measured, using a sequence of database operations to simulate a user transaction workload. Degree of aggregation was defined as the number of relationship objects processed during a database operation. Response time was linear in the degree of aggregation; the size of the database segment processed, compared to the size of available memory, affected the coefficients of the regression line. Degree of inheritance was defined as the Number of Children (Chidamber and Kemerer, 1994) in the application class definitions, and as the extent to which run-time polymorphism was implemented. In this study, increased inheritance caused a statistically significant increase in response time only for the OO7 Traversal 1, although this difference was not meaningful. In the simulated transaction workload of nine OO7 operations, response times were highly variable. Response times per operation depended on the number of objects processed and on the effect of preceding operations on memory contents. Operations that used disparate physical segments or had working sets large relative to the size of memory caused large increases in response time. Average response times and variability were reduced by removing these operations from the sequence (equivalent to scheduling these transactions at a time when their impact would be minimized).
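    To make the reported relationship concrete, here is a minimal sketch of the regression analysis described: fitting response time as a linear function of the degree of aggregation. The sample measurements are invented solely to show the shape of the analysis, not the study's data.

```python
# Least-squares fit of response time against degree of aggregation
# (number of relationship objects processed). The data points below
# are invented for illustration.
import numpy as np

degree = np.array([100, 200, 400, 800, 1600])
response_ms = np.array([52.0, 98.0, 205.0, 430.0, 850.0])

slope, intercept = np.polyfit(degree, response_ms, 1)
print(f"response_ms ~ {slope:.3f} * degree + {intercept:.1f}")
# Per the study, the coefficients shift once the segment processed
# outgrows available memory, so each memory regime gets its own line.
```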

    iVar, an interpretation-oriented tool to manage the update and revision of variant annotation and classification

    The rapid evolution of Next Generation Sequencing in clinical settings, and the resulting challenge of variant reinterpretation given constantly updated information, require robust data management systems and organized approaches. In this paper, we present iVar: a freely available and highly customizable tool with a user-friendly web interface. It provides a platform for the unified management of variants identified by different sequencing technologies. iVar accepts variant call format (VCF) files and text annotation files and elaborates them, optimizing data organization and avoiding redundancies. Updated annotations can be periodically re-uploaded and associated with variants as historically tracked attributes, i.e., modifications are recorded whenever an updated value is imported, thus keeping track of all changes. Data can be visualized through variant-centered and sample-centered interfaces. A customizable search function can be exploited to periodically check whether the pathogenicity-related data of a variant have changed over time. Patient recontacting following variant reinterpretation is made easier by iVar through the effective identification of all patients in the database carrying a specific variant. We tested iVar by uploading 4171 VCF files and 1463 annotation files, obtaining a database of 4166 samples and 22,569 unique variants. iVar has proven to be a useful tool with good performance in terms of collecting and managing data from a medium-throughput laboratory.
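    A minimal sketch of the "historically tracked attribute" idea: rather than overwriting an annotation, each re-upload appends a dated value so later reclassifications remain visible. The schema below is invented for illustration and is not iVar's actual data model.

```python
# Append-only annotation history: a new row is written only when an
# imported value differs from the latest stored one, so every
# reclassification stays queryable. Invented schema, not iVar's own.
import sqlite3
from datetime import date

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE annotation_history (
    variant TEXT, attribute TEXT, value TEXT, imported TEXT)""")

def record(variant, attribute, value):
    last = db.execute(
        """SELECT value FROM annotation_history
           WHERE variant = ? AND attribute = ?
           ORDER BY rowid DESC LIMIT 1""",
        (variant, attribute)).fetchone()
    if last is None or last[0] != value:
        db.execute("INSERT INTO annotation_history VALUES (?, ?, ?, ?)",
                   (variant, attribute, value, date.today().isoformat()))

record("chr17:g.41276045C>T", "clinical_significance", "VUS")
record("chr17:g.41276045C>T", "clinical_significance", "Pathogenic")
# The history now holds both values; recontacting every patient who
# carries this variant is then a lookup against the sample table.
```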