632,341 research outputs found

    LogBase: A Scalable Log-structured Database System in the Cloud

    Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads in write-heavy environments and hence adversely affects the write throughput and recovery time of the system. In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage to remove the write bottleneck and support fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of the elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes to support efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery. Comment: VLDB201
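The combination of log-only storage and in-memory multiversion indexes described in this abstract can be illustrated with a toy single-process sketch. It is not LogBase's implementation: the class `LogOnlyStore`, its method names, and the logical-clock timestamps are all hypothetical, and the "log" here is a Python list rather than a distributed file. The point it shows is that each write is a single append (no separate data file), reads go through an index from key to versioned log offsets, and recovery amounts to rescanning the log.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class LogOnlyStore:
    """Toy log-only store (hypothetical names; not LogBase's API)."""

    def __init__(self) -> None:
        # The append-only log is the only copy of the data.
        self._log: List[Tuple[str, int, bytes]] = []
        # In-memory multiversion index: key -> [(timestamp, log offset), ...],
        # kept sorted because timestamps only ever increase.
        self._index: Dict[str, List[Tuple[int, int]]] = {}
        self._clock = 0  # logical clock used as the version timestamp

    def put(self, key: str, value: bytes) -> int:
        """Append one record to the log and index it; returns its timestamp."""
        self._clock += 1
        offset = len(self._log)
        self._log.append((key, self._clock, value))
        self._index.setdefault(key, []).append((self._clock, offset))
        return self._clock

    def get(self, key: str, as_of: Optional[int] = None) -> Optional[bytes]:
        """Return the latest version, or the latest version with ts <= as_of."""
        versions = self._index.get(key)
        if not versions:
            return None
        if as_of is None:
            _, offset = versions[-1]
        else:
            pos = bisect.bisect_right(versions, (as_of, float("inf")))
            if pos == 0:
                return None  # the key did not exist yet at that time
            _, offset = versions[pos - 1]
        return self._log[offset][2]

    def recover(self) -> None:
        """Rebuild the in-memory index by scanning the log from the start."""
        self._index = {}
        self._clock = 0
        for offset, (key, ts, _value) in enumerate(self._log):
            self._index.setdefault(key, []).append((ts, offset))
            self._clock = max(self._clock, ts)
```

A short usage example: after `t1 = store.put("a", b"1")` and `store.put("a", b"2")`, `store.get("a")` reads the newest version while `store.get("a", as_of=t1)` reads the older one, and both still work after `store.recover()`.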

    Determinación de factores influyentes sobre una respuesta en un dominio poco estructurado (Determining the factors that influence a response in an ill-structured domain)

    This report focuses on results obtained from a classification technique applied to time series data in an ill-structured medical domain. Statistical analysis and classification of such data in ill-structured domains are often inadequate because of the intrinsic characteristics of those domains. The database in this analysis contains information on patients with major depressive disorders or schizophrenia; as a consequence, many of its variables hold measurements taken at different instants of time, forming curves. For this reason, we are motivated to establish a useful technique for classifying curves in an ill-structured medical domain. Postprint (published version)

    The MultiDark Database: Release of the Bolshoi and MultiDark Cosmological Simulations

    We present the online MultiDark Database -- a Virtual Observatory-oriented, relational database for hosting various cosmological simulations. The data is accessible via an SQL (Structured Query Language) query interface, which also allows users to directly pose scientific questions, as shown in a number of examples in this paper. Further examples for the usage of the database are given in its extensive online documentation (www.multidark.org). The database is based on the same technology as the Millennium Database, a fact that will greatly facilitate the usage of both suites of cosmological simulations. The first release of the MultiDark Database hosts two 8.6 billion particle cosmological N-body simulations: the Bolshoi (250/h Mpc simulation box, 1/h kpc resolution) and MultiDark Run1 simulation (MDR1, or BigBolshoi, 1000/h Mpc simulation box, 7/h kpc resolution). The extraction methods for halos/subhalos from the raw simulation data, and how this data is structured in the database, are explained in this paper. With the first data release, users get full access to halo/subhalo catalogs, various profiles of the halos at redshifts z=0-15, and raw dark matter data for one time-step of the Bolshoi and four time-steps of the MultiDark simulation. Later releases will also include galaxy mock catalogs and additional merging trees for both simulations as well as new large volume simulations with high resolution. This project is further proof of the viability of storing and presenting complex data using relational database technology. We encourage other simulators to publish their results in a similar manner. Comment: 28 pages, 9 figures, submitted to New Astronom

    Path constraints in semistructured databases

    We investigate a class of path constraints that is of interest in connection with both semistructured and structured data. In standard database systems, constraints are typically expressed as part of the schema, but in semistructured data there is no explicit schema and path constraints provide a natural alternative. As with structured data, path constraints on semistructured data express integrity constraints associated with the semantics of data and are important in query optimization. We show that in semistructured databases, despite the simple syntax of the constraints, their associated implication problem is r.e. complete and their finite implication problem is co-r.e. complete. However, we establish the decidability of the implication and finite implication problems for several fragments of the path constraint language and demonstrate that these fragments suffice to express important semantic information such as extent constraints, inverse relationships, and local database constraints commonly found in object-oriented databases.
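To make the notion of a path constraint concrete, here is a minimal sketch that checks constraints on a small finite graph. The abstract does not give the constraint syntax, so the sketch assumes one common formulation: a semistructured database is a rooted, edge-labeled graph, a forward constraint says that for every node x reached from the root by path alpha, every node reached from x by beta is also reached from x by gamma, and a backward constraint says that x is reached from each such node by gamma. The function names and the example graph are hypothetical. Note that this only checks whether a given graph satisfies a constraint; it does not address the implication problems the abstract is about.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (node, edge label) -> set of successor nodes
Graph = Dict[Tuple[str, str], Set[str]]


def follow(graph: Graph, starts: Iterable[str], path: List[str]) -> Set[str]:
    """Nodes reachable from any start node by following the labels in order."""
    current = set(starts)
    for label in path:
        reached: Set[str] = set()
        for node in current:
            reached |= graph.get((node, label), set())
        current = reached
    return current


def holds_forward(graph: Graph, root: str, alpha: List[str],
                  beta: List[str], gamma: List[str]) -> bool:
    """For every x in alpha(root): beta(x) is a subset of gamma(x)."""
    return all(
        follow(graph, {x}, beta) <= follow(graph, {x}, gamma)
        for x in follow(graph, {root}, alpha)
    )


def holds_backward(graph: Graph, root: str, alpha: List[str],
                   beta: List[str], gamma: List[str]) -> bool:
    """For every x in alpha(root) and y in beta(x): x is in gamma(y)."""
    return all(
        x in follow(graph, {y}, gamma)
        for x in follow(graph, {root}, alpha)
        for y in follow(graph, {x}, beta)
    )


# Hypothetical bibliography graph used for illustration only.
library: Graph = {
    ("root", "book"): {"b1"},
    ("root", "person"): {"p1"},
    ("b1", "author"): {"p1"},
    ("p1", "wrote"): {"b1"},
}

# Extent constraint: every author of a book is also listed as a person.
extent_ok = holds_forward(library, "root", [], ["book", "author"], ["person"])
# Inverse relationship: if p is an author of book b, then p wrote b.
inverse_ok = holds_backward(library, "root", ["book"], ["author"], ["wrote"])
```

On this example graph both checks succeed; deleting the `wrote` edge would make the inverse-relationship check fail while leaving the extent constraint satisfied.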