120 research outputs found
Adding HL7 version 3 data types to PostgreSQL
The HL7 standard is widely used to exchange medical information
electronically. As a part of the standard, HL7 defines scalar communication
data types like physical quantity, point in time and concept descriptor but
also complex types such as interval types, collection types and probabilistic
types. Typical HL7 applications will store their communications in a database,
resulting in a translation from HL7 concepts and types into database types.
Since the data types were not designed to be implemented in a relational
database server, this transition is cumbersome and fraught with programmer
error. The purpose of this paper is two fold. First we analyze the HL7 version
3 data type definitions and define a number of conditions that must be met, for
the data type to be suitable for implementation in a relational database. As a
result of this analysis we describe a number of possible improvements in the
HL7 specification. Second we describe an implementation in the PostgreSQL
database server and show that the database server can effectively execute
scientific calculations with units of measure, supports a large number of
operations on time points and intervals, and can perform operations that are
akin to a medical terminology server. Experiments on synthetic data show that
the user defined types perform better than an implementation that uses only
standard data types from the database server.Comment: 12 pages, 9 figures, 6 table
Schema Vacuuming in Temporal Databases
Temporal databases facilitate the support of historical information by providing functions for indicating the intervals during which a tuple was applicable (along one or more temporal dimensions). Because data are never deleted, only superceded, temporal databases are inherently append-only resulting, over time, in a large historical sequence of database states. Data vacuuming in temporal databases allows for this sequence to be shortened by strategically, and irrevocably, deleting obsolete data. Schema versioning allows users to maintain a history of database schemata without compromising the semantics of the data or the ability to view data through historical schemata. While the techniques required for data vacuuming in temporal databases have been relatively well covered, the associated area of vacuuming schemata has received less attention. This paper discusses this issue and proposes a mechanism that fits well with existing methods for data vacuuming and schema versioning
LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase - a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery.Comment: VLDB201
Snapshot isolation for transactional stream processing
Transactional database systems and data stream management systems have been thoroughly investigated over the past decades. While both systems follow completely different data processing models, the combined concept of transactional stream processing promises to be the future data processing model. So far, however, it has not been investigated how well-known concepts found in DBMS or DSMS regarding multi-user support can be transferred to this model or how they need to be redesigned. In this paper, we propose a transaction model combining streaming and stored data as well as continuous and ad-hoc queries. Based on this, we present appropriate protocols for concurrency control of such queries guaranteeing snapshot isolation as well as for consistency of transactions comprising several shared states. In our evaluation, we show that our protocols represent a resilient
and scalable solution meeting all requirements for such a model
- âŠ